Avoiding the Price of Adaptivity: Inference in Linear Contextual Bandits via Stability
Analysis
This arXiv paper addresses a critical challenge in contextual bandit algorithms: the \
Key Takeaways
Reference
“When stability holds, the ordinary least-squares estimator satisfies a central limit theorem, and classical Wald-type confidence intervals -- designed for i.i.d. data -- become asymptotically valid even under adaptation, \emph{without} incurring the $\sqrt{d \log T}$ price of adaptivity.”
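
To make the quoted claim concrete, here is a minimal sketch of a classical Wald-type interval built from the ordinary least-squares estimate in a linear reward model. It is an illustration only, not the paper's procedure: the function names `invert` and `wald_interval`, the toy data, and the homoscedastic residual-variance estimate are all assumptions made for this example. The code does not check the stability condition; per the quote, that condition is what would justify using such an interval on adaptively collected bandit data.

```python
from statistics import NormalDist


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination (hypothetical helper)."""
    n = len(matrix)
    # Augment the matrix with the identity on the right.
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("Gram matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def wald_interval(contexts, rewards, coord, level=0.95):
    """Return (estimate, lower, upper) for one coordinate of the OLS estimate."""
    n, d = len(contexts), len(contexts[0])
    # Gram matrix S = sum_t x_t x_t^T and moment vector b = sum_t x_t r_t.
    gram = [[sum(x[i] * x[j] for x in contexts) for j in range(d)] for i in range(d)]
    moment = [sum(x[i] * r for x, r in zip(contexts, rewards)) for i in range(d)]
    gram_inv = invert(gram)
    theta = [sum(gram_inv[i][j] * moment[j] for j in range(d)) for i in range(d)]
    # Assumption: homoscedastic noise, variance estimated from residuals (n - d dof).
    residuals = [r - sum(t * xi for t, xi in zip(theta, x))
                 for x, r in zip(contexts, rewards)]
    sigma2 = sum(e * e for e in residuals) / (n - d)
    # Normal quantile for a two-sided interval at the requested level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * (sigma2 * gram_inv[coord][coord]) ** 0.5
    return theta[coord], theta[coord] - half_width, theta[coord] + half_width


# Toy data: four contexts in d = 2, rewards near theta = (2, 3).
est, lo, hi = wald_interval([[1, 0], [0, 1], [1, 1], [1, -1]],
                            [2.1, 2.9, 5.0, -1.0], coord=0)
print(est, lo, hi)
```

The half-width here scales with the usual normal quantile only. By contrast, the $\sqrt{d \log T}$ factor mentioned in the quote is the extra inflation that the paper says can be avoided when stability holds.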