Bandit Problems With Infinitely Many Arms

Loading...
Thumbnail Image
Penn collection
Statistics Papers
Degree type
Discipline
Subject
bandit problems
sequential experimentation
dynamic allocation of Bernoulli processes
staying with a winner
switching with a loser
Applied Statistics
Funder
Grant number
License
Copyright date
Distributor
Related resources
Author
Berry, Donald A
Chen, Robert W
Zame, Alan
Heath, David C
Shepp, Larry A
Contributor
Abstract

We consider a bandit problem consisting of a sequence of n choices from an infinite number of Bernoulli arms, with n → ∞. The objective is to minimize the long-run failure rate. The Bernoulli parameters are independent observations from a distribution F. We first assume F to be the uniform distribution on (0, 1) and consider various extensions. In the uniform case we show that the best lower bound for the expected failure proportion is between √2/√n and 2/√n and we exhibit classes of strategies that achieve the latter.

Advisor
Date Range for Data Collection (Start Date)
Date Range for Data Collection (End Date)
Digital Object Identifier
Series name and number
Publication date
1997
Journal title
The Annals of Statistics
Volume number
Issue number
Publisher
Publisher DOI
Journal Issue
Comments
Recommended citation
Collection