An asymptotically optimal policy for finite support models in the multiarmed bandit problem

Junya Honda, Akimichi Takemura. An asymptotically optimal policy for finite support models in the multiarmed bandit problem. Machine Learning, 85(3):361-391, 2011.
