Mirror descent algorithm for a multi-armed bandit governed by a stationary finite state Markov chain

Alexander V. Nazin, Boris M. Miller. Mirror descent algorithm for a multi-armed bandit governed by a stationary finite state Markov chain. In European Control Conference, ECC 2013, Zurich, Switzerland, July 17-19, 2013. Pages 371-375, IEEE, 2013.
