On effectiveness of the Mirror Descent Algorithm for a stochastic multi-armed bandit governed by a stationary finite Markov chain

Alexander Nazin, Boris Miller. On effectiveness of the Mirror Descent Algorithm for a stochastic multi-armed bandit governed by a stationary finite Markov chain. In 2013 Australian Control Conference, Fremantle, WA, Australia, November 4-5, 2013. Pages 244-250, IEEE, 2013.

Abstract

Abstract is missing.