Online Markov Decision Processes Under Bandit Feedback

Gergely Neu, András György, Csaba Szepesvári, András Antos. Online Markov Decision Processes Under Bandit Feedback. IEEE Trans. Automat. Contr., 59(3):676-691, 2014.
