Online Markov Decision Processes Under Bandit Feedback

Gergely Neu, András György, Csaba Szepesvári, András Antos. Online Markov Decision Processes Under Bandit Feedback. IEEE Transactions on Automatic Control, 59(3):676-691, 2014.
