Gergely Neu, András György, Csaba Szepesvári, András Antos. Online Markov Decision Processes Under Bandit Feedback. IEEE Transactions on Automatic Control, 59(3):676-691, 2014.