Online Markov Decision Processes Under Bandit Feedback

Gergely Neu, András György, Csaba Szepesvári, András Antos. Online Markov Decision Processes Under Bandit Feedback. IEEE Trans. Automat. Contr., 59(3):676-691, 2014.

Authors

Gergely Neu

András György

Csaba Szepesvári

András Antos
