Online learning in episodic Markovian decision processes by relative entropy policy search

Alexander Zimin, Gergely Neu. Online learning in episodic Markovian decision processes by relative entropy policy search. In Christopher J. C. Burges, Léon Bottou, Zoubin Ghahramani, Kilian Q. Weinberger, editors, Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013, Lake Tahoe, Nevada, United States, pages 1583-1591, 2013.
