Optimistic initialization and greediness lead to polynomial time learning in factored MDPs

Istvan Szita, András Lörincz. Optimistic initialization and greediness lead to polynomial time learning in factored MDPs. In Andrea Pohoreckyj Danyluk, Léon Bottou, Michael L. Littman, editors, Proceedings of the 26th Annual International Conference on Machine Learning, ICML 2009, Montreal, Quebec, Canada, June 14-18, 2009. Volume 382 of ACM International Conference Proceeding Series, pages 126, ACM, 2009.
