An Online Policy Gradient Algorithm for Markov Decision Processes with Continuous States and Actions

Yao Ma, Tingting Zhao, Kohei Hatano, Masashi Sugiyama. An Online Policy Gradient Algorithm for Markov Decision Processes with Continuous States and Actions. In Toon Calders, Floriana Esposito, Eyke Hüllermeier, Rosa Meo, editors, Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2014, Nancy, France, September 15-19, 2014. Proceedings, Part II. Volume 8725 of Lecture Notes in Computer Science, pages 354-369, Springer, 2014.
