An Online Policy Gradient Algorithm for Markov Decision Processes with Continuous States and Actions

Yao Ma, Tingting Zhao, Kohei Hatano, Masashi Sugiyama. An Online Policy Gradient Algorithm for Markov Decision Processes with Continuous States and Actions. In Toon Calders, Floriana Esposito, Eyke Hüllermeier, Rosa Meo, editors, Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2014, Nancy, France, September 15-19, 2014. Proceedings, Part II. Volume 8725 of Lecture Notes in Computer Science, pages 354-369, Springer, 2014.

Authors

Yao Ma

Tingting Zhao

Kohei Hatano

Masashi Sugiyama
