Optimality and Approximation with Policy Gradient Methods in Markov Decision Processes

Alekh Agarwal, Sham M. Kakade, Jason D. Lee, Gaurav Mahajan. Optimality and Approximation with Policy Gradient Methods in Markov Decision Processes. In Jacob D. Abernethy, Shivani Agarwal, editors, Conference on Learning Theory, COLT 2020, 9-12 July 2020, Virtual Event [Graz, Austria]. Volume 125 of Proceedings of Machine Learning Research, pages 64-66, PMLR, 2020.

Authors

Alekh Agarwal

Sham M. Kakade

Jason D. Lee

Gaurav Mahajan
