Convergence of Policy Gradient for Entropy Regularized MDPs with Neural Network Approximation in the Mean-Field Regime

James-Michael Leahy, Bekzhan Kerimkulov, David Siska, Lukasz Szpruch. Convergence of Policy Gradient for Entropy Regularized MDPs with Neural Network Approximation in the Mean-Field Regime. In Kamalika Chaudhuri, Stefanie Jegelka, Le Song, Csaba Szepesvári, Gang Niu, Sivan Sabato, editors, International Conference on Machine Learning, ICML 2022, 17-23 July 2022, Baltimore, Maryland, USA. Volume 162 of Proceedings of Machine Learning Research, pages 12222-12252, PMLR, 2022.

Authors

James-Michael Leahy

Bekzhan Kerimkulov

David Siska

Lukasz Szpruch
