Nearly Optimal Policy Optimization with Stable at Any Time Guarantee

Tianhao Wu, Yunchang Yang, Han Zhong, Liwei Wang, Simon S. Du, Jiantao Jiao. Nearly Optimal Policy Optimization with Stable at Any Time Guarantee. In Kamalika Chaudhuri, Stefanie Jegelka, Le Song, Csaba Szepesvári, Gang Niu, Sivan Sabato, editors, International Conference on Machine Learning, ICML 2022, 17-23 July 2022, Baltimore, Maryland, USA. Volume 162 of Proceedings of Machine Learning Research, pages 24243-24265, PMLR, 2022.

Authors

Tianhao Wu

Yunchang Yang

Han Zhong

Liwei Wang

Simon S. Du

Jiantao Jiao
