Softmax Policy Gradient Methods Can Take Exponential Time to Converge

Gen Li, Yuting Wei, Yuejie Chi, Yuantao Gu, Yuxin Chen. Softmax Policy Gradient Methods Can Take Exponential Time to Converge. In Mikhail Belkin, Samory Kpotufe, editors, Conference on Learning Theory, COLT 2021, 15-19 August 2021, Boulder, Colorado, USA. Volume 134 of Proceedings of Machine Learning Research, pages 3107-3110, PMLR, 2021.
