Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies

Zihan Zhang, Xiangyang Ji, Simon S. Du. Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies. In Po-Ling Loh, Maxim Raginsky, editors, Conference on Learning Theory, 2-5 July 2022, London, UK. Volume 178 of Proceedings of Machine Learning Research, pages 3858-3904, PMLR, 2022.

Authors

Zihan Zhang

Xiangyang Ji

Simon S. Du
