Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies

Zihan Zhang, Xiangyang Ji, Simon S. Du. Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies. In Po-Ling Loh, Maxim Raginsky, editors, Conference on Learning Theory, 2-5 July 2022, London, UK. Volume 178 of Proceedings of Machine Learning Research, pages 3858-3904, PMLR, 2022.
