Cautiously Optimistic Policy Optimization and Exploration with Linear Function Approximation

Andrea Zanette, Ching-An Cheng, Alekh Agarwal. Cautiously Optimistic Policy Optimization and Exploration with Linear Function Approximation. In Mikhail Belkin, Samory Kpotufe, editors, Conference on Learning Theory, COLT 2021, 15-19 August 2021, Boulder, Colorado, USA. Volume 134 of Proceedings of Machine Learning Research, pages 4473-4525, PMLR, 2021.
