Stable Policy Optimization via Off-Policy Divergence Regularization

Ahmed Touati, Amy Zhang, Joelle Pineau, Pascal Vincent. Stable Policy Optimization via Off-Policy Divergence Regularization. In Ryan P. Adams, Vibhav Gogate, editors, Proceedings of the Thirty-Sixth Conference on Uncertainty in Artificial Intelligence, UAI 2020, virtual online, August 3-6, 2020. pages 543, AUAI Press, 2020.
