Supported Policy Optimization for Offline Reinforcement Learning

Jialong Wu, Haixu Wu, Zihan Qiu, Jianmin Wang, Mingsheng Long. Supported Policy Optimization for Offline Reinforcement Learning. In Sanmi Koyejo, S. Mohamed, A. Agarwal, Danielle Belgrave, K. Cho, A. Oh, editors, Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, NeurIPS 2022, New Orleans, LA, USA, November 28 - December 9, 2022. 2022.
