OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation

Jongmin Lee, Wonseok Jeon, Byung-Jun Lee, Joelle Pineau, Kee-Eung Kim. OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation. In Marina Meila, Tong Zhang, editors, Proceedings of the 38th International Conference on Machine Learning, ICML 2021, 18-24 July 2021, Virtual Event. Volume 139 of Proceedings of Machine Learning Research, pages 6120-6130, PMLR, 2021.
