Learning Intrinsic Rewards as a Bi-Level Optimization Problem

Bradly Stadie, Lunjun Zhang, Jimmy Ba. Learning Intrinsic Rewards as a Bi-Level Optimization Problem. In Ryan P. Adams, Vibhav Gogate, editors, Proceedings of the Thirty-Sixth Conference on Uncertainty in Artificial Intelligence, UAI 2020, virtual online, August 3-6, 2020. pages 66, AUAI Press, 2020.
