Beyond Cumulative Returns via Reinforcement Learning over State-Action Occupancy Measures

Junyu Zhang, Amrit Singh Bedi, Mengdi Wang, Alec Koppel. Beyond Cumulative Returns via Reinforcement Learning over State-Action Occupancy Measures. In 2021 American Control Conference, ACC 2021, New Orleans, LA, USA, May 25-28, 2021. pages 894-901, IEEE, 2021.
