Off-policy evaluation for tabular reinforcement learning with synthetic trajectories

Weiwei Wang, Yuqiang Li, Xianyi Wu. Off-policy evaluation for tabular reinforcement learning with synthetic trajectories. Statistics and Computing, 34(1):41, February 2024.
