Target Networks and Over-parameterization Stabilize Off-policy Bootstrapping with Function Approximation

Fengdi Che, Chenjun Xiao, Jincheng Mei, Bo Dai 0001, Ramki Gummadi, Oscar A. Ramirez, Christopher K. Harris, A. Rupam Mahmood, Dale Schuurmans. Target Networks and Over-parameterization Stabilize Off-policy Bootstrapping with Function Approximation. In Forty-first International Conference on Machine Learning, ICML 2024, Vienna, Austria, July 21-27, 2024. OpenReview.net, 2024. [doi]

Abstract

Abstract is missing.