Target Networks and Over-parameterization Stabilize Off-policy Bootstrapping with Function Approximation

Fengdi Che, Chenjun Xiao, Jincheng Mei, Bo Dai 0001, Ramki Gummadi, Oscar A. Ramirez, Christopher K. Harris, A. Rupam Mahmood, Dale Schuurmans. Target Networks and Over-parameterization Stabilize Off-policy Bootstrapping with Function Approximation. In Forty-first International Conference on Machine Learning, ICML 2024, Vienna, Austria, July 21-27, 2024. OpenReview.net, 2024.

Authors

Fengdi Che

Chenjun Xiao

Jincheng Mei

Bo Dai 0001

Ramki Gummadi

Oscar A. Ramirez

Christopher K. Harris

A. Rupam Mahmood

Dale Schuurmans
