The Nature of Temporal Difference Errors in Multi-step Distributional Reinforcement Learning

Yunhao Tang, Rémi Munos, Mark Rowland, Bernardo Ávila Pires, Will Dabney, Marc G. Bellemare. The Nature of Temporal Difference Errors in Multi-step Distributional Reinforcement Learning. In Sanmi Koyejo, S. Mohamed, A. Agarwal, Danielle Belgrave, K. Cho, A. Oh, editors, Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, NeurIPS 2022, New Orleans, LA, USA, November 28 - December 9, 2022.

The following publications are possibly variants of this publication:

A distributional code for value in dopamine-based reinforcement learning. Will Dabney, Zeb Kurth-Nelson, Naoshige Uchida, Clara Kwon Starkweather, Demis Hassabis, Rémi Munos, Matthew Botvinick. Nature, 577(7792):671-675, 2020.

Human-level control through deep reinforcement learning. Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin A. Riedmiller, Andreas Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis. Nature, 518(7540):529-533, 2015.

Autonomous navigation of stratospheric balloons using reinforcement learning. Marc G. Bellemare, Salvatore Candido, Pablo Samuel Castro, Jun Gong, Marlos C. Machado, Subhodeep Moitra, Sameera S. Ponda, Ziyu Wang. Nature, 588(7836):77-82, 2020.
