Per-decision Multi-step Temporal Difference Learning with Control Variates

Kristopher De Asis, Richard S. Sutton. Per-decision Multi-step Temporal Difference Learning with Control Variates. In Amir Globerson, Ricardo Silva, editors, Proceedings of the Thirty-Fourth Conference on Uncertainty in Artificial Intelligence, UAI 2018, Monterey, California, USA, August 6-10, 2018, pages 786-794. AUAI Press, 2018.

Authors

Kristopher De Asis


Richard S. Sutton
