Off-policy TD(λ) with a true online equivalence

Hado van Hasselt, Ashique Rupam Mahmood, Richard S. Sutton. Off-policy TD(λ) with a true online equivalence. In Nevin L. Zhang, Jin Tian, editors, Proceedings of the Thirtieth Conference on Uncertainty in Artificial Intelligence, UAI 2014, Quebec City, Quebec, Canada, July 23-27, 2014. Pages 330-339, AUAI Press, 2014.
