Researchr is a web site for finding, collecting, sharing, and reviewing scientific publications, for researchers by researchers.
Richard S. Sutton, Ashique Rupam Mahmood, Martha White. An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning. Journal of Machine Learning Research, 17, 2016. [doi]
Possibly Related Publications
The following publications are possibly variants of this publication:
Loosely consistent emphatic temporal-difference learning. Jiamin He, Fengdi Che, Yi Wan, A. Rupam Mahmood. UAI 2023: 849-859 [doi]