An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning

Richard S. Sutton, Ashique Rupam Mahmood, Martha White. An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning. Journal of Machine Learning Research, 17, 2016.

Authors

Richard S. Sutton

Ashique Rupam Mahmood

Martha White
