Concentration bounds for temporal difference learning with linear function approximation: the case of batch data and uniform sampling

Prashanth L. A., Nathaniel Korda, Rémi Munos. Concentration bounds for temporal difference learning with linear function approximation: the case of batch data and uniform sampling. Machine Learning, 110(3):559-618, 2021. [doi]

Abstract

Abstract is missing.