Finite-Sample Analysis of Off-Policy TD-Learning via Generalized Bellman Operators

Zaiwei Chen, Siva Theja Maguluri, Sanjay Shakkottai, Karthikeyan Shanmugam. Finite-Sample Analysis of Off-Policy TD-Learning via Generalized Bellman Operators. In Marc'Aurelio Ranzato, Alina Beygelzimer, Yann N. Dauphin, Percy Liang, Jennifer Wortman Vaughan, editors, Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6-14, 2021, virtual. Pages 21440-21452, 2021.

Authors

Zaiwei Chen

Siva Theja Maguluri

Sanjay Shakkottai

Karthikeyan Shanmugam
