On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift

Alekh Agarwal, Sham M. Kakade, Jason D. Lee, Gaurav Mahajan. On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift. Journal of Machine Learning Research, 22, 2021.

