Experiments with Infinite-Horizon, Policy-Gradient Estimation

Jonathan Baxter, Peter L. Bartlett, Lex Weaver. Experiments with Infinite-Horizon, Policy-Gradient Estimation. J. Artif. Intell. Res. (JAIR), 15:351-381, 2001.
