Experiments with Infinite-Horizon, Policy-Gradient Estimation

Jonathan Baxter, Peter L. Bartlett, Lex Weaver. Experiments with Infinite-Horizon, Policy-Gradient Estimation. J. Artif. Intell. Res. (JAIR), 15:351-381, 2001.

Authors

Jonathan Baxter

Peter L. Bartlett

Lex Weaver
