Improving Policy Gradient Estimates with Influence Information

Jervis Pinto, Alan Fern, Tim Bauer, Martin Erwig. Improving Policy Gradient Estimates with Influence Information. Journal of Machine Learning Research, 20:1-18, 2011.