Learning Policies from Self-Play with Policy Gradients and MCTS Value Estimates

Dennis J. N. J. Soemers, Éric Piette, Matthew Stephenson, Cameron Browne. Learning Policies from Self-Play with Policy Gradients and MCTS Value Estimates. In IEEE Conference on Games, CoG 2019, London, United Kingdom, August 20-23, 2019, pages 1-8. IEEE, 2019.

Authors

Dennis J. N. J. Soemers

Éric Piette

Matthew Stephenson

Cameron Browne
