Learning Policies from Self-Play with Policy Gradients and MCTS Value Estimates

Dennis J. N. J. Soemers, Éric Piette, Matthew Stephenson, Cameron Browne. Learning Policies from Self-Play with Policy Gradients and MCTS Value Estimates. In IEEE Conference on Games, CoG 2019, London, United Kingdom, August 20-23, 2019. pages 1-8, IEEE, 2019.

@inproceedings{SoemersPSB19,
  title = {Learning Policies from Self-Play with Policy Gradients and MCTS Value Estimates},
  author = {Dennis J. N. J. Soemers and Éric Piette and Matthew Stephenson and Cameron Browne},
  year = {2019},
  doi = {10.1109/CIG.2019.8848037},
  url = {https://doi.org/10.1109/CIG.2019.8848037},
  researchr = {https://researchr.org/publication/SoemersPSB19},
  pages = {1--8},
  booktitle = {IEEE Conference on Games, CoG 2019, London, United Kingdom, August 20-23, 2019},
  publisher = {IEEE},
  isbn = {978-1-7281-1884-0},
}