Learning to Play No-Press Diplomacy with Best Response Policy Iteration

Thomas W. Anthony, Tom Eccles, Andrea Tacchetti, János Kramár, Ian M. Gemp, Thomas C. Hudson, Nicolas Porcel, Marc Lanctot, Julien Pérolat, Richard Everett, Satinder Singh, Thore Graepel, Yoram Bachrach. Learning to Play No-Press Diplomacy with Best Response Policy Iteration. In Hugo Larochelle, Marc'Aurelio Ranzato, Raia Hadsell, Maria-Florina Balcan, Hsuan-Tien Lin, editors, Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual. 2020.

@inproceedings{AnthonyETKGHPLP20,
  title = {Learning to Play No-Press Diplomacy with Best Response Policy Iteration},
  author = {Thomas W. Anthony and Tom Eccles and Andrea Tacchetti and János Kramár and Ian M. Gemp and Thomas C. Hudson and Nicolas Porcel and Marc Lanctot and Julien Pérolat and Richard Everett and Satinder Singh and Thore Graepel and Yoram Bachrach},
  year = {2020},
  url = {https://proceedings.neurips.cc/paper/2020/hash/d1419302db9c022ab1d48681b13d5f8b-Abstract.html},
  researchr = {https://researchr.org/publication/AnthonyETKGHPLP20},
  booktitle = {Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual},
  editor = {Hugo Larochelle and Marc'Aurelio Ranzato and Raia Hadsell and Maria-Florina Balcan and Hsuan-Tien Lin},
}