Learning to Play No-Press Diplomacy with Best Response Policy Iteration

Thomas W. Anthony, Tom Eccles, Andrea Tacchetti, János Kramár, Ian M. Gemp, Thomas C. Hudson, Nicolas Porcel, Marc Lanctot, Julien Pérolat, Richard Everett, Satinder Singh, Thore Graepel, Yoram Bachrach. Learning to Play No-Press Diplomacy with Best Response Policy Iteration. In Hugo Larochelle, Marc'Aurelio Ranzato, Raia Hadsell, Maria-Florina Balcan, Hsuan-Tien Lin, editors, Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual. 2020.
