Learning to Play No-Press Diplomacy with Best Response Policy Iteration

Thomas W. Anthony, Tom Eccles, Andrea Tacchetti, János Kramár, Ian M. Gemp, Thomas C. Hudson, Nicolas Porcel, Marc Lanctot, Julien Pérolat, Richard Everett, Satinder Singh, Thore Graepel, Yoram Bachrach. Learning to Play No-Press Diplomacy with Best Response Policy Iteration. In Hugo Larochelle, Marc'Aurelio Ranzato, Raia Hadsell, Maria-Florina Balcan, Hsuan-Tien Lin, editors, Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual. 2020.

Authors

Thomas W. Anthony

Tom Eccles

Andrea Tacchetti

János Kramár

Ian M. Gemp

Thomas C. Hudson

Nicolas Porcel

Marc Lanctot

Julien Pérolat

Richard Everett

Satinder Singh

Thore Graepel

Yoram Bachrach
