A Minimax Learning Approach to Off-Policy Evaluation in Confounded Partially Observable Markov Decision Processes

Chengchun Shi, Masatoshi Uehara, Jiawei Huang, Nan Jiang. A Minimax Learning Approach to Off-Policy Evaluation in Confounded Partially Observable Markov Decision Processes. In Kamalika Chaudhuri, Stefanie Jegelka, Le Song, Csaba Szepesvári, Gang Niu, Sivan Sabato, editors, International Conference on Machine Learning, ICML 2022, 17-23 July 2022, Baltimore, Maryland, USA. Volume 162 of Proceedings of Machine Learning Research, pages 20057-20094, PMLR, 2022.
