Sample-efficient Nonstationary Policy Evaluation for Contextual Bandits

Miroslav Dudík, Dumitru Erhan, John Langford, Lihong Li. Sample-efficient Nonstationary Policy Evaluation for Contextual Bandits. In Nando de Freitas, Kevin P. Murphy, editors, Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence, Catalina Island, CA, USA, August 14-18, 2012. Pages 247-254, AUAI Press, 2012.
