Sample-efficient Nonstationary Policy Evaluation for Contextual Bandits

Miroslav Dudík, Dumitru Erhan, John Langford, Lihong Li. Sample-efficient Nonstationary Policy Evaluation for Contextual Bandits. In Nando de Freitas, Kevin P. Murphy, editors, Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence, Catalina Island, CA, USA, August 14-18, 2012. Pages 247-254, AUAI Press, 2012.

