Safe Policy Improvement by Minimizing Robust Baseline Regret

Mohammad Ghavamzadeh, Marek Petrik, Yinlam Chow. Safe Policy Improvement by Minimizing Robust Baseline Regret. In Daniel D. Lee, Masashi Sugiyama, Ulrike V. Luxburg, Isabelle Guyon, Roman Garnett, editors, Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, December 5-10, 2016, Barcelona, Spain. pages 2298-2306, 2016.

Authors

Mohammad Ghavamzadeh

Marek Petrik

Yinlam Chow
