Optimistic posterior sampling for reinforcement learning: worst-case regret bounds

Shipra Agrawal, Randy Jia. Optimistic posterior sampling for reinforcement learning: worst-case regret bounds. In Isabelle Guyon, Ulrike von Luxburg, Samy Bengio, Hanna M. Wallach, Rob Fergus, S. V. N. Vishwanathan, Roman Garnett, editors, Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 4-9 December 2017, Long Beach, CA, USA. pages 1184-1194, 2017.
