Feel-Good Thompson Sampling for Contextual Bandits and Reinforcement Learning

Tong Zhang. Feel-Good Thompson Sampling for Contextual Bandits and Reinforcement Learning. SIMODS, 4(2):834-857, June 2022.
