Feel-Good Thompson Sampling for Contextual Bandits and Reinforcement Learning

Tong Zhang. Feel-Good Thompson Sampling for Contextual Bandits and Reinforcement Learning. SIMODS, 4(2):834-857, June 2022.
