Whittle index based Q-learning for restless bandits with average reward

Konstantin E. Avrachenkov, Vivek S. Borkar. Whittle index based Q-learning for restless bandits with average reward. Automatica, 139:110186, 2022.

Abstract

Abstract is missing.