A learning algorithm for the finite-time two-armed bandit problem

Mitsuo Sato, Kenichi Abe, Hiroshi Takeda. A learning algorithm for the finite-time two-armed bandit problem. IEEE Transactions on Systems, Man, and Cybernetics, Part A, 14(3):528-534, 1984.

Abstract

Abstract is missing.