Controller exploitation-exploration reinforcement learning architecture for computing near-optimal policies

Erick Asiain, Julio B. Clempner, Alexander S. Poznyak. Controller exploitation-exploration reinforcement learning architecture for computing near-optimal policies. Soft Comput., 23(11):3591-3604, 2019.
