A Heuristic Q-Learning Architecture for Fully Exploring a World and Deriving an Optimal Policy by Model-Based Planning

Gang Zhao, Shoji Tatsumi, Ruoying Sun. A Heuristic Q-Learning Architecture for Fully Exploring a World and Deriving an Optimal Policy by Model-Based Planning. In ICRA, pages 2078-2083, 1999.
