Learning Optimal Policies in Markov Decision Processes with Value Function Discovery

Martijn Onderwater, Sandjai Bhulai, Rob van der Mei. Learning Optimal Policies in Markov Decision Processes with Value Function Discovery. SIGMETRICS Performance Evaluation Review, 43(2):7-9, 2015.