Learning Near-Optimal Policies with Bellman-Residual Minimization Based Fitted Policy Iteration and a Single Sample Path

András Antos, Csaba Szepesvári, Rémi Munos. Learning Near-Optimal Policies with Bellman-Residual Minimization Based Fitted Policy Iteration and a Single Sample Path. In Gábor Lugosi, Hans-Ulrich Simon, editors, Learning Theory, 19th Annual Conference on Learning Theory, COLT 2006, Pittsburgh, PA, USA, June 22-25, 2006, Proceedings. Volume 4005 of Lecture Notes in Computer Science, pages 574-588, Springer, 2006.