Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path

András Antos, Csaba Szepesvári, Rémi Munos. Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. Machine Learning, 71(1):89-129, 2008.
