Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path

András Antos, Csaba Szepesvári, Rémi Munos. Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. Machine Learning, 71(1):89-129, 2008.
