Optimal Estimation of Policy Gradient via Double Fitted Iteration

Chengzhuo Ni, Ruiqi Zhang, Xiang Ji, Xuezhou Zhang, Mengdi Wang. Optimal Estimation of Policy Gradient via Double Fitted Iteration. In Kamalika Chaudhuri, Stefanie Jegelka, Le Song, Csaba Szepesvári, Gang Niu, Sivan Sabato, editors, International Conference on Machine Learning, ICML 2022, 17-23 July 2022, Baltimore, Maryland, USA. Volume 162 of Proceedings of Machine Learning Research, pages 16724-16783, PMLR, 2022.

@inproceedings{NiZJZW22,
  title = {Optimal Estimation of Policy Gradient via Double Fitted Iteration},
  author = {Chengzhuo Ni and Ruiqi Zhang and Xiang Ji and Xuezhou Zhang and Mengdi Wang},
  year = {2022},
  url = {https://proceedings.mlr.press/v162/ni22b.html},
  researchr = {https://researchr.org/publication/NiZJZW22},
  pages = {16724--16783},
  booktitle = {International Conference on Machine Learning, ICML 2022, 17-23 July 2022, Baltimore, Maryland, USA},
  editor = {Kamalika Chaudhuri and Stefanie Jegelka and Le Song and Csaba Szepesvári and Gang Niu and Sivan Sabato},
  volume = {162},
  series = {Proceedings of Machine Learning Research},
  publisher = {PMLR},
}