Minimax Off-Policy Evaluation for Multi-Armed Bandits

Cong Ma, Banghua Zhu, Jiantao Jiao, Martin J. Wainwright. Minimax Off-Policy Evaluation for Multi-Armed Bandits. IEEE Transactions on Information Theory, 68(8):5314-5339, 2022.

@article{MaZJW22,
  title = {Minimax Off-Policy Evaluation for Multi-Armed Bandits},
  author = {Cong Ma and Banghua Zhu and Jiantao Jiao and Martin J. Wainwright},
  year = {2022},
  doi = {10.1109/TIT.2022.3162335},
  url = {https://doi.org/10.1109/TIT.2022.3162335},
  researchr = {https://researchr.org/publication/MaZJW22},
  journal = {IEEE Transactions on Information Theory},
  volume = {68},
  number = {8},
  pages = {5314--5339},
}