Top-K Off-Policy Correction for a REINFORCE Recommender System

Minmin Chen, Alex Beutel, Paul Covington, Sagar Jain, Francois Belletti, Ed H. Chi. Top-K Off-Policy Correction for a REINFORCE Recommender System. In J. Shane Culpepper, Alistair Moffat, Paul N. Bennett, Kristina Lerman, editors, Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, WSDM 2019, Melbourne, VIC, Australia, February 11-15, 2019, pages 456-464. ACM, 2019. doi:10.1145/3289600.3290999

@inproceedings{ChenBCJBC19,
  title = {Top-K Off-Policy Correction for a REINFORCE Recommender System},
  author = {Minmin Chen and Alex Beutel and Paul Covington and Sagar Jain and Francois Belletti and Ed H. Chi},
  year = {2019},
  doi = {10.1145/3289600.3290999},
  url = {https://doi.org/10.1145/3289600.3290999},
  researchr = {https://researchr.org/publication/ChenBCJBC19},
  cites = {0},
  citedby = {0},
  pages = {456--464},
  booktitle = {Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, WSDM 2019, Melbourne, VIC, Australia, February 11-15, 2019},
  editor = {J. Shane Culpepper and Alistair Moffat and Paul N. Bennett and Kristina Lerman},
  publisher = {ACM},
  isbn = {978-1-4503-5940-5},
}