Value Penalized Q-Learning for Recommender Systems

Chengqian Gao, Ke Xu, Kuangqi Zhou, Lanqing Li, Xueqian Wang, Bo Yuan, Peilin Zhao. Value Penalized Q-Learning for Recommender Systems. In Enrique Amigó, Pablo Castells, Julio Gonzalo, Ben Carterette, J. Shane Culpepper, Gabriella Kazai, editors, SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11 - 15, 2022, pages 2008-2012. ACM, 2022. https://doi.org/10.1145/3477495.3531796

@inproceedings{GaoXZLWYZ22,
  title = {Value Penalized Q-Learning for Recommender Systems},
  author = {Chengqian Gao and Ke Xu and Kuangqi Zhou and Lanqing Li and Xueqian Wang and Bo Yuan and Peilin Zhao},
  year = {2022},
  doi = {10.1145/3477495.3531796},
  url = {https://doi.org/10.1145/3477495.3531796},
  researchr = {https://researchr.org/publication/GaoXZLWYZ22},
  cites = {0},
  citedby = {0},
  pages = {2008--2012},
  booktitle = {SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11 - 15, 2022},
  editor = {Enrique Amigó and Pablo Castells and Julio Gonzalo and Ben Carterette and J. Shane Culpepper and Gabriella Kazai},
  publisher = {ACM},
  isbn = {978-1-4503-8732-3},
}