Counterfactual Risk Minimization: Learning from Logged Bandit Feedback

Adith Swaminathan, Thorsten Joachims. Counterfactual Risk Minimization: Learning from Logged Bandit Feedback. In Francis R. Bach, David M. Blei, editors, Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6-11 July 2015. Volume 37 of JMLR Proceedings, pages 814-823, JMLR.org, 2015.

@inproceedings{SwaminathanJ15-0,
  title = {Counterfactual Risk Minimization: Learning from Logged Bandit Feedback},
  author = {Adith Swaminathan and Thorsten Joachims},
  year = {2015},
  url = {http://jmlr.org/proceedings/papers/v37/swaminathan15.html},
  researchr = {https://researchr.org/publication/SwaminathanJ15-0},
  cites = {0},
  citedby = {0},
  pages = {814--823},
  booktitle = {Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6-11 July 2015},
  editor = {Francis R. Bach and David M. Blei},
  volume = {37},
  series = {JMLR Proceedings},
  publisher = {JMLR.org},
}