Batch learning from logged bandit feedback through counterfactual risk minimization

Adith Swaminathan, Thorsten Joachims. Batch learning from logged bandit feedback through counterfactual risk minimization. Journal of Machine Learning Research, 16:1731-1755, 2015.
