Batch learning from logged bandit feedback through counterfactual risk minimization

Adith Swaminathan, Thorsten Joachims. Batch learning from logged bandit feedback through counterfactual risk minimization. Journal of Machine Learning Research, 16:1731-1755, 2015.

Authors

Adith Swaminathan

Thorsten Joachims
