Towards Minimax Policies for Online Linear Optimization with Bandit Feedback

Sébastien Bubeck, Nicolò Cesa-Bianchi, Sham M. Kakade. Towards Minimax Policies for Online Linear Optimization with Bandit Feedback. Journal of Machine Learning Research, 23, 2012.

Authors

Sébastien Bubeck

Nicolò Cesa-Bianchi

Sham M. Kakade