Bias no more: high-probability data-dependent regret bounds for adversarial bandits and MDPs

Chung-wei Lee, Haipeng Luo, Chen-Yu Wei, Mengxiao Zhang. Bias no more: high-probability data-dependent regret bounds for adversarial bandits and MDPs. In Hugo Larochelle, Marc'Aurelio Ranzato, Raia Hadsell, Maria-Florina Balcan, Hsuan-Tien Lin, editors, Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual. 2020.

Authors

Chung-wei Lee

Haipeng Luo

Chen-Yu Wei

Mengxiao Zhang
