Learning to Plan Variable Length Sequences of Actions with a Cascading Bandit Click Model of User Feedback


Anirban Santara, Gaurav Aggarwal, Shuai Li, Claudio Gentile. Learning to Plan Variable Length Sequences of Actions with a Cascading Bandit Click Model of User Feedback. In Gustau Camps-Valls, Francisco J. R. Ruiz, Isabel Valera, editors, International Conference on Artificial Intelligence and Statistics, AISTATS 2022, 28-30 March 2022, Virtual Event. Volume 151 of Proceedings of Machine Learning Research, pages 767-797, PMLR, 2022.

