Off-Policy Evaluation and Learning from Logged Bandit Feedback: Error Reduction via Surrogate Policy

Yuan Xie, Boyi Liu, Qiang Liu 0001, Zhaoran Wang, Yuan Zhou, Jian Peng 0001. Off-Policy Evaluation and Learning from Logged Bandit Feedback: Error Reduction via Surrogate Policy. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019. OpenReview.net, 2019.

