Posterior Value Functions: Hindsight Baselines for Policy Gradient Methods


Chris Nota, Philip Thomas, Bruno C. da Silva. Posterior Value Functions: Hindsight Baselines for Policy Gradient Methods. In Marina Meila, Tong Zhang, editors, Proceedings of the 38th International Conference on Machine Learning, ICML 2021, 18-24 July 2021, Virtual Event. Volume 139 of Proceedings of Machine Learning Research, pages 8238-8247, PMLR, 2021.

