Global optimality of softmax policy gradient with single hidden layer neural networks in the mean-field regime

Andrea Agazzi, Jianfeng Lu. Global optimality of softmax policy gradient with single hidden layer neural networks in the mean-field regime. In 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3-7, 2021. OpenReview.net, 2021.

