Combining policy gradient and Q-learning

Brendan O'Donoghue, Rémi Munos, Koray Kavukcuoglu, Volodymyr Mnih. Combining policy gradient and Q-learning. In 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings. OpenReview.net, 2017.

Abstract

Abstract is missing.