Learning a dynamic policy by using policy gradient: application to biped walking

Takamitsu Matsubara, Jun Morimoto, Jun Nakanishi, Masa-aki Sato, Kenji Doya. Learning a dynamic policy by using policy gradient: application to biped walking. Systems and Computers in Japan, 38(4):25-38, 2007.
