PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning

Alekh Agarwal, Mikael Henaff, Sham M. Kakade, Wen Sun. PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning. In Hugo Larochelle, Marc'Aurelio Ranzato, Raia Hadsell, Maria-Florina Balcan, Hsuan-Tien Lin, editors, Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual. 2020.

Authors

Alekh Agarwal

Mikael Henaff

Sham M. Kakade

Wen Sun
