LPPG-RL: Lexicographically Projected Policy Gradient Reinforcement Learning with Subproblem Exploration

Ruiyu Qiu, Rui Wang, Guanghui Yang, Xiang Li, Zhijiang Shao. LPPG-RL: Lexicographically Projected Policy Gradient Reinforcement Learning with Subproblem Exploration. In Sven Koenig, Chad Jenkins, Matthew E. Taylor, editors, Fortieth AAAI Conference on Artificial Intelligence, Thirty-Eighth Conference on Innovative Applications of Artificial Intelligence, Sixteenth Symposium on Educational Advances in Artificial Intelligence, AAAI 2026, Singapore, January 20-27, 2026. pages 25009-25017, AAAI Press, 2026.

Authors

Ruiyu Qiu

Rui Wang

Guanghui Yang

Xiang Li

Zhijiang Shao
