GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning

Jian Zhao, Runze Liu, Kaiyan Zhang, Zhimu Zhou, Junqi Gao, Dong Li, Jiafei Lyu, Zhouyi Qian, Biqing Qi, Xiu Li, Bowen Zhou. GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning. In Sven Koenig, Chad Jenkins, Matthew E. Taylor, editors, Fortieth AAAI Conference on Artificial Intelligence, Thirty-Eighth Conference on Innovative Applications of Artificial Intelligence, Sixteenth Symposium on Educational Advances in Artificial Intelligence, AAAI 2026, Singapore, January 20-27, 2026. Pages 34932-34940, AAAI Press, 2026.

Abstract

Abstract is missing.