RedCode: Risky Code Execution and Generation Benchmark for Code Agents

Chengquan Guo, Xun Liu, Chulin Xie, Andy Zhou, Yi Zeng, Zinan Lin, Dawn Song, Bo Li. RedCode: Risky Code Execution and Generation Benchmark for Code Agents. In Amir Globersons, Lester Mackey, Danielle Belgrave, Angela Fan, Ulrich Paquet, Jakub M. Tomczak, Cheng Zhang, editors, Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, NeurIPS 2024, Vancouver, BC, Canada, December 10 - 15, 2024. 2024.

Authors

Chengquan Guo

Xun Liu

Chulin Xie

Andy Zhou

Yi Zeng

Zinan Lin

Dawn Song

Bo Li
