A Tool for Benchmarking Large Language Models' Robustness in Assessing the Realism of Driving Scenarios

Jiahui Wu, Chengjie Lu, Aitor Arrieta, Shaukat Ali. A Tool for Benchmarking Large Language Models' Robustness in Assessing the Realism of Driving Scenarios. In 2nd IEEE/ACM International Conference on AI-powered Software, AIware 2025, Seoul, Republic of Korea, November 19-20, 2025, pages 263-267. IEEE, 2025.
