Dynamic Intelligence Assessment: Benchmarking LLMs on the Road to AGI with a Focus on Model Confidence

Norbert Tihanyi, Tamás Bisztray, Richard A. Dubniczky, Rebeka Tóth, Bertalan Borsos, Bilel Cherif, Ridhi Jain, Lajos Muzsai, Mohamed Amine Ferrag, Ryan Marinelli, Lucas C. Cordeiro, Mérouane Debbah, Vasileios Mavroeidis, Audun Jøsang. Dynamic Intelligence Assessment: Benchmarking LLMs on the Road to AGI with a Focus on Model Confidence. In Wei Ding, Chang-Tien Lu, Fusheng Wang, Liping Di, Kesheng Wu, Jun Huan, Raghu Nambiar, Jundong Li, Filip Ilievski, Ricardo Baeza-Yates, Xiaohua Hu, editors, IEEE International Conference on Big Data, BigData 2024, Washington, DC, USA, December 15-18, 2024, pages 3313-3321. IEEE, 2024.