IMMBA: Integrated Mixed Models with Bootstrap Analysis - A Statistical Framework for Robust LLM Evaluation

Vinícius Di Oliveira, Pedro Carvalho Brom, Li Weigang. IMMBA: Integrated Mixed Models with Bootstrap Analysis - A Statistical Framework for Robust LLM Evaluation. In Karl Aberer, Josep Domingo-Ferrer, Massimo Marchiori, editors, Proceedings of the 21st International Conference on Web Information Systems and Technologies, WEBIST 2025, Marbella, Spain, October 21-23, 2025. Pages 92-102, SCITEPRESS, 2025.
