Chatbot Arena Estimate: towards a generalized performance benchmark for LLM capabilities

Lucas Spangher, Tianle Li, William F. Arnold, Nick Masiewicki, Xerxes Dotiwalla, Rama Kumar Pasumarthi, Peter Grabowski, Eugene Ie, Daniel Gruhl. Chatbot Arena Estimate: towards a generalized performance benchmark for LLM capabilities. In Weizhu Chen, Yi Yang, Mohammad Kachuee, Xue-Yong Fu, editors, Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2025 - Volume 3: Industry Track, Albuquerque, New Mexico, USA, April 30, 2025, pages 1016-1025. Association for Computational Linguistics, 2025.

