Characterizing the Confidence of Large Language Model-Based Automatic Evaluation Metrics

Rickard Stureborg, Dimitris Alikaniotis, Yoshi Suhara. Characterizing the Confidence of Large Language Model-Based Automatic Evaluation Metrics. In Yvette Graham, Matthew Purver, editors, Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2024 - Volume 2: Short Papers, St. Julian's, Malta, March 17-22, 2024, pages 76-89. Association for Computational Linguistics, 2024.

Abstract

Abstract is missing.