Diagnosing Bias and Instability in LLM Evaluation: A Scalable Pairwise Meta-Evaluator

Catalin Anghel, Andreea Alexandra Anghel, Emilia Pecheanu, Adina Cocu, Adrian Istrate, Constantin Adrian Andrei. Diagnosing Bias and Instability in LLM Evaluation: A Scalable Pairwise Meta-Evaluator. Information, 16(8):652, 2025.

Abstract

Abstract is missing.