Evaluate & Evaluation on the Hub: Better Best Practices for Data and Model Measurements

Leandro von Werra, Lewis Tunstall, Abhishek Thakur, Sasha Luccioni, Tristan Thrush, Aleksandra Piktus, Felix Marty, Nazneen Rajani, Victor Mustar, Helen Ngo. Evaluate & Evaluation on the Hub: Better Best Practices for Data and Model Measurements. In Wanxiang Che, Ekaterina Shutova, editors, Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 - System Demonstrations, Abu Dhabi, UAE, December 7-11, 2022. Pages 128-136, Association for Computational Linguistics, 2022.