Bigger Isn't Better: The Ethical and Scientific Vices of Extra-Large Datasets in Language Models

Trystan S. Goetze, Darren Abramson. Bigger Isn't Better: The Ethical and Scientific Vices of Extra-Large Datasets in Language Models. In Oshani Seneviratne, Vivek Singh, Ana Freire, Jar-der Luo, editors, WebSci '21: 13th ACM Web Science Conference 2021, Virtual Event, United Kingdom, 21-25 June, 2021, Companion Publication, pages 69-75. ACM, 2021.
