From a Smoking Gun to Spent Fuel: Principled Subsampling Methods for Building Big Language Data Corpora from Monitor Corpora

Jacqueline Hettel Tidwell. From a Smoking Gun to Spent Fuel: Principled Subsampling Methods for Building Big Language Data Corpora from Monitor Corpora. Data, 4(2):48, 2019.
