Website Properties in Relation to the Quality of Text Extracted for Web Corpora

Vít Suchomel, Jan Kraus. Website Properties in Relation to the Quality of Text Extracted for Web Corpora. In Aleš Horák, Pavel Rychlý, Adam Rambousek, editors, The 15th Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2021, Karlova Studánka, Czech Republic, December 10-12, 2021. Pages 167-175, Tribun EU, 2021.
