Removing Spam from Web Corpora Through Supervised Learning and Semi-manual Classification of Web Sites

Vít Suchomel. Removing Spam from Web Corpora Through Supervised Learning and Semi-manual Classification of Web Sites. In Aleš Horák, Pavel Rychlý, Adam Rambousek, editors, The 14th Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2020, Brno (on-line), Czech Republic, December 8-10, 2020. Pages 113-123, Tribun EU, 2020.
