Removing Spam from Web Corpora Through Supervised Learning and Semi-manual Classification of Web Sites

Vít Suchomel. Removing Spam from Web Corpora Through Supervised Learning and Semi-manual Classification of Web Sites. In Aleš Horák, Pavel Rychlý, Adam Rambousek, editors, The 14th Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2020, Brno (on-line), Czech Republic, December 8-10, 2020, pages 113-123. Tribun EU, 2020.

@inproceedings{Suchomel20,
  title = {Removing Spam from Web Corpora Through Supervised Learning and Semi-manual Classification of Web Sites},
  author = {Vít Suchomel},
  year = {2020},
  url = {http://nlp.fi.muni.cz/raslan/2020/paper8.pdf},
  researchr = {https://researchr.org/publication/Suchomel20},
  cites = {0},
  citedby = {0},
  pages = {113--123},
  booktitle = {The 14th Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2020, Brno (on-line), Czech Republic, December 8-10, 2020},
  editor = {Aleš Horák and Pavel Rychlý and Adam Rambousek},
  publisher = {Tribun EU},
  isbn = {978-80-263-1600-8},
}