Kitten: a tool for normalizing HTML and extracting its textual content

Mathieu-Henri Falco, Véronique Moriceau, Anne Vilnat. Kitten: a tool for normalizing HTML and extracting its textual content. In Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Ugur Dogan, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, editors, Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC-2012), Istanbul, Turkey, May 23-25, 2012. Pages 2261-2267, European Language Resources Association (ELRA), 2012.
