Evaluation of Document Deduplication Algorithms for Large Text Corpora

Johannes Leveling, Lennard Helmer, Benny Jörg Stein, Dennis Wegener, Zoha Sheikh, Elanton Fernandes, Hammam Abdelwahab. Evaluation of Document Deduplication Algorithms for Large Text Corpora. In Giuseppe Nicosia, Varun Ojha, Sven Giesselbach, Panos M. Pardalos, Renato Umeton, editors, Machine Learning, Optimization, and Data Science - 10th International Conference, LOD 2024, Castiglione della Pescaia, Italy, September 22-25, 2024, Revised Selected Papers, Part I. Volume 15508 of Lecture Notes in Computer Science, pages 390-404, Springer, 2024.

Authors

Johannes Leveling

Lennard Helmer

Benny Jörg Stein

Dennis Wegener

Zoha Sheikh

Elanton Fernandes

Hammam Abdelwahab
