Text Similarity Measures in a Data Deduplication Pipeline for Customers Records

Witold Andrzejewski, Bartosz Bebel, Pawel Boinski, Mariusz Sienkiewicz, Robert Wrembel. Text Similarity Measures in a Data Deduplication Pipeline for Customers Records. In Enrico Gallinucci, Lukasz Golab, editors, Proceedings of the 25th International Workshop on Design, Optimization, Languages and Analytical Processing of Big Data (DOLAP) co-located with the 26th International Conference on Extending Database Technology and the 26th International Conference on Database Theory (EDBT/ICDT 2023), Ioannina, Greece, March 28, 2023. Volume 3369 of CEUR Workshop Proceedings, pages 33-42, CEUR-WS.org, 2023.

Authors

Witold Andrzejewski

Bartosz Bebel

Pawel Boinski

Mariusz Sienkiewicz

Robert Wrembel
