A practical and effective sampling selection strategy for large scale deduplication

Guilherme Dal Bianco, Renata Galante, Carlos A. Heuser, Marcos André Gonçalves, Sérgio D. Canuto. A practical and effective sampling selection strategy for large scale deduplication. In 32nd IEEE International Conference on Data Engineering, ICDE 2016, Helsinki, Finland, May 16-20, 2016, pages 1518-1519. IEEE Computer Society, 2016.
