MapDupReducer: detecting near duplicates over massive datasets

Chaokun Wang, Jianmin Wang, Xuemin Lin, Wei Wang, Haixun Wang, Hongsong Li, Wanpeng Tian, Jun Xu, Rui Li. MapDupReducer: detecting near duplicates over massive datasets. In Ahmed K. Elmagarmid, Divyakant Agrawal, editors, Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2010, Indianapolis, Indiana, USA, June 6-10, 2010, pages 1119-1122. ACM, 2010.
