SpotSigs: robust and efficient near duplicate detection in large web collections

Martin Theobald, Jonathan Siddharth, Andreas Paepcke. SpotSigs: robust and efficient near duplicate detection in large web collections. In Sung-Hyon Myaeng, Douglas W. Oard, Fabrizio Sebastiani, Tat-Seng Chua, Mun-Kew Leong, editors, Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2008, Singapore, July 20-24, 2008. Pages 563-570, ACM, 2008.

Abstract

Abstract is missing.