Massively scalable near duplicate detection in streams of documents using MDSH

Paul Logasa Bogen, Christopher T. Symons, Amber McKenzie, Robert M. Patton, Robert E. Gillen. Massively scalable near duplicate detection in streams of documents using MDSH. In Xiaohua Hu, Tsau Young Lin, Vijay Raghavan, Benjamin W. Wah, Ricardo A. Baeza-Yates, Geoffrey Fox, Cyrus Shahabi, Matthew Smith, Qiang Yang, Rayid Ghani, Wei Fan, Ronny Lempel, Raghunath Nambiar, editors, Proceedings of the 2013 IEEE International Conference on Big Data, 6-9 October 2013, Santa Clara, CA, USA, pages 480-486. IEEE, 2013.
