Suffix tree construction algorithms on modern hardware

Dimitris Tsirogiannis, Nick Koudas. Suffix tree construction algorithms on modern hardware. In Ioana Manolescu, Stefano Spaccapietra, Jens Teubner, Masaru Kitsuregawa, Alain Léger, Felix Naumann, Anastasia Ailamaki, Fatma Özcan, editors, EDBT 2010, 13th International Conference on Extending Database Technology, Lausanne, Switzerland, March 22-26, 2010, Proceedings. Volume 426 of ACM International Conference Proceeding Series, pages 263-274, ACM, 2010.

Abstract

Suffix trees are indexing structures that enhance the performance of numerous string processing algorithms. In this paper, we propose cache-conscious suffix tree construction algorithms that are tailored to chip multiprocessor (CMP) architectures. The proposed algorithms utilize a novel sample-based cache partitioning algorithm to improve cache performance and exploit on-chip parallelism on CMPs. Furthermore, several compression techniques are applied to effectively trade space for cache performance.
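
The abstract does not describe how the sample-based partitioning works, so the following is only a rough sketch of the general idea behind sample-based partitioning of suffixes, not the authors' algorithm. It assumes that a random sample of fixed-length suffix prefixes is sorted to choose splitters, and that every suffix is then routed to a bucket by binary search over those splitters. The function names and the parameters `num_buckets`, `sample_size`, and `prefix_len` are hypothetical.

```python
import bisect
import random


def sample_splitters(text, num_buckets, sample_size, prefix_len, seed=0):
    """Pick up to num_buckets - 1 splitters from a random sample of prefixes.

    Hypothetical helper: the sampling scheme is an assumption for illustration.
    """
    rng = random.Random(seed)
    n = len(text)
    positions = rng.sample(range(n), min(sample_size, n))
    sample = sorted(text[p:p + prefix_len] for p in positions)
    if not sample:
        return []
    step = len(sample) / num_buckets
    splitters = []
    for i in range(1, num_buckets):
        candidate = sample[int(i * step)]
        # Skip repeated splitters so no bucket is empty by construction.
        if not splitters or candidate != splitters[-1]:
            splitters.append(candidate)
    return splitters


def partition_suffixes(text, splitters, prefix_len):
    """Assign each suffix start position to a bucket by its leading prefix."""
    buckets = [[] for _ in range(len(splitters) + 1)]
    for pos in range(len(text)):
        key = text[pos:pos + prefix_len]
        # bisect_right counts the splitters that are <= key.
        buckets[bisect.bisect_right(splitters, key)].append(pos)
    return buckets


if __name__ == "__main__":
    text = "mississippi$" * 3
    splitters = sample_splitters(text, num_buckets=4, sample_size=16, prefix_len=3)
    for index, bucket in enumerate(partition_suffixes(text, splitters, 3)):
        print(index, len(bucket))
```

Because every suffix in one bucket sorts before every suffix in the next, the buckets can be processed independently. In a cache-conscious setting one might choose the bucket count so that each bucket's working set fits in cache, and hand different buckets to different cores; whether the paper does exactly this is not stated in the abstract.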

Through an extensive experimental evaluation using real text data from different domains, we demonstrate that the algorithms proposed herein exhibit better cache performance than their cache-unaware counterparts and effectively utilize all processing elements, achieving satisfactory speedup.