A Spinning Join That Does Not Get Dizzy

Philip Werner Frey, Romulo Goncalves, Martin L. Kersten, Jens Teubner. A Spinning Join That Does Not Get Dizzy. In 2010 International Conference on Distributed Computing Systems, ICDCS 2010, Genova, Italy, June 21-25, 2010. pages 283-292, IEEE Computer Society, 2010. [doi]

Abstract

As network infrastructures with 10 Gb/s bandwidth and beyond have become pervasive and as cost advantages of large commodity-machine clusters continue to increase, research and industry strive to exploit the available processing performance for large-scale database processing tasks.

In this work we look at the use of high-speed networks for distributed join processing. We propose Data Roundabout as a lightweight transport layer that uses Remote Direct Memory Access (RDMA) to gain access to the throughput opportunities in modern networks. The essence of Data Roundabout is a ring-shaped network in which each host stores one portion of a large database instance. We leverage the available bandwidth to (continuously) pump data through the high-speed network.

Based on Data Roundabout, we demonstrate cyclo-join, which exploits the cycling flow of data to execute distributed joins. The study uses different join algorithms (hash join and sort-merge join) to expose the pitfalls and the advantages of each algorithm in the data cycling arena. The experiments show the potential of a large distributed main-memory cache glued together with RDMA into a novel distributed database architecture.