Pathfinder: XQuery Compilation Techniques for Relational Database Targets

Jens Teubner. Pathfinder: XQuery Compilation Techniques for Relational Database Targets. In Alfons Kemper, Harald Schöning, Thomas Rose, Matthias Jarke, Thomas Seidl, Christoph Quix, Christoph Brochhaus, editors, Datenbanksysteme in Business, Technologie und Web (BTW 2007), 12. Fachtagung des GI-Fachbereichs Datenbanken und Informationssysteme (DBIS), Proceedings, 7.-9. März 2007, Aachen, Germany. Volume 103 of LNI, pages 465-474, GI, 2007.

Abstract

Relational database systems are highly efficient hosts for table-shaped data. It is all the more interesting to see how a careful inspection of both the XML tree structure and the W3C XQuery language definition can turn relational databases into fast and scalable XML processors.

This work shows how the deliberate choice of a relational tree encoding makes the XML data model (ordered, unranked trees) accessible to relational database systems. Efficient XPath-based access to these data is enabled by means of staircase join, a join operator that injects full tree awareness into the relational database kernel. A loop-lifting compiler translates XQuery expressions into purely algebraic query plans. The representation of iteration (i.e., the XQuery FLWOR construct) in terms of set-oriented algebra primitives forms the core of this compiler. Together, the techniques we describe lead to unprecedented XQuery evaluation scalability in the multi-gigabyte XML range. Pathfinder is an open-source implementation of a purely relational XQuery processor.
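
To make the idea of a relational tree encoding concrete, the following is a minimal Python sketch, not Pathfinder's actual code. The abstract does not name the encoding, so the sketch assumes a preorder/postorder rank encoding in which each node becomes one row (pre, post, tag). The descendant step imitates two ideas commonly associated with staircase join: pruning context nodes already covered by an earlier context node, and ending each scan as soon as it leaves the current subtree. All names (`Node`, `encode`, `descendant_step`) are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Row = Tuple[int, int, str]  # (pre rank, post rank, tag): an assumed encoding


@dataclass
class Node:
    tag: str
    children: List["Node"] = field(default_factory=list)


def encode(root: Node) -> List[Row]:
    """Flatten a tree into rows (pre, post, tag); the list index equals pre."""
    rows: List[Row] = []
    next_post = 0

    def visit(node: Node) -> None:
        nonlocal next_post
        pre = len(rows)
        rows.append((pre, -1, node.tag))  # post rank is filled in after the children
        for child in node.children:
            visit(child)
        rows[pre] = (pre, next_post, node.tag)
        next_post += 1

    visit(root)
    return rows


def descendant_step(doc: List[Row], context: List[int]) -> List[int]:
    """Evaluate the descendant axis for a set of context nodes (given by pre rank).

    Simplified illustration: the result is duplicate-free and in document order.
    """
    result: List[int] = []
    last_post = -1  # post rank of the most recently kept context node
    for c in sorted(set(context)):
        c_post = doc[c][1]
        if c_post < last_post:
            continue  # pruning: c lies inside an earlier context node's subtree
        last_post = c_post
        pre = c + 1
        # Descendants of c are exactly the rows that follow c with a smaller post rank;
        # the first row with a larger post rank ends the subtree, so the scan stops.
        while pre < len(doc) and doc[pre][1] < c_post:
            result.append(pre)
            pre += 1
    return result


# Example document: <a><b><c/><d/></b><e><f/></e></a>
doc = encode(Node("a", [Node("b", [Node("c"), Node("d")]), Node("e", [Node("f")])]))
hits = descendant_step(doc, [1, 2, 4])  # context nodes b, c, e
print([doc[p][2] for p in hits])
```

In this sketch the context node `c` is skipped because it lies inside the subtree of `b`, so each document row is visited at most once per step. A production operator would run inside the database kernel on a sorted table rather than on a Python list.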