Pathfinder: XQuery - The Relational Way

Peter A. Boncz, Torsten Grust, Maurice van Keulen, Stefan Manegold, Jan Rittinger, Jens Teubner. Pathfinder: XQuery - The Relational Way. In Klemens Böhm, Christian S. Jensen, Laura M. Haas, Martin L. Kersten, Per-Åke Larson, Beng Chin Ooi, editors, Proceedings of the 31st International Conference on Very Large Data Bases, Trondheim, Norway, August 30 - September 2, 2005. pages 1322-1325, ACM, 2005.


Relational query processors are probably the best understood (as well as the best engineered) query engines available today. Although carefully tuned to process instances of the relational model (tables of tuples), these processors can also provide a foundation for the evaluation of “alien” (non-relational) query languages: if a relational encoding of the alien data model and its associated query language is given, the RDBMS may act like a special-purpose processor for the new language.

This demonstration features our XQuery compiler Pathfinder, the continuation of our earlier work on a purely relational XPath and XQuery processing stack, in which we developed relational encodings and processing strategies for the tree-shaped XML data model. The Pathfinder project explores how far we can push the idea of using mature RDBMS technology to design and build a full-fledged XQuery implementation. The demonstration will show that this line of research was, and still is, worth pursuing: based on the extensible relational database kernel MonetDB, Pathfinder provides highly efficient XQuery technology that scales beyond 10 GB XML input instances on commodity hardware.

Pathfinder requires only local extensions to the underlying DBMS’s kernel, such as the staircase join operator. Join recognition logic in our compiler, together with careful consideration of the order properties of relational operators, allows for effective optimizations that turn MonetDB into a highly efficient XQuery engine.
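
To make the idea of a relational encoding of the tree-shaped XML data model more concrete, the following is a minimal Python sketch. It is not taken from the abstract: it assumes a pre/size/level table layout (one row per node, holding its preorder rank, subtree size, and depth), handles element nodes only, and evaluates the descendant axis with a plain range selection rather than with the staircase join operator. The names `Row`, `encode`, `descendants`, and `children` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional
import xml.etree.ElementTree as ET


@dataclass(frozen=True)
class Row:
    pre: int    # preorder rank of the node (its position in document order)
    size: int   # number of nodes in the subtree below this node
    level: int  # depth in the tree, root = 0
    tag: str    # element name


def encode(xml_text: str) -> List[Row]:
    """Flatten an XML document into a table with one row per element.

    Assumption: only element nodes are encoded; text and attribute
    nodes are ignored to keep the sketch short.
    """
    rows: List[Optional[Row]] = []

    def visit(elem: ET.Element, level: int) -> None:
        pre = len(rows)
        rows.append(None)  # reserve the slot; size is known only after the subtree
        for child in elem:
            visit(child, level + 1)
        rows[pre] = Row(pre, len(rows) - pre - 1, level, elem.tag)

    visit(ET.fromstring(xml_text), 0)
    return rows  # every reserved slot has been filled at this point


def descendants(table: List[Row], context_pre: int) -> List[Row]:
    """Descendant axis as a range predicate on the pre column."""
    c = table[context_pre]
    return [r for r in table if c.pre < r.pre <= c.pre + c.size]


def children(table: List[Row], context_pre: int) -> List[Row]:
    """Child axis: descendants that sit exactly one level deeper."""
    c = table[context_pre]
    return [r for r in descendants(table, context_pre) if r.level == c.level + 1]


if __name__ == "__main__":
    table = encode("<a><b><c/><d/></b><e/></a>")
    for row in table:
        print(row)
    print([r.tag for r in descendants(table, 1)])  # descendants of <b>
    print([r.tag for r in children(table, 0)])     # children of <a>
```

In this sketch an XPath axis step becomes an ordinary selection over a table, which is the kind of work a relational engine already does well. A specialized operator such as the staircase join would additionally exploit tree properties that this naive scan does not.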