publications: - title: "A SQL:1999 code generator for the Pathfinder XQuery compiler" author: - name: "Torsten Grust" link: "http://www-db.informatik.uni-tuebingen.de/team/grust" - name: "Manuel Mayr" link: "http://www-db.informatik.uni-tuebingen.de/team/mayr" - name: "Jan Rittinger" link: "http://www-db.informatik.uni-tuebingen.de/team/rittinger" - name: "Sherif Sakr" link: "https://researchr.org/alias/sherif-sakr" - name: "Jens Teubner" link: "http://people.inf.ethz.ch/jteubner/" year: "2007" doi: "http://doi.acm.org/10.1145/1247480.1247642" abstract: "The Pathfinder XQuery compiler has been enhanced by a new code generator that can target any SQL:1999-compliant relational database system (RDBMS). This code generator marks an important next step towards truly relational XQuery processing, a branch of database technology that aims to turn RDBMSs into highly efficient XML and XQuery processors without the need to invade the relational database kernel. Pathfinder, a retargetable front-end compiler, translates input XQuery expressions into DAG-shaped relational algebra plans. The code generator then turns these plans into sequences of either SQL:1999 statements or view definitions which jointly implement the (sometimes intricate) XQuery semantics. In a sense, this demonstration thus lets relational algebra and SQL swap their traditional roles in database query processing. The result is a code generator that (1) supports an almost complete dialect of XQuery, (2) can target any RDBMS with a SQL:1999 language interface, and (3) exhibits quite promising performance characteristics when run against high-volume XML data as well as complex XQuery expressions." 
links: doi: "http://doi.acm.org/10.1145/1247480.1247642" tags: - "semantics" - "XQuery" - "translation" - "completeness" - "data-flow language" - "relational database" - "XML" - "XML Schema" - "SQL" - "process algebra" - "relational algebra" - "data-flow" - "compiler" - " algebra" - "database" - "query language" researchr: "https://researchr.org/publication/GrustMRST07" cites: 8 citedby: 1 pages: "1162-1164" booktitle: "SIGMOD" kind: "inproceedings" key: "GrustMRST07" - title: "MonetDB/XQuery: a fast XQuery processor powered by a relational engine" author: - name: "Peter A. Boncz" link: "http://homepages.cwi.nl/~boncz/" - name: "Torsten Grust" link: "http://www-db.informatik.uni-tuebingen.de/team/grust" - name: "Maurice van Keulen" link: "http://www.vf.utwente.nl/~keulen/" - name: "Stefan Manegold" link: "http://homepages.cwi.nl/~manegold/" - name: "Jan Rittinger" link: "http://www-db.informatik.uni-tuebingen.de/team/rittinger" - name: "Jens Teubner" link: "http://people.inf.ethz.ch/jteubner/" year: "2006" doi: "http://doi.acm.org/10.1145/1142473.1142527" abstract: "Relational XQuery systems try to re-use mature relational data management infrastructures to create fast and scalable XML database technology. This paper describes the main features, key contributions, and lessons learned while implementing such a system. Its architecture consists of (i) a range-based encoding of XML documents into relational tables, (ii) a compilation technique that translates XQuery into a basic relational algebra, (iii) a restricted (order) property-aware peephole relational query optimization strategy, and (iv) a mapping from XML update statements into relational updates. Thus, this system implements all essential XML database functionalities (rather than a single feature) such that we can learn from the full consequences of our architectural decisions. 
While implementing this system, we had to extend the state-of-the-art with a number of new technical contributions, such as loop-lifted staircase join and efficient relational query evaluation strategies for XQuery theta-joins with existential semantics. These contributions as well as the architectural lessons learned are also deemed valuable for other relational back-end engines. The performance and scalability of the resulting system is evaluated on the XMark benchmark up to data sizes of 11 GB. The performance section also provides an extensive comparison of all major XMark results published previously, which confirm that the goal of purely relational XQuery processing, namely speed and scalability, was met." links: doi: "http://doi.acm.org/10.1145/1142473.1142527" tags: - "optimization" - "semantics" - "relational data base" - "rule-based" - "staircase join" - "XQuery" - "translation" - "Loop Lifting" - "relational database" - "XML" - "XML Schema" - "architecture" - "process algebra" - "XPath" - "relational algebra" - "data-flow" - " algebra" - "database" - "context-aware" - "MonetDB/XQuery" researchr: "https://researchr.org/publication/BonczGKMRT06" cites: 0 citedby: 9 pages: "479-490" booktitle: "SIGMOD" kind: "inproceedings" key: "BonczGKMRT06" - title: "Pathfinder: XQuery - The Relational Way" author: - name: "Peter A. Boncz" link: "http://homepages.cwi.nl/~boncz/" - name: "Torsten Grust" link: "http://www-db.informatik.uni-tuebingen.de/team/grust" - name: "Maurice van Keulen" link: "http://www.vf.utwente.nl/~keulen/" - name: "Stefan Manegold" link: "http://homepages.cwi.nl/~manegold/" - name: "Jan Rittinger" link: "http://www-db.informatik.uni-tuebingen.de/team/rittinger" - name: "Jens Teubner" link: "http://people.inf.ethz.ch/jteubner/" year: "2005" doi: "http://www.vldb.org/conf/2005/papers/p1322-boncz.pdf" abstract: "Relational query processors are probably the best understood (as well as the best engineered) query engines available today. 
Although carefully tuned to process instances of the relational model (tables of tuples), these processors can also provide a foundation for the evaluation of “alien” (non-relational) query languages: if a relational encoding of the alien data model and its associated query language is given, the RDBMS may act like a special-purpose processor for the new language. This demonstration features our XQuery compiler Pathfinder, the continuation of our earlier work on a purely relational XPath and XQuery processing stack in which we developed relational encodings and processing strategies for the tree-shaped XML data model. The Pathfinder project is an exploration of how far we can push the idea of using mature RDBMS technology to design and build a full-fledged XQuery implementation. The demonstration will show that this line of research was and still is worth to be followed: based on the extensible relational database kernel MonetDB, Pathfinder provides highly efficient and scalable XQuery technology that scales beyond 10 GB XML input instances on commodity hardware. Pathfinder requires only local extensions to the underlying DBMS’s kernel, such as the staircase join operator. A join recognition logic in our compiler, as well as a careful consideration of order properties of relational operators, allow for effective optimizations that turn MonetDB into a highly efficient XQuery engine." 
links: doi: "http://www.vldb.org/conf/2005/papers/p1322-boncz.pdf" tags: - "optimization" - "relational data base" - "rule-based" - "XQuery" - "data-flow language" - "relational database" - "meta-model" - "XML" - "modeling language" - "XML Schema" - "language modeling" - "design research" - "XPath" - "language design" - "data-flow" - "compiler" - "database" - "logic" - "Meta-Environment" - "design" - "process modeling" - "extensible language" - "query language" researchr: "https://researchr.org/publication/BonczGKMRT05" cites: 10 citedby: 1 pages: "1322-1325" booktitle: "VLDB" kind: "inproceedings" key: "BonczGKMRT05" - title: "Why off-the-shelf RDBMSs are better at XPath than you might expect" author: - name: "Torsten Grust" link: "http://www-db.informatik.uni-tuebingen.de/team/grust" - name: "Jan Rittinger" link: "http://www-db.informatik.uni-tuebingen.de/team/rittinger" - name: "Jens Teubner" link: "http://people.inf.ethz.ch/jteubner/" year: "2007" doi: "http://doi.acm.org/10.1145/1247480.1247591" abstract: "To compensate for the inherent impedance mismatch between the relational data model (tables of tuples) and XML (ordered, unranked trees), tree join algorithms have become the prevalent means to process XML data in relational databases, most notably the TwigStack, structural join, and staircase join algorithms. However, the addition of these algorithms to existing systems depends on a significant invasion of the underlying database kernel, an option intolerable for most database vendors. Here, we demonstrate that we can achieve comparable XPath performance without touching the heart of the system. We carefully exploit existing database functionality and accelerate XPath navigation by purely relational means: partitioned B-trees bring access costs to secondary storage to a minimum, while aggregation functions avoid an expensive computation and removal of duplicate result nodes to comply with the XPath semantics. 
Experiments carried out on IBM DB2 confirm that our approach can turn off-the-shelf database systems into efficient XPath processors." links: doi: "http://doi.acm.org/10.1145/1247480.1247591" tags: - "semantics" - "relational database" - "meta-model" - "XML" - "XML Schema" - "Partitioned B-trees" - "XPath" - "data-flow" - "database" - "Meta-Environment" - "partitioning" - "process modeling" - "systematic-approach" researchr: "https://researchr.org/publication/GrustRT07%3A0" cites: 0 citedby: 4 pages: "949-958" booktitle: "SIGMOD" kind: "inproceedings" key: "GrustRT07:0" - title: "MonetDB/XQuery-Consistent and Efficient Updates on the Pre/Post Plane" author: - name: "Peter A. Boncz" link: "http://homepages.cwi.nl/~boncz/" - name: "Jan Flokstra" link: "http://wwwhome.cs.utwente.nl/~flokstra/" - name: "Torsten Grust" link: "http://www-db.informatik.uni-tuebingen.de/team/grust" - name: "Maurice van Keulen" link: "http://www.vf.utwente.nl/~keulen/" - name: "Stefan Manegold" link: "http://homepages.cwi.nl/~manegold/" - name: "K. Sjoerd Mullender" link: "http://homepages.cwi.nl/~sjoerd/" - name: "Jan Rittinger" link: "http://www-db.informatik.uni-tuebingen.de/team/rittinger" - name: "Jens Teubner" link: "http://people.inf.ethz.ch/jteubner/" year: "2006" doi: "http://dx.doi.org/10.1007/11687238_89" abstract: "Relational XQuery processors aim at leveraging mature relational DBMS query processing technology to provide scalability and efficiency. To achieve this goal, various storage schemes have been proposed to encode the tree structure of XML documents in flat relational tables. Basically, two classes can be identified: (1) encodings using fixed-length surrogates, like the preorder ranks in the pre/post encoding [5] or the equivalent pre/size/level encoding [8], and (2) encodings using variable-length surrogates, like, e.g., ORDPATH [9] or P-PBiTree [12]. 
Recent research [1] showed a clear advantage of the former for efficient evaluation of XPath location steps, exploiting techniques like cheap node order tests, positional lookup, and node skipping in staircase join [7]. However, once updates are involved, variable-length surrogates are often considered the better choice, mainly as a straightforward implementation of structural XML updates using fixed-length surrogates faces two performance bottlenecks: (i) high physical cost (the preorder ranks of all nodes following the update position must be modified—on average 50% of the document), and (ii) low transaction concurrency (updating the size of all ancestor nodes causes lock contention on the document root). In [4], we presented techniques that allow an efficient and ACID-compliant implementation of XML updates also on the pre/post (respectively pre/size/level encoding) without sacrificing its superior XPath (i.e., read-only) performance. This demonstration describes in detail, how we successfully implemented these techniques in MonetDB/XQuery [2, 1], an XML database system with full-fledged XQuery support. The system consists of the Pathfinder compiler that translates and optimizes XQuery into relational algebra [6], on top of the high-performance MonetDB relational database engine [3]." 
links: doi: "http://dx.doi.org/10.1007/11687238_89" tags: - "optimization" - "XQuery" - "translation" - "relational database" - "XML" - "XML Schema" - "process algebra" - "XPath" - "testing" - "relational algebra" - "compiler" - " algebra" - "database" researchr: "https://researchr.org/publication/BonczFGKMMRT06" cites: 0 citedby: 0 pages: "1190-1193" booktitle: "edbt" kind: "inproceedings" key: "BonczFGKMMRT06" - title: "Data-intensive XQuery debugging with instant replay" author: - name: "Torsten Grust" link: "http://www-db.informatik.uni-tuebingen.de/team/grust" - name: "Jan Rittinger" link: "http://www-db.informatik.uni-tuebingen.de/team/rittinger" - name: "Jens Teubner" link: "http://people.inf.ethz.ch/jteubner/" year: "2007" doi: "http://doi.acm.org/10.1145/1328158.1328162" abstract: "We explore the design and implementation of Rover, a postmortem debugger for XQuery. Rather than being based on the traditional breakpoint model, Rover acknowledges XQuery's nature as a functional language: the debugger follows a declarative debugging paradigm in which a user is enabled to observe the values of selected XQuery subexpressions. Rover has been designed to hook into Pathfinder, an XQuery compiler that emits relational algebra plans for evaluation on commodity relational database back-ends. The debugger instruments the subject query with fn:trace() calls which, at query runtime, populate database tables with relational representations of XQuery item sequences. Thanks to Pathfinder's loop-lifting compilation strategy, a Rover trace (1) may span multiple XQuery for iteration scopes and (2) allows for interactive debugging sessions that can arbitrarily replay iterations in a unique forward/backward fashion. Since the query runtime as well as the debugger are database-supported, Rover is scalable and supports the observation of very data-intensive XQuery expressions." 
links: doi: "http://doi.acm.org/10.1145/1328158.1328162" tags: - "relational data base" - "rule-based" - "XQuery" - "data-flow language" - "relational database" - "meta-model" - "modeling language" - "language modeling" - "relational algebra" - "language design" - "data-flow" - "debugging" - "compiler" - " algebra" - "database" - "Meta-Environment" - "design" - "query language" researchr: "https://researchr.org/publication/GrustRT07%3A1" cites: 0 citedby: 0 booktitle: "ximep" kind: "inproceedings" key: "GrustRT07:1" - title: "Pathfinder: XQuery Off the Relational Shelf" author: - name: "Torsten Grust" link: "http://www-db.informatik.uni-tuebingen.de/team/grust" - name: "Jan Rittinger" link: "http://www-db.informatik.uni-tuebingen.de/team/rittinger" - name: "Jens Teubner" link: "http://people.inf.ethz.ch/jteubner/" year: "2008" doi: "http://sites.computer.org/debull/A08dec/pathfinder.pdf" abstract: "The Pathfinder project makes inventive use of relational database technology―originally developed to process data of strictly tabular shape―to construct efficient database-supported XML and XQuery processors. Pathfinder targets database engines that implement a set-oriented mode of query execution: many off-the-shelf traditional database systems make for suitable XQuery runtime environments, but a number of off-beat storage back-ends fit that bill as well. While Pathfinder has been developed with a close eye on the XQuery semantics, some of the techniques that we will review here will be generally useful to evaluate XQuery-style iterative languages on database back-ends. 
" links: doi: "http://sites.computer.org/debull/A08dec/pathfinder.pdf" tags: - "semantics" - "XQuery" - "relational XQuery" - "data-flow language" - "relational database" - "XML" - "XML Schema" - "data-flow" - "reviewing" - "database" - "Meta-Environment" - "query language" researchr: "https://researchr.org/publication/GrustRT08" cites: 26 citedby: 0 journal: "DEBU" volume: "31" number: "4" pages: "7-14" kind: "article" key: "GrustRT08" - title: "Recursion in XQuery: put your distributivity safety belt on" author: - name: "Loredana Afanasiev" link: "https://researchr.org/alias/loredana-afanasiev" - name: "Torsten Grust" link: "http://www-db.informatik.uni-tuebingen.de/team/grust" - name: "Maarten Marx" link: "http://staff.science.uva.nl/~marx/" - name: "Jan Rittinger" link: "http://www-db.informatik.uni-tuebingen.de/team/rittinger" - name: "Jens Teubner" link: "http://people.inf.ethz.ch/jteubner/" year: "2009" doi: "http://doi.acm.org/10.1145/1516360.1516401" abstract: "We introduce a controlled form of recursion in XQuery, an inflationary fixed point operator, familiar from the context of relational databases. This operator imposes restrictions on the expressible types of recursion, but it is sufficiently versatile to capture a wide range of interesting use cases, including Regular XPath and its core transitive closure operator. While the optimization of general user-defined recursive functions in XQuery appears elusive, we describe how inflationary fixed points can be efficiently evaluated, provided that the recursive XQuery expressions are distributive. We test distributivity syntactically and algebraically, and provide experimental evidence that XQuery processors can benefit substantially from this mode of evaluation." 
links: doi: "http://doi.acm.org/10.1145/1516360.1516401" tags: - "optimization" - "XQuery" - "relational database" - "Recursion" - "process algebra" - "XPath" - "testing" - "relational algebra" - " algebra" - "context-aware" - "Fixed Point" researchr: "https://researchr.org/publication/AfanasievGMRT09" cites: 29 citedby: 1 pages: "345-356" booktitle: "edbt" kind: "inproceedings" key: "AfanasievGMRT09" - title: "Pathfinder: A Relational Query Optimizer Explores XQuery Terrain" author: - name: "Jan Rittinger" link: "http://www-db.informatik.uni-tuebingen.de/team/rittinger" - name: "Jens Teubner" link: "http://people.inf.ethz.ch/jteubner/" - name: "Torsten Grust" link: "http://www-db.informatik.uni-tuebingen.de/team/grust" year: "2007" doi: "http://www.btw2007.de/paper/p617.pdf" links: doi: "http://www.btw2007.de/paper/p617.pdf" tags: - "optimization" - "XQuery" researchr: "https://researchr.org/publication/RittingerTG07" cites: 7 citedby: 1 pages: "617-620" booktitle: "btw" kind: "inproceedings" key: "RittingerTG07" - title: "An Inflationary Fixed Point Operator in XQuery" author: - name: "Loredana Afanasiev" link: "https://researchr.org/alias/loredana-afanasiev" - name: "Torsten Grust" link: "http://www-db.informatik.uni-tuebingen.de/team/grust" - name: "Maarten Marx" link: "http://staff.science.uva.nl/~marx/" - name: "Jan Rittinger" link: "http://www-db.informatik.uni-tuebingen.de/team/rittinger" - name: "Jens Teubner" link: "http://people.inf.ethz.ch/jteubner/" year: "2008" doi: "http://dx.doi.org/10.1109/ICDE.2008.4497604" abstract: "We introduce a controlled form of recursion in XQuery, an inflationary fixed point operator, familiar from the context of relational databases. This operator imposes restrictions on the expressible types of recursion, but we show that it is sufficiently versatile to capture a wide range of interesting use cases, including Regular XPath and its core transitive closure operator. 
While the optimization of general user-defined recursive functions in XQuery appears elusive, we describe how inflationary fixed points can be efficiently evaluated, provided that the recursive XQuery expressions are distributive. We test distributivity syntactically and algebraically, and provide experimental evidence that XQuery processors can benefit substantially from this mode of evaluation." links: doi: "http://dx.doi.org/10.1109/ICDE.2008.4497604" successor: "https://researchr.org/publication/AfanasievGMRT09" tags: - "optimization" - "XQuery" - "relational database" - "Recursion" - "process algebra" - "XPath" - "testing" - "relational algebra" - " algebra" - "context-aware" - "Fixed Point" researchr: "https://researchr.org/publication/AfanasievGMRT08" cites: 11 citedby: 1 pages: "1504-1506" booktitle: "icde" kind: "inproceedings" key: "AfanasievGMRT08" - title: "eXrQuy: Order Indifference in XQuery" author: - name: "Torsten Grust" link: "http://www-db.informatik.uni-tuebingen.de/team/grust" - name: "Jan Rittinger" link: "http://www-db.informatik.uni-tuebingen.de/team/rittinger" - name: "Jens Teubner" link: "http://people.inf.ethz.ch/jteubner/" year: "2007" doi: "http://dx.doi.org/10.1109/ICDE.2007.367868" abstract: "There are more spots than immediately obvious in XQuery expressions where order is immaterial for evaluation―this affects most notably, but not exclusively, expressions in the scope of unordered { } and the argument of fn:unordered(). Clearly, performance gains are lurking behind such expression contexts but the prevalent impact of order on the XQuery semantics reaches deep into any compliant XQuery processor, making it non-trivial to set this potential free. Here, we describe how the relational XQuery compiler Pathfinder uniformly exploits such order indifference in a purely algebraic fashion: Pathfinder-emitted plans faithfully implement the required XQuery order semantics but (locally) ignore order wherever this is admitted." 
links: doi: "http://dx.doi.org/10.1109/ICDE.2007.367868" tags: - "semantics" - "XQuery" - "process algebra" - "relational algebra" - "compiler" - " algebra" - "context-aware" researchr: "https://researchr.org/publication/GrustRT07" cites: 19 citedby: 5 pages: "226-235" booktitle: "icde" kind: "inproceedings" key: "GrustRT07"