Bridging the gap between intensional and extensional query evaluation in probabilistic databases

Abhay Jha, Dan Olteanu, Dan Suciu. Bridging the gap between intensional and extensional query evaluation in probabilistic databases. In Ioana Manolescu, Stefano Spaccapietra, Jens Teubner, Masaru Kitsuregawa, Alain Léger, Felix Naumann, Anastasia Ailamaki, Fatma Özcan, editors, EDBT 2010, 13th International Conference on Extending Database Technology, Lausanne, Switzerland, March 22-26, 2010, Proceedings. Volume 426 of ACM International Conference Proceeding Series, pages 323-334, ACM, 2010. [doi]

Abstract

There are two broad approaches to query evaluation over probabilistic databases: (1) Intensional Methods proceed by manipulating expressions over symbolic events associated with uncertain tuples. This approach is very general and can be applied to any query, but requires an expensive postprocessing phase, which involves some general-purpose probabilistic inference. (2) Extensional Methods, on the other hand, evaluate the query by translating operations over symbolic events to a query plan; extensional methods scale well, but they are restricted to safe queries.

In this paper, we bridge this gap by proposing an approach that can translate the evaluation of any query into extensional operators, followed by some post-processing that requires probabilistic inference. Our approach uses characteristics of the data to adapt smoothly between the two evaluation strategies. If the query is safe or becomes safe because of the data instance, then the evaluation is completely extensional and inside the database. If the query/data combination departs from the ideal setting of a safe query, then some intensional processing is performed, whose complexity depends only on the distance from the ideal setting.