Lost source provenance

Jing Zhang, H. V. Jagadish. Lost source provenance. In Ioana Manolescu, Stefano Spaccapietra, Jens Teubner, Masaru Kitsuregawa, Alain Léger, Felix Naumann, Anastasia Ailamaki, Fatma Özcan, editors, EDBT 2010, 13th International Conference on Extending Database Technology, Lausanne, Switzerland, March 22-26, 2010, Proceedings. Volume 426 of ACM International Conference Proceeding Series, pages 311-322, ACM, 2010.

Abstract

As the use of derived information has grown in recent years, the importance of provenance has been recognized, and there has been a great deal of effort devoted to developing techniques to identify individual source tuples used in the derivation of any result tuple. Often, however, the source database may have been updated since the result was derived, and the source tuples of interest are not in the database any more. In such situations, the provenance management system has to reconstruct relevant historical fragments of the source database as they were at derivation time. In this paper, we develop techniques to address this problem. Our experimental assessment shows that these techniques do so efficiently, and with low storage overhead.