Rewrite techniques for performance optimization of schema matching processes

Eric Peukert, Henrike Berthold, Erhard Rahm. Rewrite techniques for performance optimization of schema matching processes. In Ioana Manolescu, Stefano Spaccapietra, Jens Teubner, Masaru Kitsuregawa, Alain Léger, Felix Naumann, Anastasia Ailamaki, Fatma Özcan, editors, EDBT 2010, 13th International Conference on Extending Database Technology, Lausanne, Switzerland, March 22-26, 2010, Proceedings. Volume 426 of ACM International Conference Proceeding Series, pages 453-464, ACM, 2010.

Abstract

A recurring manual task in data integration, ontology alignment, or model management is finding mappings between complex metadata structures. To reduce the manual effort, many matching algorithms for semi-automatically computing mappings have been introduced.

Unfortunately, current matching systems perform poorly when matching large schemas. Recently, some systems have tried to tackle the performance problem within individual matching approaches. However, none of them has developed solutions at the level of matching processes.

In this paper, we introduce a novel rewrite-based optimization technique that is generally applicable to different types of matching processes. We introduce filter-based rewrite rules similar to predicate push-down in query optimization. In addition, we introduce a modeling tool and recommendation system for rewriting matching processes.
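To make the analogy with predicate push-down concrete, the following is a minimal sketch of what a filter-based rewrite could look like; it is not the rule set or implementation from the paper. All names (`name_sim`, `CountingMatcher`, `parallel_process`, `filtered_process`), the two toy matchers, the averaging aggregation, and the thresholds are assumptions chosen for illustration. The baseline process runs a cheap and an expensive matcher on every element pair; the rewritten process applies the cheap matcher first and passes only the surviving pairs to the expensive one.

```python
from difflib import SequenceMatcher
from itertools import product


def name_sim(a, b):
    """Cheap matcher (hypothetical): string similarity of element names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


class CountingMatcher:
    """Stand-in for an expensive matcher; counts how often it is invoked."""

    def __init__(self):
        self.calls = 0

    def __call__(self, a, b):
        self.calls += 1
        ta, tb = set(a.split("_")), set(b.split("_"))
        return len(ta & tb) / len(ta | tb)  # Jaccard overlap of name tokens


def parallel_process(src, tgt, expensive, select=0.6):
    """Baseline: both matchers run on every pair, then average and select."""
    result = {}
    for a, b in product(src, tgt):
        score = (name_sim(a, b) + expensive(a, b)) / 2
        if score >= select:
            result[(a, b)] = score
    return result


def filtered_process(src, tgt, expensive, select=0.6):
    """Rewritten: a filter after the cheap matcher prunes hopeless pairs."""
    # Assumption: scores lie in [0, 1] and are averaged, so a pair whose
    # cheap score is below 2*select - 1 can never reach `select`.
    filt = 2 * select - 1
    result = {}
    for a, b in product(src, tgt):
        cheap = name_sim(a, b)
        if cheap < filt:
            continue  # pruned: the expensive matcher is never called
        score = (cheap + expensive(a, b)) / 2
        if score >= select:
            result[(a, b)] = score
    return result


SRC = ["customer_name", "order_id", "zip_code", "fax"]
TGT = ["client_name", "order_number", "zip_code"]

slow, fast = CountingMatcher(), CountingMatcher()
baseline = parallel_process(SRC, TGT, slow)
rewritten = filtered_process(SRC, TGT, fast)
print("same mappings:", baseline == rewritten)
print("expensive calls:", slow.calls, "vs", fast.calls)
```

In this toy setting the filter threshold is derived from the selection threshold, so pruning cannot change the selected mappings. With other aggregation functions or more aggressive thresholds that guarantee would not hold automatically, and the trade-off between speed and result quality would have to be checked empirically.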

Our evaluation on matching large web service message types shows significant performance improvements without losing the quality of automatically computed results.