Index-based approximate XML joins

Sudipto Guha*, Nick Koudas, Divesh Srivastava, Ting Yu

*Corresponding author for this work

Research output: Contribution to conference › Paper › peer-review

9 Citations (Scopus)

Abstract

XML data integration tools are facing a variety of challenges for their efficient and effective operation. Among these is the requirement to handle a variety of inconsistencies or mistakes present in the data sets. In this paper we study the problem of integrating XML data sources through index-assisted join operations, using notions of approximate match in the structure and content of XML documents as the join predicate. We show how a well-known and widely deployed index structure, namely the R-tree, can be adopted to improve the performance of such operations. We propose novel search and join algorithms for R-trees adopted to index XML document collections. We also propose novel optimization objectives for R-tree construction, making R-trees better suited for this application.

Original language: English
Pages: 708-710
Number of pages: 3
Publication status: Published - 2003
Externally published: Yes
Event: Nineteenth International Conference on Data Engineering - Bangalore, India
Duration: 5 Mar 2003 → 8 Mar 2003

Conference

Conference: Nineteenth International Conference on Data Engineering
Country/Territory: India
City: Bangalore
Period: 5/03/03 → 8/03/03
