Integrating Network-Bound XML Data

Loading...
Thumbnail Image
Penn collection
Departmental Papers (CIS)
Degree type
Discipline
Subject
Funder
Grant number
License
Copyright date
Distributor
Related resources
Author
Halevy, Alon Y
Weld, Daniel S
Contributor
Abstract

Although XML was originally envisioned as a replacement for HTML on the web, to this point it has instead been used primarily as a format for on-demand interchange of data between applications and enterprises. The web is rather sparsely populated with static XML documents, but nearly every data management application today can export XML data. There is great interest in integrating such exported data across applications and administrative boundaries, and as a result, efficient techniques for integrating XML data across local- and wide-area networks are an important research focus. In this paper, we provide an overview of the Tukwila data integration system, which is based on the first XML query engine designed specifically for processing network-bound XML data sources. In contrast to previous approaches, which must read, parse, and often store XML data before querying it, the Tukwila XML engine can return query results even as the data is streaming into the system. Tukwila features a new system architecture that extends relational query processing techniques, such as pipelining and adaptive query processing, into the XML realm. We compare the focus of the Tukwila project to that of other XML research systems, and then we present our system architecture and novel query operators, such as the x-scan operator. We conclude with a description of our current research directions in extending XML-based adaptive query processing.

Advisor
Date Range for Data Collection (Start Date)
Date Range for Data Collection (End Date)
Digital Object Identifier
Series name and number
Publication date
2001-06-01
Journal title
Volume number
Issue number
Publisher
Publisher DOI
Journal Issue
Comments
Copyright 2001 IEEE. Reprinted from Bulletin of the Technical Committee on Data Engineering, IEEE Computer Society, Volume 24, Issue 2, June 2001, pages 20-26. This material is posted here with permission of the IEEE. Such permission of the IEEE does not in any way imply IEEE endorsement of any of the University of Pennsylvania's products or services. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by writing to pubs-permissions@ieee.org. By choosing to view this document, you agree to all provisions of the copyright laws protecting it. NOTE: At the time of publication, author Zachary Ives was affiliated with the University of Washington. Currently (April 2005), he is a faculty member in the Department of Computer and Information Science at the University of Pennsylvania.
Copyright 2001 IEEE. Reprinted from Bulletin of the Technical Committee on Data Engineering, IEEE Computer Society, Volume 24, Issue 2, June 2001, pages 20-26. This material is posted here with permission of the IEEE. Such permission of the IEEE does not in any way imply IEEE endorsement of any of the University of Pennsylvania's products or services. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by writing to pubs-permissions@ieee.org. By choosing to view this document, you agree to all provisions of the copyright laws protecting it. NOTE: At the time of publication, author Zachary Ives was affiliated with the University of Washington. Currently (April 2005), he is a faculty member in the Department of Computer and Information Science at the University of Pennsylvania.
Recommended citation
Collection