An XML Query Engine for Network-Bound Data

dc.contributor.authorIves, Zachary G
dc.contributor.authorHalevy, Alon Y
dc.contributor.authorWeld, Daniel S
dc.date2023-05-16T22:26:06.000
dc.date.accessioned2023-05-22T12:46:03Z
dc.date.available2023-05-22T12:46:03Z
dc.date.issued2002-12-01
dc.date.submitted2005-04-04T12:58:43-07:00
dc.description.abstractXML has become the lingua franca for data exchange and integration across administrative and enterprise boundaries. Nearly all data providers are adding XML import or export capabilities, and standard XML Schemas and DTDs are being promoted for all types of data sharing. The ubiquity of XML has removed one of the major obstacles to integrating data from widely disparate sources -- namely, the heterogeneity of data formats. However, general-purpose integration of data across the wide area also requires a query processor that can query data sources on demand, receive streamed XML data from them, and combine and restructure the data into new XML output -- while providing good performance for both batch-oriented and ad-hoc, interactive queries. This is the goal of the Tukwila data integration system, the first system that focuses on network-bound, dynamic XML data sources. In contrast to previous approaches, which must read, parse, and often store entire XML objects before querying them, Tukwila can return query results even as the data is streaming into the system. Tukwila is built with a new system architecture that extends adaptive query processing and relational-engine techniques into the XML realm, as facilitated by a pair of operators that incrementally evaluate a query’s input path expressions as data is read. In this paper, we describe the Tukwila architecture and its novel aspects, and we experimentally demonstrate that Tukwila provides better overall query performance and faster initial answers than existing systems, and has excellent scalability.
dc.description.commentsPostprint version. Published in VLDB Journal : The International Journal on Very Large Data Bases, Volume 11, Number 4, December 2002, pages 380-402. The original publication is available at www.springerlink.com. Publisher URL: http://dx.doi.org/10.1007/s00778-002-0078-5
dc.identifier.urihttps://repository.upenn.edu/handle/20.500.14332/6147
dc.legacy.articleid1117
dc.legacy.fulltexturlhttps://repository.upenn.edu/cgi/viewcontent.cgi?article=1117&context=cis_papers&unstamped=1
dc.source.issue121
dc.source.journalDepartmental Papers (CIS)
dc.source.peerreviewedtrue
dc.source.statuspublished
dc.subject.otherXML
dc.subject.otherquery processing
dc.subject.otherdata streams
dc.subject.otherdata integration
dc.subject.otherweb and databases
dc.titleAn XML Query Engine for Network-Bound Data
dc.typeArticle
digcom.contributor.authorisAuthorOfPublication|email:zives@cis.upenn.edu|institution:University of Pennsylvania|Ives, Zachary G
digcom.contributor.authorHalevy, Alon Y
digcom.contributor.authorWeld, Daniel S
digcom.identifiercis_papers/121
digcom.identifier.contextkey54652
digcom.identifier.submissionpathcis_papers/121
digcom.typearticle
dspace.entity.typePublication
relation.isAuthorOfPublication2ed74aa5-1c6d-4c69-8716-f2134575f50c
relation.isAuthorOfPublication.latestForDiscovery2ed74aa5-1c6d-4c69-8716-f2134575f50c
upenn.schoolDepartmentCenterDepartmental Papers (CIS)
Files
Original bundle
Name: d18.pdf
Size: 318.84 KB
Format: Adobe Portable Document Format