An XML Query Engine for Network-Bound Data

Ives, Zachary G; Halevy, Alon Y; Weld, Daniel S

dc.contributor.author	Ives, Zachary G
dc.contributor.author	Halevy, Alon Y
dc.contributor.author	Weld, Daniel S
dc.date	2023-05-16T22:26:06.000
dc.date.accessioned	2023-05-22T12:46:03Z
dc.date.available	2023-05-22T12:46:03Z
dc.date.issued	2002-12-01
dc.date.submitted	2005-04-04T12:58:43-07:00
dc.description.abstract	XML has become the lingua franca for data exchange and integration across administrative and enterprise boundaries. Nearly all data providers are adding XML import or export capabilities, and standard XML Schemas and DTDs are being promoted for all types of data sharing. The ubiquity of XML has removed one of the major obstacles to integrating data from widely disparate sources -- namely, the heterogeneity of data formats. However, general-purpose integration of data across the wide area also requires a query processor that can query data sources on demand, receive streamed XML data from them, and combine and restructure the data into new XML output -- while providing good performance for both batch-oriented and ad-hoc, interactive queries. This is the goal of the Tukwila data integration system, the first system that focuses on network-bound, dynamic XML data sources. In contrast to previous approaches, which must read, parse, and often store entire XML objects before querying them, Tukwila can return query results even as the data is streaming into the system. Tukwila is built with a new system architecture that extends adaptive query processing and relational-engine techniques into the XML realm, as facilitated by a pair of operators that incrementally evaluate a query’s input path expressions as data is read. In this paper, we describe the Tukwila architecture and its novel aspects, and we experimentally demonstrate that Tukwila provides better overall query performance and faster initial answers than existing systems, and has excellent scalability.
dc.description.comments	Postprint version. Published in VLDB Journal : The International Journal on Very Large Data Bases, Volume 11, Number 4, December 2002, pages 380-402. The original publication is available at www.springerlink.com. Publisher URL: http://dx.doi.org/10.1007/s00778-002-0078-5
dc.identifier.uri	https://repository.upenn.edu/handle/20.500.14332/6147
dc.legacy.articleid	1117
dc.legacy.fulltexturl	https://repository.upenn.edu/cgi/viewcontent.cgi?article=1117&context=cis_papers&unstamped=1
dc.source.issue	121
dc.source.journal	Departmental Papers (CIS)
dc.source.peerreviewed	true
dc.source.status	published
dc.subject.other	XML
dc.subject.other	query processing
dc.subject.other	data streams
dc.subject.other	data integration
dc.subject.other	web and databases
dc.title	An XML Query Engine for Network-Bound Data
dc.type	Article
digcom.contributor.author	isAuthorOfPublication\|email:zives@cis.upenn.edu\|institution:University of Pennsylvania\|Ives, Zachary G
digcom.contributor.author	Halevy, Alon Y
digcom.contributor.author	Weld, Daniel S
digcom.identifier	cis_papers/121
digcom.identifier.contextkey	54652
digcom.identifier.submissionpath	cis_papers/121
digcom.type	article
dspace.entity.type	Publication
relation.isAuthorOfPublication	2ed74aa5-1c6d-4c69-8716-f2134575f50c
relation.isAuthorOfPublication.latestForDiscovery	2ed74aa5-1c6d-4c69-8716-f2134575f50c
upenn.schoolDepartmentCenter	Departmental Papers (CIS)

Files

Original bundle

Name: d18.pdf
Size: 318.84 KB
Format: Adobe Portable Document Format

Collection

Articles
