Schema Mediation for Large-Scale Semantic Data Sharing

Penn collection
Departmental Papers (CIS)
Subject
peer data management
data integration
schema mediation
web and databases
Author
Halevy, Alon Y
Suciu, Dan
Tatarinov, Igor
Abstract

Intuitively, data management and data integration tools should be well suited for exchanging information in a semantically meaningful way. Unfortunately, they suffer from two significant problems: they typically require a common and comprehensive schema design before they can be used to store or share information, and they are difficult to extend because schema evolution is heavyweight and may break backward compatibility. As a result, many large-scale data sharing tasks are more easily facilitated by non-database-oriented tools that have little support for semantics. The goal of the peer data management system (PDMS) is to address this need: we propose the use of a decentralized, easily extensible data management architecture in which any user can contribute new data, schema information, or even mappings between other peers' schemas. PDMSs represent a natural step beyond data integration systems, replacing their single logical schema with an interlinked collection of semantic mappings between peers' individual schemas. This paper considers the problem of schema mediation in a PDMS. Our first contribution is a flexible language for mediating between peer schemas that extends known data integration formalisms to our more complex architecture. We precisely characterize the complexity of query answering for our language. Next, we describe a reformulation algorithm for our language that generalizes both global-as-view and local-as-view query answering algorithms. Then we describe several methods for optimizing the reformulation algorithm and an initial set of experiments studying its performance. Finally, we define and consider several global problems in managing semantic mappings in a PDMS.

Publication date
2005-03-01
Comments
Postprint version. Published in VLDB Journal : the International Journal on Very Large Data Bases, Volume 14, Issue 1, March 2005, pages 68-83. The original publication is available at www.springerlink.com. Publisher URL: http://dx.doi.org/10.1007/s00778-003-0116-y