Reconciling while Tolerating Disagreement in Collaborative Data Sharing

Penn collection
Database Research Group (CIS)
Subject
databases
data integration
data sharing
peer-to-peer systems
collaborative data sharing
orchestra
reconciliation
transactions
updates
Abstract

In many data sharing settings, such as within the biological and biomedical communities, global data consistency is not always attainable: different sites' data may be dirty, uncertain, or even controversial. Collaborators are willing to share their data, and in many cases they also want to selectively import data from others, but must occasionally diverge when they disagree about uncertain or controversial facts or values. For this reason, traditional data sharing and data integration approaches are not applicable, since they require a globally consistent data instance. Additionally, many of these approaches do not allow participants to make updates; if they do, concurrency control algorithms or inconsistency repair techniques must be used to ensure a consistent view of the data for all users. In this paper, we develop and present a fully decentralized model of collaborative data sharing, in which participants publish their data on an ad hoc basis and simultaneously reconcile updates with those published by others. Individual updates are associated with provenance information, and each participant accepts only updates with a sufficient authority ranking, meaning that each participant may have a different (though conceptually overlapping) data instance. We define a consistency semantics for database instances under this model of disagreement, present algorithms that perform reconciliation for distributed clusters of participants, and demonstrate their ability to handle typical update and conflict loads in settings involving the sharing of curated data.

Date of presentation
2006-06-27
Conference name
Database Research Group (CIS)
Comments
Postprint version. Copyright ACM 2006. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data, pages 13-24. Publisher URL: http://doi.acm.org/10.1145/1142473.1142476