On Provenance Minimization

Penn collection
Departmental Papers (CIS)
Degree type
Discipline
Subject
Computer Sciences
Author
Amsterdamer, Yael
Deutch, Daniel
Milo, Tova
Abstract

Provenance information has proven very effective in capturing the computational process performed by queries, and has been used extensively as input to many advanced data management tools (e.g., view maintenance, trust assessment, or query answering in probabilistic databases). We study here the core of provenance information, namely the part of provenance that appears in the computation of every query equivalent to the given one. This provenance core is informative, as it describes the part of the computational process that is inherent to the query. It is also useful as a compact input to the above-mentioned data management tools. We study algorithms that, given a query, compute an equivalent query that realizes the core provenance for all tuples in its result, and we study these algorithms for queries of varying expressive power. Finally, we observe that, in general, one would not want to require database systems to evaluate a specific query that realizes the core provenance; rather, one would want to find, possibly off-line, the core provenance of a given tuple in the output (computed by an arbitrary equivalent query) without rewriting the query. We provide algorithms for such direct computation of the core provenance.
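To make the idea of equivalent queries carrying different provenance concrete, here is a minimal sketch. It assumes a polynomial ("how-provenance") model in which each input tuple has a token, joins multiply tokens, and alternative derivations add up; the abstract itself does not fix a model, so this is an assumption. The relation `R`, the tokens `t1` and `t2`, and all function names are hypothetical, and the code only illustrates the gap between two equivalent queries. It does not implement the paper's algorithms.

```python
from collections import Counter
from itertools import product

# Hypothetical annotated relation R(x, y): each tuple carries a provenance token.
R = {("a", "b"): "t1", ("a", "c"): "t2"}

def eval_self_join(rel):
    """Q(x) :- R(x, y), R(x, z). Each derivation contributes a product of two tokens."""
    out = {}
    for ((x1, _), p1), ((x2, _), p2) in product(rel.items(), repeat=2):
        if x1 == x2:
            monomial = tuple(sorted((p1, p2)))  # multiplication is commutative
            out.setdefault(x1, Counter())[monomial] += 1
    return out

def eval_projection(rel):
    """Q'(x) :- R(x, y). Equivalent to Q as a query, with one token per derivation."""
    out = {}
    for (x, _), p in rel.items():
        out.setdefault(x, Counter())[(p,)] += 1
    return out

def show(poly):
    """Render a {monomial: coefficient} mapping as a readable polynomial."""
    terms = []
    for monomial, coeff in sorted(poly.items()):
        body = "*".join(monomial)
        terms.append(body if coeff == 1 else f"{coeff}*{body}")
    return " + ".join(terms)

redundant = eval_self_join(R)
minimal = eval_projection(R)

print(show(redundant["a"]))  # t1*t1 + 2*t1*t2 + t2*t2
print(show(minimal["a"]))    # t1 + t2
```

Both queries return the same answer tuple `a`, yet the redundant self-join records four derivations while the projection records two. Under the assumed model, the smaller annotation is the kind of compact input the abstract has in mind when it speaks of the provenance core.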

Date of presentation
2011-06-13
Conference name
Departmental Papers (CIS)
Conference dates
2023-05-17T07:14:16.000
Comments
Yael Amsterdamer, Daniel Deutch, Tova Milo, and Val Tannen. 2011. On provenance minimization. In Proceedings of the thirtieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems (PODS '11). ACM, New York, NY, USA, 141-152. DOI=10.1145/1989284.1989303 http://doi.acm.org/10.1145/1989284.1989303 © ACM, 2011. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of the thirtieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems (2011), http://dx.doi.org/10.1145/1989284.1989303 Email permissions@acm.org