Why and Where: A Characterization of Data Provenance

Penn collection
Departmental Papers (CIS)
Author
Buneman, Peter
Tan, Wang-Chiew
Abstract

With the proliferation of database views and curated databases, the issue of data provenance (where a piece of data came from and the process by which it arrived in the database) is becoming increasingly important, especially in scientific databases, where understanding provenance is crucial to the accuracy and currency of data. In this paper we describe an approach to computing provenance when the data of interest has been created by a database query. We adopt a syntactic approach and present results for a general data model that applies to relational databases as well as to hierarchical data such as XML. A novel aspect of our work is a distinction between "why" provenance, which refers to the source data that had some influence on the existence of the data, and "where" provenance, which refers to the location(s) in the source databases from which the data was extracted.
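
As an informal illustration of the why/where distinction (a minimal sketch, not the paper's formal syntactic model), the Python below annotates the result of a simple selection query. The table contents, the `Loc` type, and the `high_earners` function are hypothetical names introduced only for this example. Each output value records the source rows that caused it to appear ("why") and the exact source location it was copied from ("where").

```python
from typing import NamedTuple


class Loc(NamedTuple):
    """A location in a source table (hypothetical helper, not from the paper)."""
    table: str
    row: int
    field: str


# Hypothetical source data, invented for this illustration.
EMPLOYEES = [
    {"name": "Ann", "dept": "CIS", "salary": 120},
    {"name": "Bob", "dept": "CIS", "salary": 90},
    {"name": "Cat", "dept": "Math", "salary": 130},
]


def high_earners(rows, threshold):
    """Roughly: SELECT name FROM employees WHERE salary > threshold."""
    results = []
    for i, row in enumerate(rows):
        if row["salary"] > threshold:
            results.append({
                "value": row["name"],
                # "why": the source row whose presence (including its
                # salary) made this output exist.
                "why": {i},
                # "where": the single location the output value was
                # copied from; the salary field is not part of it.
                "where": {Loc("employees", i, "name")},
            })
    return results


results = high_earners(EMPLOYEES, 100)
for r in results:
    print(r["value"], sorted(r["why"]), sorted(r["where"]))
```

In this sketch the salary field contributes to the "why" of each output (it decides whether the row qualifies) but never to the "where", since only the name field is copied into the result.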

Date of presentation
2001-01-01
Conference name
International Conference on Database Theory (ICDT 2001)
Conference dates
2023-05-16T22:32:38.000
Comments
Postprint version. Published in Lecture Notes in Computer Science, Volume 1973, International Conference on Database Theory (ICDT 2001), pages 316-330. Publisher URL: http://www.springerlink.com/link.asp?id=edf0k68ccw3a22hu