
Database Research Group (CIS)
Document Type
Conference Paper
Date of this Version
June 2008
Abstract
We present a formal framework for capturing the provenance of data appearing in XQuery views of XML. Building on previous work on relations and their (positive) query languages, we decorate unordered XML with annotations from commutative semirings and show that these annotations suffice for a large positive fragment of XQuery applied to this data. In addition to tracking provenance metadata, the framework can be used to represent and process XML with repetitions, incomplete XML, and probabilistic XML, and provides a basis for enforcing access control policies in security applications.
Each of these applications builds on our semantics for XQuery, which we present in several steps: we generalize the semantics of the Nested Relational Calculus (NRC) to handle semiring-annotated complex values, we extend it with a recursive type and structural recursion operator for trees, and we define a semantics for XQuery on annotated XML by translation into this calculus.
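The idea of semiring annotations can be illustrated on flat collections. The following is a minimal sketch, not the paper's calculus: it assumes an annotated set is a finite map from values to semiring elements, that union adds annotations, and that pairing multiplies them. The names `Semiring`, `NAT`, `BOOL`, `union`, and `product` are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Hashable


@dataclass(frozen=True)
class Semiring:
    """A commutative semiring (K, add, mul, zero, one); hypothetical helper type."""
    zero: Any
    one: Any
    add: Callable[[Any, Any], Any]
    mul: Callable[[Any, Any], Any]


# Natural numbers: annotations count repetitions (bag semantics).
NAT = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)
# Booleans: annotations record mere presence (set semantics).
BOOL = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)

# Assumption: an annotated set maps each value to a non-zero annotation.
KSet = Dict[Hashable, Any]


def union(s: Semiring, x: KSet, y: KSet) -> KSet:
    """Union of annotated sets: annotations of equal values are added."""
    out = dict(x)
    for value, ann in y.items():
        out[value] = s.add(out.get(value, s.zero), ann)
    # Values annotated with zero are treated as absent.
    return {v: k for v, k in out.items() if k != s.zero}


def product(s: Semiring, x: KSet, y: KSet) -> KSet:
    """Cartesian product: the annotation of a pair is the product of its parts."""
    out = {}
    for v, k in x.items():
        for w, l in y.items():
            ann = s.mul(k, l)
            if ann != s.zero:
                out[(v, w)] = ann
    return out


xs = {"a": 2, "b": 1}   # "a" occurs twice, "b" once
ys = {"a": 3}           # "a" occurs three times
both = union(NAT, xs, ys)
pairs = product(NAT, xs, ys)
```

Swapping `NAT` for `BOOL` in the same functions yields ordinary set semantics, which is the sense in which one annotated semantics can cover several applications by changing only the semiring.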
Keywords
Data provenance, semirings, complex values, XML, XQuery
Date Posted: 11 July 2008

Comments
Postprint version. Published in Proceedings of the Twenty-Seventh ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems.
Publisher URL: http://doi.acm.org/10.1145/1376916.1376954