Database Research Group (CIS)

Document Type

Conference Paper

Date of this Version

June 2007


Postprint version. Copyright ACM, 2007. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of ACM Symposium on Principles of Database Systems 2007, June 2007, 10 pages.


Abstract

We show that relational algebra calculations for incomplete databases, probabilistic databases, bag semantics and why-provenance are particular cases of the same general algorithms involving semirings. This further suggests a comprehensive provenance representation that uses semirings of polynomials. We extend these considerations to datalog and semirings of formal power series. We give algorithms for datalog provenance calculation as well as datalog evaluation for incomplete and probabilistic databases. Finally, we show that for some semirings, containment of conjunctive queries is the same as for standard set semantics.
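To illustrate the central idea, the following is a minimal sketch, not the paper's own algorithms, of how a single evaluation routine can cover both set and bag semantics once tuples carry annotations from a semiring. The names `Semiring`, `BOOL`, `BAG`, `project`, and `join`, and the encoding of a relation as a Python dict from tuples to annotations, are assumptions made for this example only. Joint use of tuples (join) multiplies annotations; alternative derivations (projection) add them.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Semiring:
    """Hypothetical container for a commutative semiring (K, +, *, 0, 1)."""
    zero: Any
    one: Any
    add: Callable[[Any, Any], Any]
    mul: Callable[[Any, Any], Any]


# Booleans give ordinary set semantics; natural numbers give bag semantics.
BOOL = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)
BAG = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)


def project(K, r, cols):
    """Keep the listed column positions; merged tuples add their annotations."""
    out = {}
    for t, a in r.items():
        key = tuple(t[i] for i in cols)
        out[key] = K.add(out.get(key, K.zero), a)
    return out


def join(K, r, s, on):
    """Join r and s where r column i equals s column j for each (i, j) in `on`.

    The annotation of a joined tuple is the product of its inputs' annotations.
    """
    out = {}
    for t, a in r.items():
        for u, b in s.items():
            if all(t[i] == u[j] for i, j in on):
                key = t + u
                out[key] = K.add(out.get(key, K.zero), K.mul(a, b))
    return out


# The same query (self-join on column 0, then project column 0)
# evaluated under two different semirings.
R_bag = {("a", "b"): 2, ("a", "c"): 3}          # multiplicities
R_set = {("a", "b"): True, ("a", "c"): True}    # membership

bag_result = project(BAG, join(BAG, R_bag, R_bag, [(0, 0)]), [0])
set_result = project(BOOL, join(BOOL, R_set, R_set, [(0, 0)]), [0])

print(bag_result)  # {('a',): 25}
print(set_result)  # {('a',): True}
```

Under bag semantics the result multiplicity is 2·2 + 2·3 + 3·2 + 3·3 = 25, while under the Boolean semiring the same code simply reports that the tuple is present. Swapping in another semiring, such as one of polynomials over tuple identifiers, would change only the `Semiring` value passed in, not the query operators.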


Keywords

Data provenance, data lineage, incomplete databases, probabilistic databases, semirings, datalog, formal power series



Date Posted: 08 June 2007

This document has been peer reviewed.