Date of this Version
We study two classes of view update problems in relational databases. We are given a source database S, a monotone query Q, and the view Q(S) generated by the query. The first problem that we consider is the classical view deletion problem where we wish to identify a minimal set T of tuples in S whose deletion will eliminate a given tuple t from the view. We study the complexity of optimizing two natural objectives in this setting, namely, find T to minimize the side-effects on the view, and the source, respectively. For both objective functions, we show a dichotomy in the complexity. Interestingly, the problem is either in P or is NP-hard, for queries in the same class in either objective function.
The second problem in our study is the annotation placement problem. Suppose we annotate an attribute of a tuple in S. The rules for carrying the annotation forward through a query are easily stated. On the other hand, suppose we annotate an attribute of a tuple in the view Q(S), what annotation(s) in S will cause this annotation to appear in the view, minimizing the propagation to other attributes in Q(S)? View annotation is becoming an increasingly useful method of communicating meta-data among users of shared scientific data sets, and to our knowledge, there has been no formal study of this problem.
Our study of these problems gives us important insights into computational issues involved in data provenance or lineage — the process by which data moves through databases. We show that the two problems correspond to two fundamentally distinct notions of provenance, why and where-provenance.
Date Posted: 22 December 2005