Databases and Information Systems
Exact query reformulation using views in positive relational languages is well understood, and has a variety of applications in query optimization and data sharing. Generalizations to larger fragments of the relational algebra (RA), specifically support for the difference operator, would increase the options available for query reformulation, and would also apply to view adaptation (updating a materialized view in response to a modified view definition) and view maintenance. Unfortunately, most questions about queries become undecidable in the presence of difference/negation. We present a novel way of managing this difficulty via an excursion through a non-standard semantics, Z-relations, where tuples are annotated with positive or negative integers. We show that under Z-semantics RA queries have a normal form as a single difference of positive queries, and that this leads to the decidability of equivalence. In most real-world settings with difference, it is possible to convert the queries to this normal form. We give a sound and complete algorithm that explores all reformulations of an RA query (under Z-semantics) using a set of RA views, finitely bounding the search space with a simple and natural cost model. We investigate related complexity questions, and we also extend our results to queries with built-in predicates. Z-relations are interesting in their own right because they capture updates and data uniformly. However, our algorithm turns out to be sound and complete for bag semantics as well, albeit necessarily only for a subclass of RA. This subclass turns out to be quite large and comfortably covers the applications of interest to us. We also show a subclass of RA where reformulation and evaluation under Z-semantics can be combined with duplicate elimination to obtain the answer under set semantics.