Date of this Version
In recent years, data management has begun to consider situations in which data access is closely tied to network routing and distributed acquisition: sensor networks, in which reachability and contiguous regions are of interest; declarative networking, in which shortest paths and reachability are key; distributed and peer-to-peer stream systems, in which we may monitor for associations among data at the distributed sources (e.g., transitive relationships). In each case, the fundamental operation is to maintain a view over dynamic network state; the view is frequently distributed, recursive and may contain aggregation, e.g., describing transitive connectivity, shortest paths, least costly paths, or region membership.
Surprisingly, solutions to this problem are often domain-specific, expensive to compute, and incomplete. In this paper, we recast the problem as one of incremental recursive view maintenance in the presence of distributed streams of updates to tuples: new stream data becomes insert operations and tuple expirations become deletions. We develop a set of techniques that maintain information about tuple derivability—a compact form of data provenance. We complement this with techniques to reduce communication: aggregate selections to prune irrelevant aggregation tuples, provenance-aware operators that can determine when tuples are no longer derivable and remove them from their state, and shipping operators that greatly reduce the tuple and provenance information being propagated while still maintaining correct answers. We validate our work in a distributed setting with sensor and network router queries, showing significant gains in bandwidth consumption without sacrificing performance.
Date Posted: 17 November 2009