## Departmental Papers (CIS)

#### Date of this Version

June 2004

#### Document Type

Conference Paper

#### Recommended Citation

Michael B. Greenwald and Sanjeev Khanna, "Power-Conserving Computation of Order-Statistics over Sensor Networks". June 2004.

#### Abstract

We study the problem of power-conserving computation of order statistics in sensor networks. Significant power-reducing optimizations have been devised for computing simple aggregate queries such as COUNT, AVERAGE, or MAX over sensor networks. In contrast, aggregate queries such as MEDIAN have seen little progress over the brute force approach of forwarding all data to a central server. Moreover, battery life of current sensors seems largely determined by communication costs; therefore, we aim to minimize the number of bytes transmitted. Unoptimized aggregate queries typically impose extremely high power consumption on a subset of sensors located near the server. Metrics such as total communication cost underestimate the penalty of such imbalance: network lifetime may be dominated by the worst-case replacement time for depleted batteries.

In this paper, we design the first algorithms for computing order-statistics such that power consumption is balanced across the entire network. Our first main result is a distributed algorithm to compute an ε-approximate quantile summary of the sensor data such that each sensor transmits only *O*(log^{2}*n*/ε) data values, *irrespective of the network topology*, an improvement over the current worst-case behavior of Ω(*n*). Second, we show an improved result when the height, *h*, of the network is significantly smaller than *n*. Our third result is that we can exactly compute any order statistic (e.g., median) in a distributed manner such that each sensor needs to transmit *O*(log^{3}*n*) values.

Further, we design the aggregates used by our algorithms to be *decomposable*. An aggregate *Q* over a set *S* is decomposable if there exists a function, *f*, such that for all *S* = *S*_{1} ∪ *S*_{2}, *Q*(*S*) = *f*(*Q*(*S*_{1}),*Q*(*S*_{2})). We can thus directly apply existing optimizations to decomposable aggregates that increase error-resilience and reduce communication cost.
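The definition of decomposability can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the function names (`q_count`, `f_count`, and so on) and the sample readings are hypothetical, and the sketch assumes *S*_{1} and *S*_{2} are disjoint sets of readings, as when each sensor holds its own data. It checks *Q*(*S*) = *f*(*Q*(*S*_{1}), *Q*(*S*_{2})) for COUNT, MAX, and AVERAGE (carried as a sum and count pair), and shows why a bare median value does not satisfy the definition on its own.

```python
from statistics import median_low

# Hypothetical Q / f pairs illustrating Q(S) = f(Q(S1), Q(S2)).

def q_count(values):
    return len(values)

def f_count(a, b):
    return a + b

def q_max(values):
    return max(values)

def f_max(a, b):
    return max(a, b)

def q_avg_state(values):
    # AVERAGE is carried as a (sum, count) pair so that it can be merged.
    return (sum(values), len(values))

def f_avg_state(a, b):
    return (a[0] + b[0], a[1] + b[1])

def finalize_avg(state):
    total, count = state
    return total / count

# Assumed sample readings held by two disjoint groups of sensors.
s1 = [3, 9, 4]
s2 = [7, 1]
s = s1 + s2

count_ok = f_count(q_count(s1), q_count(s2)) == q_count(s)
max_ok = f_max(q_max(s1), q_max(s2)) == q_max(s)
merged_avg = finalize_avg(f_avg_state(q_avg_state(s1), q_avg_state(s2)))
avg_ok = merged_avg == finalize_avg(q_avg_state(s))

# A bare median is not decomposable: two partitions with identical
# partial medians can have different overall medians, so no function f
# of the two partial medians alone can recover the answer.
a1, a2 = [1, 2, 3], [4, 5, 100]
b1, b2 = [1, 2, 3], [0, 5, 6]
same_partials = (median_low(a1), median_low(a2)) == (median_low(b1), median_low(b2))
different_totals = median_low(a1 + a2) != median_low(b1 + b2)
```

In this sketch the partial medians are (2, 5) for both partitions, yet the overall lower medians are 3 and 2, which is why a richer, mergeable aggregate is needed for order statistics.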

Finally, we validate our results empirically, through simulation. When we compute the median exactly, we show that, even for moderate size networks, the worst communication cost for any single node is several times smaller than the corresponding cost in prior median algorithms. We show similar cost reductions when computing approximate order-statistic summaries with guaranteed precision. In all cases, our total communication cost over the entire network is smaller than or equal to the total cost of prior algorithms.

**Date Posted:** 22 December 2005

## Comments

Postprint version. Copyright ACM, 2004. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in *Proceedings of the 23rd ACM Symposium on Principles of Database Systems (PODS 2004)*, pages 275-285.

Publisher URL: http://doi.acm.org/10.1145/1055558.1055597