Publication: Stream Order and Order Statistics: Quantile Estimation in Random-Order Streams
(2009-01-30) Guha, Sudipto; McGregor, Andrew
When trying to process a data stream in small space, how important is the order in which the data arrive? Are there problems that are unsolvable when the ordering is worst case, but that can be solved (with high probability) when the order is chosen uniformly at random? If we consider the stream as if ordered by an adversary, what happens if we restrict the power of the adversary? We study these questions in the context of quantile estimation, one of the most well-studied problems in the data-stream model. Our results include an $O(\operatorname{polylog} n)$-space, $O(\log \log n)$-pass algorithm for exact selection in a randomly ordered stream of $n$ elements. This resolves an open question of Munro and Paterson [Theoret. Comput. Sci., 23 (1980), pp. 315-323]. We then demonstrate an exponential separation between the random-order and adversarial-order models: using $O(\operatorname{polylog} n)$ space, exact selection requires $\Omega(\log n / \log \log n)$ passes in the adversarial-order model. This lower bound, in contrast to previous results, applies to fully general randomized algorithms and is established via a new bound on the communication complexity of a natural pointer-chasing style problem. We also prove the first fully general lower bounds in the random-order model: finding an element with rank $n/2 \pm n^{\delta}$ in the single-pass random-order model with probability at least 9/10 requires $\Omega(\sqrt{n^{1-3\delta} / \log n})$ space.

Publication: A Constant Factor Approximation for the Single Sink Edge Installation Problem
(2009-03-27) Guha, Sudipto; Meyerson, Adam; Munagala, Kamesh
We present the first constant approximation to the single sink buy-at-bulk network design problem, where we have to design a network by buying pipes of different costs and capacities per unit length to route demands at a set of sources to a single sink. The distances in the underlying network form a metric.
This result improves the previous bound of O(log |R|), where R is the set of sources. We also present a better constant approximation to the related Access Network Design problem. Our algorithms are randomized and combinatorial. As a subroutine in our algorithm, we use an interesting variant of facility location with lower bounds on the amount of demand an open facility needs to serve. We call this variant load balanced facility location and present a constant factor approximation for it, while relaxing the lower bounds by a constant factor.