## Departmental Papers (CIS)

#### Document Type

Journal Article

#### Date of this Version

12-19-2008

#### Abstract

We explore problems related to computing graph distances in the data-stream model. The goal is to design algorithms that can process the edges of a graph in an arbitrary order given only a limited amount of working memory. We are motivated by both the practical challenge of processing massive graphs such as the web graph and the desire for a better theoretical understanding of the data-stream model. In particular, we are interested in the trade-offs between model parameters such as per-data-item processing time, total space, and the number of passes that may be taken over the stream. These trade-offs are more apparent when considering graph problems than they were in previous streaming work that solved problems of a statistical nature. Our results include the following: (1) Spanner construction: There exists a single-pass, $\tilde{O}(t n^{1+1/t})$-space, $\tilde{O}(t^2 n^{1/t})$-time-per-edge algorithm that constructs a $(2t+1)$-spanner. For $t = \Omega(\log n / \log\log n)$, the algorithm satisfies the semistreaming space restriction of $O(n \,\mathrm{polylog}\, n)$ and has per-edge processing time $O(\mathrm{polylog}\, n)$. This resolves an open question from [J. Feigenbaum et al., Theoret. Comput. Sci., 348 (2005), pp. 207-216]. (2) Breadth-first-search (BFS) trees: For any even constant $k$, we show that any algorithm that computes the first $k$ layers of a BFS tree from a prescribed node with probability at least $2/3$ requires either greater than $k/2$ passes or $\Omega(n^{1+1/k})$ space. Since constructing BFS trees is an important subroutine in many traditional graph algorithms, this demonstrates the need for new algorithmic techniques when processing graphs in the data-stream model. (3) Graph-distance lower bounds: Any $t$-approximation of the distance between two nodes requires $\Omega(n^{1+1/t})$ space. We also prove lower bounds for determining the length of the shortest cycle and other graph properties. (4) Techniques for decreasing per-edge processing: We discuss two general techniques for speeding up the per-edge computation time of streaming algorithms while increasing the space by only a small factor.
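
To make the notion of a single-pass spanner construction concrete, the following is a minimal sketch of the classic greedy approach for unweighted graphs. It is not the algorithm from the paper and does not achieve the space or per-edge time bounds stated above: each arriving edge triggers a bounded breadth-first search, which can be slow. The names `bounded_distance`, `greedy_stream_spanner`, and `stretch` are illustrative assumptions.

```python
from collections import deque


def bounded_distance(adj, source, target, limit):
    """Return the hop distance from source to target if it is <= limit, else None."""
    if source == target:
        return 0
    seen = {source}
    frontier = deque([(source, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == limit:
            continue  # do not explore beyond the allowed number of hops
        for nxt in adj.get(node, ()):
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, dist + 1))
    return None


def greedy_stream_spanner(edge_stream, stretch):
    """Process edges once, in arrival order, keeping an edge only when the
    edges kept so far do not already connect its endpoints within `stretch` hops.

    Illustrative greedy sketch, not the paper's construction.
    """
    adj = {}   # adjacency sets of the spanner built so far
    kept = []  # edges retained in the spanner
    for u, v in edge_stream:
        if u == v:
            continue  # ignore self-loops
        if bounded_distance(adj, u, v, stretch) is None:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
            kept.append((u, v))
    return kept


if __name__ == "__main__":
    # Complete graph on four nodes, streamed in a fixed order.
    k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    print(greedy_stream_spanner(k4, stretch=3))
```

Every discarded edge already has a path of at most `stretch` hops among the kept edges at the moment it is discarded, and kept edges are never removed, so each distance in the original graph grows by at most a factor of `stretch` in the output.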

#### Keywords

stream algorithms, graph distances, spanners, communication complexity, algorithms, construction

#### Recommended Citation

Feigenbaum, Joan; Kannan, Sampath; McGregor, Andrew; Suri, Siddharth; and Zhang, Jian, "Graph Distances in the Data-Stream Model" (2008). *Departmental Papers (CIS).* Paper 404.

http://repository.upenn.edu/cis_papers/404

**Date Posted:** 21 May 2009

This document has been peer reviewed.

## Comments

Joan Feigenbaum, Sampath Kannan, Andrew McGregor, Siddharth Suri, and Jian Zhang, "Graph Distances in the Data-Stream Model," SIAM J. Comput. 38, 1709 (2008), DOI:10.1137/070683155

Copyright SIAM, 2008. Reprinted in SIAM Journal on Computing, Volume 38, Issue 5, pp. 1709-1727.