Graph Sketches: Sparsification, Spanners, and Subgraphs

Penn collection
Departmental Papers (CIS)
Degree type
Discipline
Subject
Streaming
Graph Sparsification
Spanners
Sketches
Theory and Algorithms
Author
Ahn, KookJin
McGregor, Andrew
Abstract

When processing massive data sets, a core task is to construct synopses of the data. To be useful, a synopsis data structure should be easy to construct while also yielding good approximations of the relevant properties of the data set. A particularly useful class of synopses is sketches, i.e., those based on linear projections of the data. These are applicable in many models including various parallel, stream, and compressed sensing settings. A rich body of analytic and empirical work exists for sketching numerical data such as the frequencies of a set of entities. Our work investigates graph sketching where the graphs of interest encode the relationships between these entities. The main challenge is to capture this richer structure and build the necessary synopses with only linear measurements. In this paper we consider properties of graphs including the size of the cuts, the distances between nodes, and the prevalence of dense sub-graphs. Our main result is a sketch-based sparsifier construction: we show that $\tilde{O}(n\epsilon^{-2})$ random linear projections of a graph on $n$ nodes suffice to $(1+\epsilon)$-approximate all cut values. Similarly, we show that $O(\epsilon^{-2})$ linear projections suffice for (additively) approximating the fraction of induced sub-graphs that match a given pattern such as a small clique. Finally, for distance estimation we present sketch-based spanner constructions. In this last result the sketches are adaptive, i.e., the linear projections are performed in a small number of batches where each projection may be chosen dependent on the outcome of earlier sketches. All of the above results immediately give rise to data stream algorithms that also apply to dynamic graph streams where edges are both inserted and deleted. The non-adaptive sketches, such as those for sparsification and subgraphs, give us single-pass algorithms for distributed data streams with insertions and deletions. The adaptive sketches can be used to analyze MapReduce algorithms that use a small number of rounds.
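
The linearity property that the abstract relies on can be illustrated with a toy example. The sketch below is a minimal illustration under stated assumptions, not the sparsifier, spanner, or subgraph construction from the paper: it applies random +/-1 projections to the edge-indicator vector of a graph and shows that deletions cancel insertions and that sketches computed at separate sites can be added. The class name LinearGraphSketch, the helper sketch_of, and the parameters n, k, and seed are hypothetical names introduced only for this illustration.

```python
import random
from itertools import combinations


class LinearGraphSketch:
    """Toy linear sketch: k random +/-1 projections of the edge-indicator vector.

    Illustrative only; this is not a construction taken from the paper.
    """

    def __init__(self, n, k, seed):
        # One coordinate per unordered node pair (u, v) with u < v.
        self.index = {pair: i for i, pair in enumerate(combinations(range(n), 2))}
        # Assumption: all sites share the seed, so they use identical projections.
        rng = random.Random(seed)
        m = len(self.index)
        self.rows = [[rng.choice((-1, 1)) for _ in range(m)] for _ in range(k)]
        self.values = [0] * k

    def update(self, u, v, delta):
        """Apply an edge insertion (delta=+1) or deletion (delta=-1), u != v."""
        j = self.index[(min(u, v), max(u, v))]
        for i, row in enumerate(self.rows):
            self.values[i] += delta * row[j]

    def merge(self, other):
        """Add another site's sketch; valid because the projections are linear."""
        self.values = [a + b for a, b in zip(self.values, other.values)]


def sketch_of(stream, n=5, k=8, seed=42):
    """Feed a list of (u, v, delta) updates into a fresh sketch."""
    s = LinearGraphSketch(n, k, seed)
    for u, v, delta in stream:
        s.update(u, v, delta)
    return s


# A dynamic stream: edge (0, 1) is inserted and later deleted.
stream = [(0, 1, +1), (1, 2, +1), (0, 1, -1), (3, 4, +1)]
final_graph = [(1, 2, +1), (3, 4, +1)]

streamed = sketch_of(stream).values
direct = sketch_of(final_graph).values

# Two sites each see part of the stream; their sketches are then summed.
site_a = sketch_of(stream[:2])
site_b = sketch_of(stream[2:])
site_a.merge(site_b)
merged = site_a.values
```

Here streamed, direct, and merged hold the same values: the sketch depends only on the final edge set, not on the order of updates or on how the stream was split across sites. This is the property that lets non-adaptive sketches serve as single-pass algorithms for distributed streams with insertions and deletions.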

Date of presentation
2012-03-16
Conference name
Departmental Papers (CIS)
Conference dates
2023-05-17T06:54:05.000
Comments
Ahn, K. J., Guha, S., & McGregor, A. Graph Sketches: Sparsification, Spanners, and Subgraphs. SIGMOD Symposium on Principles of Database Systems (PODS 2012). Scottsdale, Arizona, USA. May 20-24, 2012. http://www.sigmod.org/2012/ ©ACM, 2012. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version to be published in PODS '12: Proceedings of the thirty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems (http://portal.acm.org/event.cfm?id=RE227&CFID=24590974&CFTOKEN=88208658). http://www.acm.org