Computing with Distributed Information

Yang Li, University of Pennsylvania


The age of computing with massive data sets is highlighting new computational challenges. A typical server may not be able to store an entire data set, so data is often partitioned and stored across multiple servers in a distributed manner. A natural way of computing with such distributed data is to use distributed algorithms: algorithms in which the participating parties (i.e., the servers holding portions of the data) collaboratively compute a function over the entire data set by sending (preferably small) messages to each other, and in which the computation performed at each party relies only on the data it holds and the messages it receives. We study distributed algorithms with a focus on two key themes: convergence time and data summarization. Convergence time measures how quickly a distributed algorithm settles on a globally stable solution. Data summarization is the approach of creating a compact summary of the input data while retaining key information, which often leads to more efficient computation and communication. The main focus of this dissertation is the design and analysis of distributed algorithms for important problems in diverse application domains, centered on the themes of convergence time and data summarization. Problems we study include the convergence time of the double oral auction and of interdomain routing, summarizing graphs for large-scale matching problems, and summarizing data for query processing.
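
As a minimal illustration of the data summarization theme, and not a method taken from the dissertation, the sketch below assumes each server holds a partition of numeric values and sends only a compact summary (a count and a sum) to a coordinator, which merges the summaries to obtain the global mean. The names `Summary`, `summarize`, `merge`, and `global_mean` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Summary:
    """Hypothetical compact message a server sends instead of its raw data."""
    count: int
    total: float


def summarize(partition: Iterable[float]) -> Summary:
    """Run locally on one server: uses only the data that server holds."""
    values = list(partition)
    return Summary(count=len(values), total=float(sum(values)))


def merge(summaries: Iterable[Summary]) -> Summary:
    """Run at the coordinator: combines the received messages only."""
    received = list(summaries)
    return Summary(
        count=sum(s.count for s in received),
        total=sum(s.total for s in received),
    )


def global_mean(summaries: Iterable[Summary]) -> float:
    """Compute the mean over the entire data set from the summaries alone."""
    merged = merge(summaries)
    if merged.count == 0:
        raise ValueError("cannot compute the mean of an empty data set")
    return merged.total / merged.count


# Assumed toy input: three servers, each holding one partition of the data.
partitions: List[List[float]] = [[1, 2, 3], [4, 5], [6]]
messages = [summarize(p) for p in partitions]  # one small message per server
mean = global_mean(messages)
```

Each message here has constant size regardless of how much data a server holds, which is the sense in which a summary can reduce communication. Functions that do not decompose this cleanly, such as graph matching, require more careful summaries.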

Subject Area

Information science|Computer science

Recommended Citation

Li, Yang, "Computing with Distributed Information" (2017). Dissertations available from ProQuest. AAI10273590.