Distributed Query Execution With Strong Privacy Guarantees
As the Internet evolves, we find more applications that involve data originating from multiple sources, and spanning machines located all over the world. Such wide distribution of sensitive data increases the risk of information leakage, and may sometimes inhibit useful applications. For instance, even though banks could share data to detect systemic threats in the US financial network, they hesitate to do so because it can leak business secrets to their competitors. Encryption is an effective way to preserve data confidentiality, but eliminates all processing capabilities. Some approaches enable processing on encrypted data, but they usually have security weaknesses, such as data leakage through side-channels, or require expensive cryptographic computations. In this thesis, we present techniques that address the above limitations. First, we present an efficient symmetric homomorphic encryption scheme, which can aggregate encrypted data at an unprecedented scale. Second, we present a way to efficiently perform secure computations on distributed graphs. To accomplish this, we express large computations as a series of small, parallelizable vertex programs, whose state is safely transferred between vertices using a new cryptographic protocol. Finally, we propose using differential privacy to strengthen the security of trusted processors: noise is added to the side-channels, so that no adversary can extract useful information about individual users. Our experimental results suggest that the presented techniques achieve order-of-magnitude performance improvements over previous approaches, in scenarios such as the business intelligence application of a large corporation and the detection of systemic threats in the US financial network.