Published in Proceedings of the International Conference on Very Large Data Bases, 29th Edition, September 2003, pages 321-332.

NOTE: At the time of publication, author Boon Thau Loo was affiliated with the University of California at Berkeley. As of April 2007, he is a faculty member in the Department of Computer and Information Science at the University of Pennsylvania.


The database research community prides itself on scalable technologies. Yet database systems have traditionally not excelled along one important dimension of scalability: the degree of distribution. This limitation has hampered the impact of database technologies on massively distributed systems like the Internet.

In this paper, we present the initial design of PIER, a massively distributed query engine based on overlay networks, which is intended to bring database query processing facilities to new, widely distributed environments. We motivate the need for massively distributed queries, and argue for a relaxation of certain traditional database research goals in the pursuit of scalability and widespread adoption. We present simulation results showing PIER gracefully running relational queries across thousands of machines, and show results from the same software base in actual deployment on a large experimental cluster.



