Departmental Papers (CIS)

Date of this Version

June 2004

Document Type

Conference Paper

Comments

Copyright ACM, 2004. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data, pages 395-406.
Publisher URL: http://doi.acm.org/10.1145/1007568.1007613

Abstract

An effective query optimizer finds a query plan that exploits the characteristics of the source data. In data integration, little is known in advance about sources’ properties, which necessitates the use of adaptive query processing techniques to adjust query processing on the fly. Prior work in adaptive query processing has focused on compensating for delays and adjusting for mis-estimated cardinality or selectivity values. In this paper, we present a generalized architecture for adaptive query processing and introduce a new technique, called adaptive data partitioning (ADP), which is based on the idea of dividing the source data into regions, each processed by a different, complementary plan. We show how this model can be applied in novel ways not only to correct for underestimated selectivity and cardinality values, but also to discover and exploit order in the source data, and to detect and exploit source data that can be effectively pre-aggregated. We experimentally compare a number of alternative strategies and show that our approach is effective.

Date Posted: 07 May 2005