
Departmental Papers (CIS)
Document Type
Conference Paper
Date of this Version
September 2000
Abstract
Over the last few years, efficient access to heterogeneous data sources has become tremendously important. One common technique for increasing efficiency is to maintain locally stored views in data warehouses, which must be kept current with respect to changes in the underlying data sources. While this problem has been extensively studied in the context of select-project-join (SPJ) views and relational warehouses, many of the data sources accessible today over the Web are highly irregular. Views over this irregular data often perform complex restructuring and regrouping far beyond traditional SPJ views.
This paper describes WHAX (Warehouse Architecture for XML), an architecture for defining and maintaining views over hierarchical semistructured data and relational data sources with key constraints. The WHAX model is a variant of the deterministic model of [8], but is more reminiscent of XML. The view definition language is a variation of XML-QL that has been adapted to the WHAX model, and supports selections, joins, and important restructuring operations such as regrouping, flattening, and aggregation. The incremental maintenance technique is based on the notion of multi-linearity and generalizes several well-known techniques from the relational case.
Date Posted: 08 May 2005

Comments
Postprint version. Published in Lecture Notes in Computer Science, Volume 1874, Proceedings of the Second International Conference on Data Warehousing and Knowledge Discovery 2000 (DaWaK 2000), pages 114-123.
Publisher URL: http://springerlink.metapress.com/link.asp?id=1g7xvlhfk2phxl5a