View Maintenance for Hierarchical Semistructured Data
Abstract
Over the last few years, efficient access to heterogeneous data sources has become tremendously important. One common technique for increasing efficiency is to maintain locally stored views in data warehouses, which must be kept current with respect to changes in the underlying data sources. While this problem has been extensively studied in the context of select-project-join (SPJ) views and relational warehouses, many of the data sources accessible today over the Web are highly irregular. Views over this irregular data often perform complex restructuring and regrouping that go far beyond traditional SPJ views. This paper describes WHAX (Warehouse Architecture for XML), an architecture for defining and maintaining views over hierarchical semistructured data and relational data sources with key constraints. The WHAX model is a variant of the deterministic model of [8], but is more reminiscent of XML. The view definition language is a variation of XML-QL that has been adapted to the WHAX model; it supports selections, joins, and important restructuring operations such as regrouping, flattening, and aggregation. The incremental maintenance technique is based on the notion of multi-linearity and generalizes several well-known techniques from the relational case.
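As a rough illustration of the relational case that the abstract says is being generalized, the sketch below shows how linearity of a join in one of its arguments allows a stored view to be updated from the inserted tuples alone. It is not the WHAX algorithm: the relations, the `join` helper, and the insert-only setting are all assumptions made for this example, and relations are modeled as Python sets of tuples.

```python
# Hypothetical illustration of incremental view maintenance in the
# relational case; this is NOT the WHAX technique itself.
# Relations are modeled as sets of tuples (set semantics, inserts only).

def join(r, s):
    """Join r(a, b) with s(b, c) on the shared attribute b."""
    return {(a, b, c) for (a, b) in r for (b2, c) in s if b == b2}

def full_recompute(r, delta_r, s):
    """Baseline: rebuild the view from the updated source."""
    return join(r | delta_r, s)

def incremental(view, delta_r, s):
    """Use linearity in the first argument:
    (R ∪ ΔR) ⋈ S = (R ⋈ S) ∪ (ΔR ⋈ S),
    so only the inserted tuples need to be joined."""
    return view | join(delta_r, s)

# Assumed example data.
R = {(1, "x"), (2, "y")}
S = {("x", 10), ("y", 20), ("z", 30)}
delta_R = {(3, "z")}

view = join(R, S)                        # materialized view before the update
updated = incremental(view, delta_R, S)  # cost depends on |ΔR|, not |R|

# Both strategies must agree on the resulting view.
assert updated == full_recompute(R, delta_R, S)
```

The incremental path touches only `delta_R` and `S`, which is the property that makes maintenance cheaper than recomputation when updates are small relative to the source.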