Technical Reports (CIS)

Document Type

Technical Report

Date of this Version

January 2002


University of Pennsylvania Department of Computer and Information Science Technical Report No. MS-CIS-02-03.


The role of XML in data exchange is evolving from one of merely conveying the structure of data to one that also conveys its semantics. In particular, several proposals for key and foreign key constraints have recently appeared, and aspects of these proposals have been adopted within XMLSchema. Although several validators for XMLSchema appear to check for keys, relatively little attention has been paid to the general problem of how to check constraints in XML.

In this paper, we examine the problem of checking keys in XML documents and describe a native validator based on SAX. The algorithm relies on an indexing technique based on the paths found in key definitions, and can be used for checking the correctness of an entire document (bulk checking) as well as for checking updates as they are made to the document (incremental checking). The asymptotic performance of the algorithm is linear in the size of the document or update. We also discuss how XML keys can be checked in relational representations of XMLdocuments, and compare the performance of our native validator against hand-coded relational constraints. Extrapolating from this experience, we propose how a relational schema can be designed to check XMLSchema key constraints using efficient relational PRIMARY KEY or UNIQUE constraints.



Date Posted: 04 August 2005