Regular Expression Types for XML

Loading...
Thumbnail Image
Penn collection
Departmental Papers (CIS)
Degree type
Discipline
Subject
programming languages
language constructs and features
data types and structures
languages
theory
type systems
XML
subtyping
Funder
Grant number
License
Copyright date
Distributor
Related resources
Author
Hosoya, Haruo
Vouillon, Jerome
Contributor
Abstract

We propose regular expression types as a foundation for statically typed XML processing languages. Regular expression types, like most schema languages for XML, introduce regular expression notations such as repetition (*), alternation (|), etc., to describe XML documents. The novelty of our type system is a semantic presentation of subtyping, as inclusion between the sets of documents denoted by two types. We give several examples illustrating the usefulness of this form of subtyping in XML processing. The decision problem for the subtype relation reduces to the inclusion problem between tree automata, which is known to be EXPTIME-complete. To avoid this high complexity in typical cases, we develop a practical algorithm that, unlike classical algorithms based on determinization of tree automata, checks the inclusion relation by a top-down traversal of the original type expressions. The main advantage of this algorithm is that it can exploit the property that type expressions being compared often share portions of their representations. Our algorithm is a variant of Aiken and Murphy's set-inclusion constraint solver, to which are added several new implementation techniques, correctness proofs, and preliminary performance measurements on some small programs in the domain of typed XML processing.

Advisor
Date Range for Data Collection (Start Date)
Date Range for Data Collection (End Date)
Digital Object Identifier
Series name and number
Publication date
2005-01-01
Journal title
Volume number
Issue number
Publisher
Publisher DOI
Journal Issue
Comments
Copyright ACM, 2005. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM Transactions on Programming Languages and Systems, Volume 27, Issue 1, January 2005, pages 46-90. Publisher URL: http://doi.acm.org/10.1145/1053468.1053470
Copyright ACM, 2005. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM Transactions on Programming Languages and Systems, Volume 27, Issue 1, January 2005, pages 46-90. Publisher URL: http://www.acm.org/toplas/
Recommended citation
Collection