Date of this Version
XPath is a W3C standard that plays a crucial role in several influential query, transformation, and schema standards for XML. Motivated by the larger challenge of XML query optimization, we investigate the problem of containment of XPath expressions under integrity constraints that are in turn formulated with the help of XPath expressions. Our core formalism consists of a fragment of XPath that we call simple and a corresponding class of integrity constraints that we call simple XPath integrity constraints (SXIC). SXIC's can express many database-style constraints, including key and foreign key constraints specified in the XML Schema standard proposal, as well as many constraints implied by DTDs. We identify a subclass of bounded SXIC's under which containment of simple XPath expressions is decidable, but we show that even modest use of unbounded SXIC's makes the problem undecidable. In particular, the addition of (unbounded) constraints implied by DTDs leads to undecidability.
We give tight Πp2 bounds for the simple XPath containment problem and tight NP bounds for the disjunction-free subfragment, while even identifying a PTIME subcase. We also show that decidability of containment under SXIC's still holds if the expressions contain certain additional features (e.g., wildcard) although the complexity jumps to Πp2 even for the disjunction-free subfragment.
We know that our results can be extended to some but not all of the XPath features that depend on document order. The decidability of containment of simple XPath expressions in the presence of DTDs only remains open (although we can show that the problem is PSPACE-hard) as well as the problem for full-fledged XPath expressions, even in the absence of integrity constraints.
Date Posted: 08 May 2005