Technical Reports (CIS)

Document Type

Technical Report

Date of this Version

January 2002


University of Pennsylvania Department of Computer and Information Science Technical Report No. MS-CIS-02-01.


This paper examines some of the issues that arise in XML publishing of mixed-storage proprietary data. We argue that such data will typically reside in RDBMSs, LDAP directories, and similar stores, augmented with a set of native XML documents. An additional challenge is to take advantage of redundancy in the storage schema, such as mixed materialized views that are stored to enhance performance.

We argue that XML publishing systems need to take into consideration mappings in both directions between the proprietary schema and the published schema. Thus, reformulating queries on the (published) XML schema into executable queries on the stored data requires the effect of both composition-with-views (as in SilkRoute and XPERANTO) and rewriting-with-views (as in the Information Manifold and Agora).

Using any of the simple encodings of relational data as XML, the mappings between schemas and the materialized views can be expressed in XQuery, just like the queries on the published schema. For query reformulation we give an algorithm that uses logical assertions to capture formally the semantics of a large part of XQuery. We also give a completeness theorem for our reformulation algorithm. The algorithm was implemented in an XML query rewriting system, and we present a suite of experiments that validate this technique.
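
As an illustration of what a "simple encoding of relational data as XML" might look like, the following Python sketch maps each row of a table to a tuple element with one child element per column. The function name encode_relation, the "tuple" element name, and the sample Employee table are assumptions made for this example; the report itself may use a different encoding.

```python
import xml.etree.ElementTree as ET


def encode_relation(name, columns, rows):
    """Encode a relation as XML: one <tuple> per row, one child per column.

    Hypothetical helper for illustration only; not taken from the report.
    """
    root = ET.Element(name)
    for row in rows:
        tup = ET.SubElement(root, "tuple")
        for col, value in zip(columns, row):
            # Each attribute value becomes the text of a column-named element.
            ET.SubElement(tup, col).text = str(value)
    return root


# Sample data (assumed): a two-column Employee relation.
doc = encode_relation("Employee", ["id", "name"], [(1, "Ada"), (2, "Alan")])
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```

Under an encoding of this kind, relational tables and native XML documents can be addressed in the same query language, which is what allows schema mappings and materialized views to be written as queries over a uniform XML view.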



Date Posted: 04 August 2005