Date of this Version
This paper focuses on the optimization of the navigation through voluminous subsumption hierarchies of topics employed by Portal Catalogs like Netscape Open Directory (ODP). We advocate for the use of labeling schemes for modeling these hierarchies in order to efficiently answer queries such as subsumption check, descendants, ancestors or nearest common ancestor, which usually require costly transitive closure computations. We first give a qualitative comparison of three main families of schemes, namely bit vector, prefix and interval based schemes. We then show that two labeling schemes are good candidates for an efficient implementation of label querying using standard relational DBMS, namely the Dewey Prefix scheme and an Interval scheme by Agrawal, Borgida and Jagadish. We compare their storage and query evaluation performance for the 16 ODP hierarchies using the PostgreSQL engine.
Date Posted: 26 June 2007