Database Research Group (CIS)

Document Type

Journal Article

Date of this Version

March 2001


Postprint version. Published in IBM Systems Journal, Volume 40, Issue 2, March 2001, pages 512-531.


The integration of heterogeneous data sources and software systems is a major issue in the biomedical community, and several approaches have been explored: linking databases, "on-the-fly" integration through views, and integration through warehousing. In this paper we report on our experiences with two systems that were developed at the University of Pennsylvania: an integration system called K2, which has primarily been used to provide views over multiple external data sources and software systems; and a data warehouse called GUS, which downloads, cleans, integrates and annotates data from multiple external data sources. Although the view and warehouse approaches each have their advantages, there is no clear "winner". Therefore, users must consider how the data is to be used, what the performance guarantees must be, and how much programmer time and expertise is available to choose the best strategy for a particular application.



Date Posted: 11 June 2007

This document has been peer reviewed.