Date of this Version
The integration of heterogeneous data sources and software systems is a major issue in the biomed ical community and several approaches have been explored: linking databases, "on-the- fly" integration through views, and integration through warehousing. In this paper we report on our experiences with two systems that were developed at the University of Pennsylvania: an integration system called K2, which has primarily been used to provide views over multiple external data sources and software systems; and a data warehouse called GUS which downloads, cleans, integrates and annotates data from multiple external data sources. Although the view and warehouse approaches each have their advantages, there is no clear "winner". Therefore, users must consider how the data is to be used, what the performance guarantees must be, and how much programmer time and expertise is available to choose the best strategy for a particular application.
Date Posted: 11 June 2007
This document has been peer reviewed.