A Data Transformation System for Biological Data Sources

Penn collection
Database Research Group (CIS)
Author
Hart, Kyle
Overton, Chris
Wong, L.
Abstract

Scientific data of importance to biologists in the Human Genome Project resides not only in conventional databases but also in structured files maintained in a number of different formats (e.g., ASN.1 and ACE), as well as in sequence analysis packages (e.g., BLAST and FASTA). These formats and packages contain a number of data types not found in conventional databases, such as lists and variants, and the data may be deeply nested. In this paper we present techniques for querying and transforming such data, and illustrate their use in a prototype system developed in conjunction with the Human Genome Center for Chromosome 22. We also describe optimizations performed by the system, a crucial issue for bulk data.
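
As a rough illustration of the kind of data the abstract describes, the following Python sketch models a record containing a list and a variant (tagged union), then flattens the nested structure into flat rows of the sort a conventional database could store. It is not the authors' system or query language. All names (`Entry`, `Journal`, `Submission`, `flatten_journals`) and all sample values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


# A variant (tagged union): a reference is EITHER a journal article
# OR a direct submission. Both classes are hypothetical.
@dataclass
class Journal:
    title: str
    year: int


@dataclass
class Submission:
    submitter: str


Reference = Union[Journal, Submission]


@dataclass
class Entry:
    accession: str
    keywords: List[str]          # a list-valued field
    references: List[Reference]  # a list of variants nested in the record


def flatten_journals(entries: List[Entry]) -> List[Tuple[str, str, int]]:
    """Turn nested entries into flat (accession, title, year) rows,
    keeping only the Journal branch of the variant."""
    return [
        (entry.accession, ref.title, ref.year)
        for entry in entries
        for ref in entry.references
        if isinstance(ref, Journal)
    ]


# Made-up sample data, not taken from any real data source.
entries = [
    Entry(
        "X00001",
        ["chromosome 22"],
        [Journal("Hypothetical Journal A", 1993), Submission("hypothetical submitter")],
    ),
    Entry("X00002", [], [Journal("Hypothetical Journal B", 1994)]),
]

rows = flatten_journals(entries)
```

The nested comprehension plays the role of a query over the nested structure: it iterates over the inner list and selects one branch of the variant, which is the kind of operation that flat relational tables do not express directly.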

Date of presentation
1995-09-11
Conference name
Database Research Group (CIS)
Conference dates
2023-05-17T00:43:14.000
Comments
Postprint version. Published in Proceedings of the 21st International Conference on Very Large Data Bases, September 1995, pages 158-169. Publisher URL: http://hdl.library.upenn.edu/1017/11047