Joint Estimation of DNA Copy Number From Multiple Platforms

Penn collection
Statistics Papers
Degree type
Discipline
Subject
Applied Statistics
Bioinformatics
Biostatistics
Genetics and Genomics
Statistics and Probability
Author
Zhang, Nancy
Senbabaoglu, Yasin
Li, Jun Z
Abstract

DNA copy number variants (CNVs) are gains and losses of segments of chromosomes, and comprise an important class of genetic variation. Recently, various microarray hybridization-based techniques have been developed for high-throughput measurement of DNA copy number. In many studies, multiple technical platforms or different versions of the same platform were used to interrogate the same samples, and it became necessary to pool information across these multiple sources to derive a consensus molecular profile for each sample. An integrated analysis is expected to maximize resolution and accuracy, yet there is currently no well-formulated statistical method to address the between-platform differences in probe design, assay methods, sensitivity, and analytical complexity. The conventional approach is to apply one of the CNV detection (a.k.a. “segmentation”) algorithms to each platform separately to search for DNA segments of altered signal intensity, and to combine the results from multiple platforms only after segmentation. Here we propose a new method, Multi-Platform Circular Binary Segmentation (MPCBS), which pools statistical evidence across platforms during segmentation and does not require pre-standardization of different data sources. It involves a weighted sum of t-statistics, which arises naturally from the generalized log-likelihood ratio of a multi-platform model. We show, by comparing the integrated analysis of Affymetrix and Illumina SNP array data with fosmid clone end-sequencing results on 8 HapMap samples, that MPCBS achieves improved spatial resolution and detection power, and provides a natural consensus across platforms. We also apply the new method to analyze the multi-platform data from TCGA. The R package for MPCBS is registered on R-Forge under the project name MPCBS.
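
The idea of pooling evidence through a weighted sum of t-statistics can be illustrated with a small Python sketch. This is not the MPCBS implementation and does not reproduce the R package: the names `PlatformData`, `segment_t`, `weight`, and `pooled_statistic` are hypothetical, the noise level `sigma` and signal response `response` of each platform are assumed to be known, and the weights (proportional to each platform's expected signal-to-noise for the candidate segment) are an illustrative assumption, since the abstract does not state the exact weights.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class PlatformData:
    """Probe data from one platform for one chromosome (hypothetical container)."""
    values: Sequence[float]  # probe intensities ordered along the chromosome
    start: int               # index of the first probe inside the candidate segment
    end: int                 # one past the index of the last probe inside the segment
    sigma: float             # noise standard deviation, assumed known
    response: float          # platform-specific signal response, assumed known


def segment_t(p: PlatformData) -> float:
    """Two-sample statistic: mean inside the segment versus mean outside it."""
    n = len(p.values)
    m = p.end - p.start
    if not 0 < m < n:
        raise ValueError("segment must contain some, but not all, probes")
    inside = sum(p.values[p.start:p.end])
    mean_in = inside / m
    mean_out = (sum(p.values) - inside) / (n - m)
    return (mean_in - mean_out) / (p.sigma * math.sqrt(1.0 / m + 1.0 / (n - m)))


def weight(p: PlatformData) -> float:
    """Illustrative weight: expected signal-to-noise of this platform (assumption)."""
    n = len(p.values)
    m = p.end - p.start
    return p.response / p.sigma * math.sqrt(m * (n - m) / n)


def pooled_statistic(platforms: List[PlatformData]) -> float:
    """Weighted sum of per-platform statistics, scaled to unit variance under the null."""
    weights = [weight(p) for p in platforms]
    stats = [segment_t(p) for p in platforms]
    numerator = sum(w * t for w, t in zip(weights, stats))
    return numerator / math.sqrt(sum(w * w for w in weights))


# Toy example: the same genomic interval seen by two platforms with different
# probe densities and different signal responses.
platform_a = PlatformData([0, 0, 1, 1, 0, 0], start=2, end=4, sigma=1.0, response=1.0)
platform_b = PlatformData([0, 2, 2, 0], start=1, end=3, sigma=1.0, response=2.0)
print(pooled_statistic([platform_a, platform_b]))
```

In this toy example the pooled statistic exceeds either single-platform statistic, which is the intuition behind combining evidence during segmentation rather than after it. A full segmentation method would also search over candidate intervals and estimate the platform parameters from the data, which this sketch leaves out.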

Publication date
2010-01-15
Journal title
Bioinformatics