Array-based comparative genomic hybridization (array-CGH) is a high-throughput, high-resolution technique for studying the genetics of cancer. Analysis of array-CGH data typically involves estimating the underlying chromosome copy numbers from the log fluorescence ratios and segmenting the chromosome into regions that share the same copy number. For the analysis of array-CGH data, we propose a new stochastic segmentation model and an associated estimation procedure that has attractive statistical and computational properties. An important benefit of this Bayesian segmentation model is that it yields explicit formulas for posterior means, which can be used to estimate the signal directly without performing segmentation. Our method can also estimate other quantities relating to the posterior distribution that are useful for assessing confidence in any given segmentation. We also propose an approximation method whose computation time is linear in sequence length, which makes our method practically applicable to the new higher-density arrays. Simulation studies and applications to real array-CGH data illustrate the advantages of the proposed approach.
This is a post-peer-review, pre-copyedit version of an article published in Biostatistics.
Keywords: Array-CGH, Bayesian inference, hidden Markov models, jump probabilities
Lai, T., Xing, H., & Zhang, N. (2008). Stochastic Segmentation Models for Array-Based Comparative Genomic Hybridization Data Analysis. Biostatistics, 9(2), 290-307. http://dx.doi.org/10.1093/biostatistics/kxm031
Date Posted: 27 November 2017
This document has been peer reviewed.