Document Type

Journal Article

Publication Source

BMC Bioinformatics











Mitochondrial genome sequence analysis is critical to the diagnostic evaluation of mitochondrial disease. Existing methodologies differ widely in throughput, complexity, cost efficiency, and sensitivity of heteroplasmy detection. Affymetrix MitoChip v2.0, which uses a sequencing-by-genotyping technology, allows potentially accurate and high-throughput sequencing of the entire human mitochondrial genome to be completed in a cost-effective fashion. However, the relatively low call rate achieved using existing software tools has limited the wide adoption of this platform for either clinical or research applications. Here, we report the design and development of a custom bioinformatics software pipeline that achieves a much improved call rate and accuracy for the Affymetrix MitoChip v2.0 platform. We used this custom pipeline to analyze MitoChip v2.0 data from 24 DNA samples representing a broad range of tissue types (18 whole blood, 3 skeletal muscle, 3 cell lines), mutations (a 5.8 kilobase pair deletion and 6 known heteroplasmic mutations), and haplogroup origins. All results were compared to those obtained by at least one other mitochondrial DNA sequence analysis method, including Sanger sequencing, denaturing HPLC-based heteroduplex analysis, and/or the Illumina Genome Analyzer II next generation sequencing platform.


An average call rate of 99.75% was achieved across all samples with our custom pipeline. Comparison of calls for 15 samples characterized previously by Sanger sequencing revealed a total of 29 discordant calls, which translates to an estimated base call error rate of 0.012%. We successfully identified 4 known heteroplasmic mutations and 24 other potential heteroplasmic mutations across the 20 samples that passed quality control.
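The reported error rate can be sanity-checked with simple arithmetic. The sketch below is not part of the MFP source code. It assumes that every position of the 16,569 bp revised Cambridge Reference Sequence was compared in each of the 15 samples, and, as a second estimate, that the 99.75% average call rate applies to those 15 samples. The function and constant names are hypothetical.

```python
# Back-of-the-envelope check of the reported base call error rate.
# Assumption: each sample contributes one call per position of the
# 16,569 bp mitochondrial reference (rCRS); the abstract does not state
# the exact number of positions compared.
MTDNA_LENGTH_BP = 16_569
N_SAMPLES = 15            # samples with prior Sanger data (from the abstract)
DISCORDANT_CALLS = 29     # discordant calls (from the abstract)
MEAN_CALL_RATE = 0.9975   # average call rate; assumed to hold for these 15 samples


def error_rate_percent(discordant, n_samples, genome_length, call_rate=1.0):
    """Return discordant calls as a percentage of all compared base calls."""
    compared_calls = n_samples * genome_length * call_rate
    return 100.0 * discordant / compared_calls


# Estimate 1: every position called in every sample.
all_positions = error_rate_percent(DISCORDANT_CALLS, N_SAMPLES, MTDNA_LENGTH_BP)

# Estimate 2: only the called fraction of positions is counted.
called_only = error_rate_percent(
    DISCORDANT_CALLS, N_SAMPLES, MTDNA_LENGTH_BP, MEAN_CALL_RATE
)

print(f"all positions: {all_positions:.3f}%")  # all positions: 0.012%
print(f"called only:   {called_only:.3f}%")    # called only:   0.012%
```

Under either assumption the estimate rounds to the 0.012% reported above, so the figure is consistent with roughly 248,000 compared base calls.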


Affymetrix MitoChip v2.0 analysis using our optimized MitoChip Filtering Protocol (MFP) bioinformatics pipeline now offers the high sensitivity and accuracy needed for reliable, high-throughput, and cost-efficient whole mitochondrial genome sequencing. This approach provides a viable alternative to traditional Sanger and other emerging sequencing technologies for whole mitochondrial genome analysis, with potential utility for both clinical diagnostic and research applications.

Copyright/Permission Statement

© 2011 Xie et al; licensee BioMed Central Ltd.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Additional Files

supp 1.txt (18 kB)
Source Code

supp 2.ppt (553 kB)
Structural variant detection capacity analysis in MFP

supp 3.ppt (124 kB)
Box plot of ratios between the highest and second highest signal intensities of all bases located in the large deleted region of sample #14

supp 4.xls (33 kB)
Single nucleotide variant discrepancies within 15 DNA samples analyzed both by Affymetrix MitoChip v2.0 with the MFP analysis algorithm and by either Sanger or Illumina Genome Analyzer II Sequencing methods

supp 5.ppt (328 kB)
Alignment of Illumina GA next generation sequencing reads from position 3433 in sample #15

supp 6.ppt (365 kB)
Alignment of Illumina GA next generation sequencing reads from position 15940 to 15944 in sample #15

supp 7.ppt (175 kB)
Validation data set results comparison between MFP and Sanger sequencing



Date Posted: 10 July 2014

This document has been peer reviewed.