The Genographic Project is studying the genetic signatures of ancient human migrations and creating an open-source research database. It allows members of the public to participate in a real-time anthropological genetics study by submitting personal samples for analysis and donating the genetic results to the database. We report our experience from the first 18 months of public participation in the Genographic Project, during which we have created the largest standardized human mitochondrial DNA (mtDNA) database ever collected, comprising 78,590 genotypes. Here, we detail our genotyping and quality assurance protocols including direct sequencing of the mtDNA HVS-I, genotyping of 22 coding-region SNPs, and a series of computational quality checks based on phylogenetic principles. This database is very informative with respect to mtDNA phylogeny and mutational dynamics, and its size allows us to develop a nearest neighbor–based methodology for mtDNA haplogroup prediction based on HVS-I motifs that is superior to classic rule-based approaches. We make available to the scientific community and general public two new resources: a periodically updated database comprising all data donated by participants, and the nearest neighbor haplogroup prediction tool.
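As a rough illustration of the nearest neighbor idea mentioned above, the following minimal sketch assigns a haplogroup to a query HVS-I motif by finding the closest labeled motifs in a reference set. It is not the authors' tool. The distance measure (count of variants found in only one of the two motifs), the tie-breaking rule, the function names, the haplogroup labels `HgA` and `HgB`, and the toy reference motifs are all assumptions made for this example.

```python
from collections import Counter


def motif_distance(a, b):
    """Count the variants present in exactly one of the two motifs.

    Assumption: a motif is a set of HVS-I variant strings; this simple
    symmetric-difference distance is illustrative, not the published metric.
    """
    return len(a ^ b)


def predict_haplogroup(query, reference):
    """Return the most common label among the reference motifs nearest to `query`.

    `reference` is a list of (motif, haplogroup_label) pairs.
    Ties between labels are broken alphabetically so the result is deterministic.
    """
    if not reference:
        raise ValueError("reference must not be empty")
    distances = [(motif_distance(query, motif), label) for motif, label in reference]
    best = min(d for d, _ in distances)
    votes = Counter(label for d, label in distances if d == best)
    top = max(votes.values())
    return min(label for label, n in votes.items() if n == top)


# Hypothetical reference set: motifs and labels are made up for illustration.
REFERENCE = [
    (frozenset({"16223T", "16290T", "16319A"}), "HgA"),
    (frozenset({"16223T", "16290T", "16319A", "16362C"}), "HgA"),
    (frozenset({"16189C", "16217C"}), "HgB"),
    (frozenset({"16189C", "16217C", "16261T"}), "HgB"),
]

query = frozenset({"16223T", "16290T", "16362C"})
print(predict_haplogroup(query, REFERENCE))  # prints "HgA"
```

In this toy case the query differs from the second reference motif by a single variant, so that entry alone determines the prediction. A rule-based approach would instead require the query to match a predefined diagnostic motif for each haplogroup.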
© 2007 Behar et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
genomic database, genotyping, haplotypes, mitochondrial DNA, mutation databases, phylogenetics, sequence databases
Behar, D. M., Rosset, S., Blue-Smith, J., Balanovsky, O., Tzur, S., Comas, D., Mitchell, R., Quintana-Murci, L., Tyler-Smith, C., Wells, R., Genographic Consortium, & Schurr, T. G. (2007). The Genographic Project Public Participation Mitochondrial DNA Database. PLoS Genetics, 3 (6), e104. https://doi.org/10.1371/journal.pgen.0030104
Additional Files
Dataset_S1.xls (10575 kB)
Corrected Dataset S1
Date Posted: 18 December 2014
This document has been peer reviewed.