Phylogenetic Information Complexity: Is Testing a Tree Easier Than Finding It?

Penn collection
Statistics Papers
Discipline
Subject
phylogenetic tree
information content
sequence length
reconstruction
Applied Mathematics
Biology
Statistics and Probability
Author
Steel, Mike
Székely, Laszlo
Mossel, Elchanan
Abstract

Phylogenetic trees describe the evolutionary history of a group of present-day species from a common ancestor. These trees are typically reconstructed from aligned DNA sequence data. In this paper we analytically address the following question: Is the amount of sequence data required to accurately reconstruct a tree significantly more than the amount required to test whether or not a candidate tree is the ‘true’ tree? By ‘significantly’, we mean that the two quantities do not behave the same way as a function of the number of species being considered. We prove that, for a certain type of model, the amounts of information required are not significantly different, whereas for another type of model, the information required to test a tree is independent of the number of leaves while that required to reconstruct it grows with this number. Our results combine probabilistic and combinatorial arguments.

Publication date
2009-05-07
Journal title
Journal of Theoretical Biology