Journal of Theoretical Biology
Phylogenetic trees describe the evolutionary history of a group of present-day species from a common ancestor. These trees are typically reconstructed from aligned DNA sequence data. In this paper we analytically address the following question: Is the amount of sequence data required to accurately reconstruct a tree significantly more than the amount required to test whether or not a candidate tree is the ‘true’ tree? By ‘significantly’, we mean that the two quantities do not behave the same way as a function of the number of species being considered. We prove that, for a certain type of model, the two amounts of information are not significantly different; for another type of model, the information required to test a tree is independent of the number of leaves, whereas that required to reconstruct it grows with this number. Our results combine probabilistic and combinatorial arguments.
© 2009. This manuscript version is made available under the CC-BY-NC-ND 4.0 license.
phylogenetic tree, information content, sequence length, reconstruction
Steel, M., Székely, L., & Mossel, E. (2009). Phylogenetic Information Complexity: Is Testing a Tree Easier Than Finding It? Journal of Theoretical Biology, 258(1), 95-102. http://dx.doi.org/10.1016/j.jtbi.2009.01.007
Date Posted: 27 November 2017
This document has been peer reviewed.