1. A ‘Stochastic Safety Radius’ for Distance-Based Tree Reconstruction
- Author
-
Mike Steel, Olivier Gascuel, Méthodes et Algorithmes pour la Bioinformatique (MAB), Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM), Centre for Ultrahigh Bandwidth Devices for Optical Systems (CUDOS), and Macquarie University
- Subjects
0301 basic medicine ,Mathematical optimization ,General Computer Science ,Applied Mathematics ,Populations and Evolution (q-bio.PE) ,Radius ,Interval tree ,Upper and lower bounds ,Computer Science Applications ,03 medical and health sciences ,Tree (data structure) ,030104 developmental biology ,FOS: Biological sciences ,Theory of computation ,[INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM] ,Variety (universal algebra) ,Quantitative Biology - Populations and Evolution ,Algorithm ,Mathematics ,Distance based ,Vantage-point tree - Abstract
A variety of algorithms have been proposed for reconstructing trees that show the evolutionary relationships between species by comparing differences in genetic data across present-day taxa. If the leaf-to-leaf distances in a tree can be accurately estimated, then it is possible to reconstruct this tree from these estimated distances, using polynomial-time methods such as the popular `Neighbor-Joining' algorithm. There is a precise combinatorial condition under which distance-based methods are guaranteed to return a correct tree (in full or in part) based on the requirement that the input distances all lie within some `safety radius' of the true distances. Here, we explore a stochastic analogue of this condition, and mathematically establish upper and lower bounds on this `stochastic safety radius' for distance-based tree reconstruction methods. Using simulations, we show how this notion provides a new way to compare the performance of distance-based tree reconstruction methods. This may help explain why Neighbor-Joining performs so well, as its stochastic safety radius appears close to optimal (while its more classical safety radius is the same as many other less accurate methods)., 18 pages, 1 figure, 4 tables
- Published
- 2015
- Full Text
- View/download PDF