1. UPGMA and the normalized equidistant minimum evolution problem
- Author
-
Vincent Moulton, Taoyang Wu, and Andreas Spillner
- Subjects
FOS: Computer and information sciences ,0301 basic medicine ,Discrete Mathematics (cs.DM) ,General Computer Science ,Heuristic (computer science) ,0206 medical engineering ,02 engineering and technology ,Computational Complexity (cs.CC) ,Theoretical Computer Science ,Combinatorics ,03 medical and health sciences ,Tree (descriptive set theory) ,Quantitative Biology - Populations and Evolution ,Cluster analysis ,Greedy algorithm ,Time complexity ,Mathematics ,Discrete mathematics ,Populations and Evolution (q-bio.PE) ,UPGMA ,Approximation algorithm ,Hierarchical clustering ,Computer Science - Computational Complexity ,030104 developmental biology ,FOS: Biological sciences ,020602 bioinformatics ,Computer Science - Discrete Mathematics - Abstract
UPGMA (Unweighted Pair Group Method with Arithmetic Mean) is a widely used clustering method. Here we show that UPGMA is a greedy heuristic for the normalized equidistant minimum evolution (NEME) problem, that is, finding a rooted tree that minimizes the minimum evolution score relative to the dissimilarity matrix among all rooted trees with the same leaf-set in which all leaves have the same distance to the root. We prove that the NEME problem is NP-hard. In addition, we present some heuristic and approximation algorithms for solving the NEME problem, including a polynomial time algorithm that yields a binary, rooted tree whose NEME score is within O(log^2 n) of the optimum. We expect that these results to eventually provide further insights into the behavior of the UPGMA algorithm., Comment: 29 pages, 8 figures
- Published
- 2018
- Full Text
- View/download PDF