1. Jerarca: Efficient Analysis of Complex Networks Using Hierarchical Clustering
- Author
- Rodrigo Aldecoa and Ignacio Marín
- Subjects
FOS: Computer and information sciences, Physics - Physics and Society, Computation, Molecular Networks (q-bio.MN), lcsh:Medicine, FOS: Physical sciences, Physics and Society (physics.soc-ph), Bioinformatics, Mega, computer.software_genre, Software, Cluster Analysis, Quantitative Biology - Molecular Networks, lcsh:Science, Physics, Social and Information Networks (cs.SI), Multidisciplinary, Computational Biology/Systems Biology, business.industry, Dendrogram, lcsh:R, Computational Biology, Computer Science - Social and Information Networks, Complex network, Genetics and Genomics/Bioinformatics, Hierarchical clustering, Computational Biology/Signaling Networks, FOS: Biological sciences, lcsh:Q, Data mining, business, computer, Biological network, Algorithms, Network analysis, Research Article, Signal Transduction
- Abstract
7 pages, 3 figures. PMID: 20644733 [PubMed]; PMCID: PMC2904377 [PubMed Central].
BACKGROUND: Extracting useful information from complex biological networks is a major goal in many fields, especially in genomics and proteomics. We have shown in several works that iterative hierarchical clustering, as implemented in the UVCluster program, is a powerful tool for analyzing many of those networks. However, the computation time required by UVCluster analyses imposed significant limitations on its use.
METHODOLOGY/PRINCIPAL FINDINGS: We describe the suite Jerarca, designed to efficiently convert networks of interacting units into dendrograms by means of iterative hierarchical clustering. Jerarca is divided into three main sections. First, weighted distances among units are computed using up to three different approaches: a more efficient version of UVCluster and two new, related algorithms called RCluster and SCluster. Second, Jerarca builds dendrograms based on those distances, using well-known phylogenetic algorithms such as UPGMA or Neighbor-Joining. Finally, Jerarca provides optimal partitions of the trees using statistical criteria based on the distribution of intra- and intercluster connections. Outputs compatible with the phylogenetic software MEGA and the Cytoscape package are generated, allowing the results to be easily visualized.
CONCLUSIONS/SIGNIFICANCE: The four main advantages of Jerarca with respect to UVCluster are: 1) improved speed of a novel UVCluster algorithm; 2) additional, alternative strategies to perform iterative hierarchical clustering; 3) automatic evaluation of the hierarchical trees to obtain optimal partitions; and 4) outputs compatible with popular software such as MEGA and Cytoscape.
This research was supported by grant BIO2008-05067 (Programa Nacional de Biotecnologia; Ministerio de Ciencia e Innovacion, Spain).
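The three stages named in the abstract (pairwise distances, a dendrogram built by UPGMA, and a partition judged by intra- and intercluster connections) can be illustrated with a minimal Python sketch. Everything in it is an assumption for illustration and not Jerarca's code: the toy graph is invented, a Jaccard distance on node neighborhoods stands in for the UVCluster, RCluster and SCluster distances, and the tree is simply cut at a fixed number of clusters `k` instead of being chosen by Jerarca's statistical criteria.

```python
from itertools import combinations

# Hypothetical toy network: two triangles joined by a single bridge edge (c-d).
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)


def neighborhood_distances(adj):
    """Stand-in distance (an assumption, not Jerarca's): Jaccard distance
    between closed neighborhoods, i.e. each node together with its neighbors."""
    closed = {n: adj[n] | {n} for n in adj}
    return {(u, v): 1 - len(closed[u] & closed[v]) / len(closed[u] | closed[v])
            for u in adj for v in adj}


def upgma_partition(nodes, dist, k):
    """Average-linkage (UPGMA) agglomeration, stopped when k clusters remain."""
    clusters = [frozenset([n]) for n in nodes]

    def average_distance(a, b):
        # UPGMA distance: mean of all pairwise distances between two clusters.
        return sum(dist[(x, y)] for x in a for y in b) / (len(a) * len(b))

    while len(clusters) > k:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: average_distance(*pair))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters


def intra_inter_edges(edges, clusters):
    """Count edges that fall inside a cluster versus between clusters."""
    label = {n: i for i, c in enumerate(clusters) for n in c}
    intra = sum(label[u] == label[v] for u, v in edges)
    return intra, len(edges) - intra


dist = neighborhood_distances(adj)
clusters = upgma_partition(sorted(adj), dist, k=2)
intra, inter = intra_inter_edges(edges, clusters)
print(sorted(sorted(c) for c in clusters), intra, inter)
```

On this toy graph the two triangles are recovered as the two clusters, with only the bridge edge counted as an intercluster connection. Ties between equally close pairs are broken by list order here, which is one reason a single pass of this kind can be unstable on real networks.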
- Published
- 2012