
CProtMEDIAS: clustering of amino acid sequences encoded by gene families by MErging and DIgitizing Aligned Sequences.

Authors :
Zhang, Zhe
Zhu, Miaomiao
Xie, Qi
Larkin, Robert M
Shi, Xueping
Zheng, Bo
Source :
Briefings in Bioinformatics; Jul 2022, Vol. 23, Issue 4, pp. 1-10 (10 pages)
Publication Year :
2022

Abstract

Protein phylogenetic analysis focuses on the evolutionary relationships among related protein sequences and can help researchers infer protein functions and developmental trajectories. With the advent of the big data era, existing protein phylogenetic methods, including distance-matrix and character-based methods, face challenges in both running time and scope of application. Here, we developed an R package, CProtMEDIAS, for protein phylogenetic analysis. In contrast to existing phylogenetic analysis methods, CProtMEDIAS uses dimensionality reduction algorithms to digitize multiple sequence alignments and quickly conduct phylogenetic analysis of large numbers of amino acid sequences from similarly distant protein families and species. We used CProtMEDIAS to perform dimensionality reduction, clustering, pseudotime, specific-residue and evolutionary trajectory analyses of the plant homeobox superfamily. We found that CProtMEDIAS delivers consistent clustering, fast running and elegant presentation, and thus provides powerful new tools and methods for protein clustering and evolutionary analysis. [ABSTRACT FROM AUTHOR]
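
The abstract does not say how alignments are digitized or which dimensionality reduction algorithms the package uses. As a rough illustration of the general idea only, the Python sketch below one-hot encodes a toy alignment and projects each sequence onto its first principal component, computed by power iteration. The encoding, the use of PCA, the function names and the toy sequences are all assumptions made for this example and are not taken from CProtMEDIAS.

```python
# Illustrative sketch only: NOT the CProtMEDIAS implementation.
# Assumption: "digitizing" is modeled as one-hot encoding, and
# "dimensionality reduction" as a projection onto the first principal axis.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY-"  # 20 residues plus the gap symbol


def one_hot(alignment):
    """Turn aligned sequences into rows of 0/1 indicators, one per (column, residue)."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    width = len(alignment[0])
    rows = []
    for seq in alignment:
        if len(seq) != width:
            raise ValueError("aligned sequences must have equal length")
        row = [0.0] * (width * len(AMINO_ACIDS))
        for pos, aa in enumerate(seq):
            # Unknown symbols are treated as gaps (a simplification).
            row[pos * len(AMINO_ACIDS) + index.get(aa, index["-"])] = 1.0
        rows.append(row)
    return rows


def first_component_scores(rows, iterations=200):
    """Return each row's coordinate on the leading principal axis (power iteration)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    v = [1.0 / (j + 1) for j in range(d)]  # deterministic, non-uniform start vector
    for _ in range(iterations):
        proj = [sum(c[j] * v[j] for j in range(d)) for c in centered]            # X v
        w = [sum(centered[i][j] * proj[i] for i in range(n)) for j in range(d)]  # X^T X v
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break  # no variance left to capture
        v = [x / norm for x in w]
    return [sum(c[j] * v[j] for j in range(d)) for c in centered]


# Hypothetical toy alignment: two pairs of closely related sequences.
toy_alignment = ["MKTAY", "MKTAF", "MRSLY", "MRSLF"]
scores = first_component_scores(one_hot(toy_alignment))
```

In this toy case the first two sequences fall on one side of the axis and the last two on the other, which is the kind of grouping a subsequent clustering step would pick up. A practical analysis would keep several components and use a numerical library rather than plain lists.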

Details

Language :
English
ISSN :
1467-5463
Volume :
23
Issue :
4
Database :
Complementary Index
Journal :
Briefings in Bioinformatics
Publication Type :
Academic Journal
Accession number :
158178127
Full Text :
https://doi.org/10.1093/bib/bbac276