PASTA with many application-aware optimization criteria for alignment based phylogeny inference.

Authors :
Nayeem, Muhammad Ali
Bayzid, Md. Shamsuzzoha
Samudro, Naser Anjum
Rahman, M. Saifur
Rahman, M. Sohel
Source :
Computational Biology & Chemistry. Jun 2022, Vol. 98, pN.PAG-N.PAG. 1p.
Publication Year :
2022

Abstract

Multiple sequence alignment (MSA) is a prerequisite for several analyses in bioinformatics, such as phylogeny estimation and protein structure prediction. PASTA (Practical Alignments using SATé and TrAnsitivity) is a state-of-the-art method for computing MSAs, well known for its accuracy and scalability. It iteratively co-estimates both an MSA and a maximum likelihood (ML) phylogenetic tree, attempting to exploit the close association between the accuracy of an MSA and that of the corresponding tree by refining each from the other over multiple iterations. Currently, PASTA uses the ML score as its optimization criterion, which is a good score in phylogeny estimation but cannot be proven to be a necessary and sufficient criterion for producing an accurate phylogenetic tree. Therefore, integrating multiple application-aware objectives into PASTA, carefully chosen for their better association with tree accuracy, may have a profound positive impact on its performance. This paper employs four application-aware objectives alongside the ML score to develop a multi-objective (MO) framework, named PMAO, that leverages PASTA to generate a set of high-quality solutions that are considered equivalent in the context of the conflicting objectives under consideration. Our experimental analysis on a popular biological benchmark reveals that the tree-space generated by PMAO contains significantly better trees than stand-alone PASTA. To further help domain experts choose the most appropriate tree from the PMAO output (which contains a relatively large set of high-quality solutions), we have added a component to the PMAO framework that is capable of generating a smaller set of high-quality solutions. Finally, we have attempted to obtain a single high-quality solution without using any external evidence and have found that summarizing the few solutions detected through the above component can serve this purpose to some extent.

• PMAO framework integrates many application-aware objectives into PASTA through multi-objective optimization for better phylogeny estimation.
• We innovatively employ supervised machine learning as well as some simple criteria within the PMAO framework to assist the domain expert.
• We experiment with summarizing the PMAO output trees to obtain a single high-quality solution without using any external evidence.

[ABSTRACT FROM AUTHOR]
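
The abstract's notion of solutions that are "considered equivalent" under conflicting objectives is commonly formalized as Pareto non-dominance. The following is only a minimal sketch of that general idea, not the PMAO implementation: the function names, the two-objective score tuples, and the assumption that every objective is maximized are all hypothetical and do not come from the paper.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is at least as good as b on every objective
    and strictly better on at least one (all objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Return the indices of candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical scores for four candidate trees on two objectives
# (illustrative numbers only, higher is better).
candidates = [(0.9, 0.2), (0.5, 0.5), (0.4, 0.4), (0.2, 0.9)]
front = pareto_front(candidates)
print(front)
```

In this toy input the third candidate is dominated by the second, while the remaining three trade one objective against the other, so none of them can be ranked above the others without extra criteria. That is the situation in which a further selection or summarization step, such as the one the abstract describes, becomes useful.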

Details

Language :
English
ISSN :
14769271
Volume :
98
Database :
Academic Search Index
Journal :
Computational Biology & Chemistry
Publication Type :
Academic Journal
Accession number :
157124107
Full Text :
https://doi.org/10.1016/j.compbiolchem.2022.107661