SonicParanoid2: fast, accurate, and comprehensive orthology inference with machine learning and language models
- Source :
- Genome Biology, Vol 25, Iss 1, Pp 1-18 (2024)
- Publication Year :
- 2024
- Publisher :
- BMC, 2024.
Abstract
- Accurate inference of orthologous genes constitutes a prerequisite for comparative and evolutionary genomics. SonicParanoid is one of the fastest tools for orthology inference; however, its scalability and accuracy have been hampered by time-consuming all-versus-all alignments and the existence of proteins with complex domain architectures. Here, we present a substantial update of SonicParanoid, where a gradient boosting predictor halves the execution time and a language model doubles the recall. Application to empirical large-scale and standardized benchmark datasets shows that SonicParanoid2 is much faster than comparable methods and also the most accurate. SonicParanoid2 is available at https://gitlab.com/salvo981/sonicparanoid2 and https://zenodo.org/doi/10.5281/zenodo.11371108
Details
- Language :
- English
- ISSN :
- 1474-760X
- Volume :
- 25
- Issue :
- 1
- Database :
- Directory of Open Access Journals
- Journal :
- Genome Biology
- Publication Type :
- Academic Journal
- Accession number :
- edsdoj.9941d17d57cb43df95f40d7e731b04ed
- Document Type :
- article
- Full Text :
- https://doi.org/10.1186/s13059-024-03298-4