
Protein domain architectures provide a fast, efficient and scalable alternative to sequence-based methods for comparative functional genomics [version 3; referees: 1 approved, 2 approved with reservations]

Authors :
Jasper J. Koehorst
Edoardo Saccenti
Peter J. Schaap
Vitor A. P. Martins dos Santos
Maria Suarez-Diez
Author Affiliations :
1 Laboratory of Systems and Synthetic Biology, Wageningen University and Research, Wageningen, Netherlands
2 LifeGlimmer GmbH, Berlin, Germany
Source :
F1000Research. 5:1987
Publication Year :
2017
Publisher :
London, UK: F1000 Research Limited, 2017.

Abstract

A functional comparative genome analysis is essential to understand the mechanisms underlying bacterial evolution and adaptation. Detection of functional orthologs using standard global sequence similarity methods faces several problems: the need to define arbitrary acceptance thresholds for similarity and alignment length, lateral gene acquisition, and the high computational cost of finding bi-directional best matches at a large scale. We investigated the use of protein domain architectures for large-scale functional comparative analysis as an alternative method. The performance of both approaches was assessed through functional comparison of 446 bacterial genomes sampled at different taxonomic levels. We show that protein domain architectures provide a fast and efficient alternative to methods based on sequence similarity for identifying groups of functionally equivalent proteins within and across taxonomic boundaries, and that they are suitable for large-scale comparative analysis. Running both methods in parallel pinpoints potential functional adaptations that may add to bacterial fitness.

Details

ISSN :
20461402
Volume :
5
Database :
F1000Research
Journal :
F1000Research
Notes :
Revised. Amendments from Version 2: We have amended the manuscript as suggested by the reviewer. Specifically: The Abstract and Introduction no longer state that the requirements of the SB approach, time and memory, need to scale quadratically with the number of genomes. We have modified the Discussion to further emphasize that DAB is similar to SB methods which extend existing groups into new genomes. We have also rephrased the reviewers’ comment regarding the extensive use of DAB to define domain families, as we think it might further clarify the text. The sentence “Our aim was to investigate whether using HMMs instead of sequence similarity would yield similar results” has been modified as suggested, to: “Our aim was to investigate whether using domain architectures instead of sequence similarity alone would yield similar results.”
Publication Type :
Academic Journal
Accession number :
edsfor.10.12688.f1000research.9416.3
Document Type :
research-article
Full Text :
https://doi.org/10.12688/f1000research.9416.3