
A Comprehensive Benchmark Study of Multiple Sequence Alignment Methods: Current Challenges and Future Perspectives

Authors :: Benjamin Linard
Odile Lecompte
Julie D. Thompson
Olivier Poch
Institut de Génétique et de Biologie Moléculaire et Cellulaire (IGBMC)
Université de Strasbourg (UNISTRA)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre National de la Recherche Scientifique (CNRS)
Source :: PLoS ONE, Public Library of Science, 2011, 6 (3), e18093. ⟨10.1371/journal.pone.0018093⟩
Publication Year :: 2011
Publisher :: HAL CCSD, 2011.
Abstract: Multiple comparison or alignment of protein sequences has become a fundamental tool in many domains of modern molecular biology, from evolutionary studies to the prediction of 2D/3D structure, molecular function and inter-molecular interactions. By placing a sequence in the framework of the overall family, multiple alignments can be used to identify conserved features and to highlight differences or specificities. In this paper, we describe a comprehensive evaluation of many of the most popular methods for multiple sequence alignment (MSA), based on a new benchmark test set. The benchmark is designed to represent typical problems encountered when aligning the large protein sequence sets that result from today's high-throughput biotechnologies. We show that alignment methods have progressed significantly and can now identify most of the shared sequence features that determine the broad molecular function(s) of a protein family, even for divergent sequences. However, we have identified a number of important challenges. First, locally conserved regions that reflect functional specificities, or that modulate a protein's function in a given cellular context, are less well aligned. Second, motifs in natively disordered regions are often misaligned. Third, badly predicted or fragmentary protein sequences, which make up a large proportion of today's databases, lead to a significant number of alignment errors. Based on this study, we demonstrate that existing MSA methods can be exploited in combination to improve alignment accuracy, although novel approaches will still be needed to fully explore the most difficult regions. We then propose knowledge-enabled, dynamic solutions that will hopefully pave the way to enhanced alignment construction and exploitation in future evolutionary systems biology studies.

Subjects :: Proteomics
Protein domain
Gene Identification and Analysis
lcsh:Medicine
Sequence alignment
Context (language use)
Computational biology
Biology
[SDV.BID.SPT]Life Sciences [q-bio]/Biodiversity/Systematics, Phylogenetics and taxonomy
Molecular Genetics
03 medical and health sciences
[SDV.BBM.GTP]Life Sciences [q-bio]/Biochemistry, Molecular Biology/Genomics [q-bio.GN]
Genetics
Amino Acid Sequence
Function (engineering)
lcsh:Science
Alignment-free sequence analysis
030304 developmental biology
[INFO.INFO-BI] Computer Science [cs]/Bioinformatics [q-bio.QM]
0303 health sciences
Sequence
Evolutionary Biology
Multidisciplinary
Multiple sequence alignment
Sequence Homology, Amino Acid
030302 biochemistry & molecular biology
lcsh:R
Computational Biology
Proteins
Genomics
Comparative Genomics
[SDE.BE] Environmental Sciences/Biodiversity and Ecology
Computer Science
Benchmark (computing)
lcsh:Q
Gene Function
Sequence Alignment
Sequence Analysis
Algorithms
Research Article

Details

Language :: English
ISSN :: 1932-6203
Database :: OpenAIRE
Journal :: PLoS ONE, Public Library of Science, 2011, 6 (3), e18093. ⟨10.1371/journal.pone.0018093⟩
Accession number :: edsair.doi.dedup.....93224cb0eda4d4cc686b77c1c2825951
Full Text :: https://doi.org/10.1371/journal.pone.0018093
