Predicting peptide structures in native proteins from physical simulations of fragments
- Source :
- PLoS Computational Biology, vol. 5, iss. 2, e1000281 (2009)
- Publication Year :
- 2009
- Publisher :
- eScholarship, University of California, 2009.
Abstract
- It has long been proposed that much of the information encoding how a protein folds is contained locally in the peptide chain. Here we present a large-scale simulation study designed to examine the extent to which conformations of peptide fragments in water predict native conformations in proteins. We perform replica exchange molecular dynamics (REMD) simulations of 872 8-mer, 12-mer, and 16-mer peptide fragments from 13 proteins using the AMBER 96 force field and the OBC implicit solvent model. To analyze the simulations, we compute various contact-based metrics, such as contact probability, and then apply Bayesian classifier methods to infer which metastable contacts are likely to be native vs. non-native. We find that a simple measure, the observed contact probability, is largely more predictive of a peptide's native structure in the protein than combinations of metrics or multi-body components. Our best classification model is a logistic regression model that can achieve up to 63% correct classifications for 8-mers, 71% for 12-mers, and 76% for 16-mers. We validate these results on fragments of a protein outside our training set. We conclude that local structure provides information to solve some but not all of the conformational search problem. These results help improve our understanding of folding mechanisms, and have implications for improving physics-based conformational sampling and structure prediction using all-atom molecular simulations.
Author Summary
- Proteins must fold to unique native structures in order to perform their functions. To do this, proteins must solve a complicated conformational search problem, the details of which remain difficult to study experimentally. Predicting folding pathways and the mechanisms by which proteins fold is thus central to understanding how proteins work. One longstanding question is the extent to which proteins solve the search problem locally, by folding into sub-structures that are dictated primarily by local sequence. Here, we address this question by conducting a large-scale molecular dynamics simulation study of protein fragments in water. The simulation data was then used to optimize a statistical model that predicted native and non-native contacts. The performance of the resulting model suggests that local structuring provides some but not all of the information to solve the folding problem, and that molecular dynamics simulation of fragments can be useful for protein structure prediction and design.
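Illustrative sketch
- The abstract names two ingredients: an observed contact probability computed from fragment simulations, and a logistic regression model that classifies contacts as native or non-native. The Python sketch below shows one way these could fit together. It is not the authors' code. The 8.0 Å distance cutoff, the one-point-per-residue coordinates, the function names, and the toy training values are all assumptions made for illustration; this record does not state the contact definition or the features actually used.

```python
import math

def contact_probability(frames, i, j, cutoff=8.0):
    """Fraction of simulation frames in which residues i and j are in contact.

    frames: list of frames; each frame is a list of (x, y, z) tuples, one
    representative point per residue (an assumption of this sketch).
    cutoff: contact distance in angstroms (assumed value, not from the source).
    """
    hits = sum(1 for coords in frames if math.dist(coords[i], coords[j]) < cutoff)
    return hits / len(frames)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.5, steps=5000):
    """Fit P(native | x) = sigmoid(w * x + b) by batch gradient descent.

    xs: contact probabilities in [0, 1]; ys: 1 for native, 0 for non-native.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # gradient of the log-loss w.r.t. z
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def predict_native(p_contact, w, b):
    """Classify a contact as native when the modeled probability is >= 0.5."""
    return sigmoid(w * p_contact + b) >= 0.5

# Toy usage with made-up numbers (hypothetical, not data from the study).
toy_frames = [
    [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (12.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (7.9, 0.0, 0.0)],
]
p = contact_probability(toy_frames, 0, 1)

toy_x = [0.05, 0.10, 0.20, 0.30, 0.60, 0.70, 0.85, 0.95]
toy_y = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = fit_logistic(toy_x, toy_y)
is_native = predict_native(p, w, b)
```

- A single-feature model is used here because the abstract reports that the observed contact probability alone is largely more predictive than combinations of metrics. A real analysis would draw frames from the REMD trajectories and evaluate the classifier on held-out fragments.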
- Subjects :
- Models, Molecular
Proteomics
Protein Folding
Computer science
Protein Conformation
Biophysics/Protein Folding
Computational Biology/Molecular Dynamics
Force field (chemistry)
Mathematical Sciences
Bayes' theorem
Molecular dynamics
Protein structure
Search problem
lcsh:QH301-705.5
Ecology
Protein structure prediction
Weights and Measures
Biological Sciences
Computational Theory and Mathematics
Modeling and Simulation
Thermodynamics
Protein folding
Biological system
Research Article
Bioinformatics
1.1 Normal biological development and functioning
Bioengineering
Computational Biology/Protein Structure Prediction
Machine learning
Cellular and Molecular Neuroscience
Naive Bayes classifier
Artificial Intelligence
Underpinning research
Information and Computing Sciences
Genetics
Computer Simulation
Molecular Biology
Ecology, Evolution, Behavior and Systematics
Water
Proteins
Bayes Theorem
Logistic Models
Models, Chemical
lcsh:Biology (General)
Solvents
Artificial intelligence
- Database :
- OpenAIRE
- Journal :
- PLoS Computational Biology, vol. 5, iss. 2, e1000281 (2009)
- Accession number :
- edsair.doi.dedup.....73b22dcc2ab9f15bc4bb9e37e2638256