Folding non-homologous proteins by coupling deep-learning contact maps with I-TASSER assembly simulations
- Source :
- Cell Reports: Methods, Vol 1, Iss 3, Pp 100014- (2021)
- Publication Year :
- 2021
- Publisher :
- Elsevier, 2021.
Abstract
- Summary: Structure prediction for proteins lacking homologous templates in the Protein Data Bank (PDB) remains a significant unsolved problem. We developed a protocol, C-I-TASSER, to integrate inter-residue contact maps from deep neural-network learning with the cutting-edge I-TASSER fragment assembly simulations. Large-scale benchmark tests showed that C-I-TASSER can fold more than twice as many non-homologous proteins as I-TASSER, which does not use contacts. When applied to a folding experiment on 8,266 unsolved Pfam families, C-I-TASSER successfully folded 4,162 domain families, including 504 folds that are not found in the PDB. Furthermore, it created correct folds for 85% of proteins in the SARS-CoV-2 genome, despite the virus's high mutation rate and sparse sequence profiles. The results demonstrated the critical importance of coupling whole-genome and metagenome-based evolutionary information with optimal structure assembly simulations for solving the problem of non-homologous protein structure prediction.
- Motivation: Taking advantage of rapid progress in deep-learning technologies, residue-residue contact-map prediction has recently achieved impressive breakthroughs. However, how to efficiently convert binary contact maps into atomic-level structure models remains an important unsolved problem in ab initio protein structure prediction. In this work, we integrated deep-learning contact-map predictions with cutting-edge threading assembly simulations and found that the inherent force field of the structural folding simulations is essential to maximize the potential of contact-assisted protein structure prediction, especially for targets and regions that lack spatial restraints and sufficient evolutionary data.
Details
- Language :
- English
- ISSN :
- 26672375
- Volume :
- 1
- Issue :
- 3
- Database :
- Directory of Open Access Journals
- Journal :
- Cell Reports: Methods
- Publication Type :
- Academic Journal
- Accession number :
- edsdoj.19208e6ef2d946f1a6716198840c2767
- Document Type :
- article
- Full Text :
- https://doi.org/10.1016/j.crmeth.2021.100014