Achieving 80% ten-fold cross-validated accuracy for secondary structure prediction by large-scale training.
- Source :
- Proteins; Mar2007, Vol. 66 Issue 4, p838-845, 8p
- Publication Year :
- 2007
Abstract
- An integrated system of neural networks, called SPINE, is established and optimized for predicting structural properties of proteins. SPINE is applied to three-state secondary-structure and residue-solvent-accessibility (RSA) prediction in this paper. The integrated neural networks are carefully trained with a large dataset of 2640 chains, sequence profiles generated from multiple sequence alignment, representative amino acid properties, a slow learning rate, overfitting protection, and an optimized sliding-window size. More than 200,000 weights in SPINE are optimized by maximizing the accuracy measured by Q3 (the percentage of correctly classified residues). SPINE yields a 10-fold cross-validated accuracy of 79.5% (80.0% for chains of length between 50 and 300) in secondary-structure prediction after one-month (CPU time) training on 22 processors. An accuracy of 87.5% is achieved for exposed residues (RSA >95%). The latter approaches the theoretical upper limit of 88-90% accuracy in assigning secondary structures. An accuracy of 73% for three-state solvent-accessibility prediction (25%/75% cutoff) and 79.3% for two-state prediction (25% cutoff) is also obtained. Proteins 2007. © 2006 Wiley-Liss, Inc. [ABSTRACT FROM AUTHOR]
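The two measures quoted in the abstract can be illustrated with a minimal Python sketch. It is not SPINE's code. The function names, the H/E/C state letters, the example strings, and the handling of values exactly at a cutoff are all assumptions made for illustration; the abstract gives only the definition of Q3 and the 25%/75% cutoffs.

```python
def q3_accuracy(predicted, observed):
    """Return the percentage of residues whose predicted state matches the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if len(observed) == 0:
        raise ValueError("at least one residue is required")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


def rsa_three_state(rsa_percent, low=25.0, high=75.0):
    """Bin a relative solvent accessibility (in percent) into three states.

    Assumption: a value equal to a cutoff falls into the higher bin.
    """
    if rsa_percent < low:
        return "buried"
    if rsa_percent < high:
        return "intermediate"
    return "exposed"


def rsa_two_state(rsa_percent, cutoff=25.0):
    """Bin a relative solvent accessibility (in percent) into two states (same boundary assumption)."""
    return "buried" if rsa_percent < cutoff else "exposed"


# Hypothetical 10-residue chain: H = helix, E = strand, C = coil.
predicted = "HHHHEECCCC"
observed = "HHHEEECCCC"
print(q3_accuracy(predicted, observed))  # one mismatch out of ten residues
print(rsa_three_state(40.0), rsa_two_state(40.0))
```

A ten-fold cross-validated Q3, as reported above, would average this score over ten held-out subsets of the chains, each predicted by a model trained on the other nine.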
Details
- Language :
- English
- ISSN :
- 08873585
- Volume :
- 66
- Issue :
- 4
- Database :
- Complementary Index
- Journal :
- Proteins
- Publication Type :
- Academic Journal
- Accession number :
- 64228633
- Full Text :
- https://doi.org/10.1002/prot.21298