Predicting protein sumoylation sites from sequence features

Authors :
Hong Luo
Liangjiang Wang
Shaolei Teng
Source :
Amino Acids. 43:447-455
Publication Year :
2011
Publisher :
Springer Science and Business Media LLC, 2011.

Abstract

Protein sumoylation is a post-translational modification that plays an important role in a wide range of cellular processes. Small ubiquitin-related modifier (SUMO) can be covalently and reversibly conjugated to the sumoylation sites of target proteins, many of which are implicated in various human genetic disorders. The accurate prediction of protein sumoylation sites may help biomedical researchers to design their experiments and understand the molecular mechanism of protein sumoylation. In this study, a new machine learning approach has been developed for predicting sumoylation sites from protein sequence information. Random forests (RFs) and support vector machines (SVMs) were trained with the data collected from the literature. Domain-specific knowledge in terms of relevant biological features was used for input vector encoding. It was shown that RF classifier performance was affected by the sequence context of sumoylation sites, and 20 residues with the core motif ΨKXE in the middle appeared to provide enough context information for sumoylation site prediction. The RF classifiers were also found to outperform SVM models for predicting protein sumoylation sites from sequence features. The results suggest that the machine learning approach gives rise to more accurate prediction of protein sumoylation sites than the other existing methods. The accurate classifiers have been used to develop a new web server, called seeSUMO (http://bioinfo.ggc.org/seesumo/), for sequence-based prediction of protein sumoylation sites.
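The sequence context described in the abstract (20 residues with the core motif ΨKXE in the middle) can be illustrated with a minimal sketch. This is not the seeSUMO implementation. The function name `motif_windows`, the residue set used for Ψ, the 8-residue flanks on each side of the 4-residue motif, and the `-` padding at the sequence termini are all assumptions made for illustration. The sketch only locates candidate lysines by the consensus motif and extracts their windows; it does not encode features or train a classifier.

```python
import re

# Assumption: Ψ (a large hydrophobic residue) is taken to be one of F, I, L, M, V.
# The abstract does not define this set.
HYDROPHOBIC = "FILMV"

# Lookahead so that overlapping motif occurrences are all reported.
MOTIF = re.compile(rf"(?=([{HYDROPHOBIC}]K[A-Z]E))")


def motif_windows(seq, window=20):
    """Return (lysine_position, window_sequence) for each ΨKXE match.

    lysine_position is 1-based. Each window has `window` residues with the
    4-residue motif in the middle; positions beyond the sequence termini are
    padded with '-' (an assumption, not taken from the abstract).
    """
    seq = seq.upper()
    flank = (window - 4) // 2          # residues on each side of the motif
    padded = "-" * flank + seq + "-" * flank
    results = []
    for match in MOTIF.finditer(seq):
        start = match.start()          # 0-based index of Ψ in seq
        lysine_position = start + 2    # 1-based index of the K that follows Ψ
        # In `padded`, Ψ sits at start + flank, so the window begins at `start`.
        results.append((lysine_position, padded[start:start + window]))
    return results


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real protein.
    for position, context in motif_windows("GGGGGGGGGGVKTEGGGGGGGGGG"):
        print(position, context)
```

Windows extracted this way would still need to be converted into numeric feature vectors before being passed to a random forest or SVM; the abstract does not specify that encoding, so it is left out here.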

Details

ISSN :
1438-2199 and 0939-4451
Volume :
43
Database :
OpenAIRE
Journal :
Amino Acids
Accession number :
edsair.doi.dedup.....3d1692c42c67a268f2bd822f50ee3bea
Full Text :
https://doi.org/10.1007/s00726-011-1100-2