1. MusiteDeep: a deep-learning framework for general and kinase-specific phosphorylation site prediction
- Author
- Shuai Zeng, Dong Xu, Trupti Joshi, Yanchun Liang, Chunhui Xu, Duolin Wang, and Wang-Ren Qiu
- Subjects
- 0301 basic medicine; Statistics and Probability; Phosphorylation sites; Sequence analysis; Computer science; Feature extraction; Machine learning; Biochemistry; Convolutional neural network; 03 medical and health sciences; Sequence Analysis, Protein; Protein methods; Phosphorylation; Molecular Biology; 030102 biochemistry & molecular biology; Artificial neural network; Kinase; Deep learning; Proteins; Phosphoproteins; Original Papers; Computer Science Applications; Computational Mathematics; 030104 developmental biology; Computational Theory and Mathematics; Proteins metabolism; Neural Networks, Computer; Artificial intelligence; Protein Kinases; Software
- Abstract
Motivation: Computational methods for phosphorylation site prediction play important roles in protein function studies and experimental design. Most existing methods are based on feature extraction, which may result in incomplete or biased features. Deep learning, a cutting-edge machine learning approach, can automatically discover complex representations of phosphorylation patterns from raw sequences and hence provides a powerful tool for improving phosphorylation site prediction.
Results: We present MusiteDeep, the first deep-learning framework for predicting general and kinase-specific phosphorylation sites. MusiteDeep takes raw sequence data as input and uses convolutional neural networks with a novel two-dimensional attention mechanism. It achieves over a 50% relative improvement in the area under the precision-recall curve in general phosphorylation site prediction and obtains competitive results in kinase-specific prediction compared with other well-known tools on the benchmark data.
Availability and implementation: MusiteDeep is provided as an open-source tool available at https://github.com/duolinwang/MusiteDeep.
Supplementary information: Supplementary data are available at Bioinformatics online.
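To illustrate what "raw sequence data as input" could look like in practice, the following is a minimal sketch, not MusiteDeep's actual preprocessing. It cuts a fixed-length window around each candidate residue (S, T, or Y) and one-hot encodes it, the kind of tensor a convolutional network could consume. The window half-width, the padding symbol, the alphabet ordering, and the function names `candidate_windows` and `one_hot` are all assumptions made for illustration.

```python
# Illustrative sketch only: names, window size, and padding are assumptions,
# not taken from the MusiteDeep code base.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "-"                             # hypothetical symbol for positions past the sequence ends
ALPHABET = AMINO_ACIDS + PAD


def candidate_windows(sequence, half_width=16, residues="STY"):
    """Yield (position, window) for each candidate residue.

    position is 0-based; window has length 2 * half_width + 1 and is
    centred on the candidate residue, padded with PAD near the termini.
    """
    padded = PAD * half_width + sequence + PAD * half_width
    for i, aa in enumerate(sequence):
        if aa in residues:
            yield i, padded[i : i + 2 * half_width + 1]


def one_hot(window):
    """Encode a window as a list of one-hot rows over ALPHABET.

    Symbols outside ALPHABET (for example 'X') become all-zero rows.
    """
    index = {aa: k for k, aa in enumerate(ALPHABET)}
    return [
        [1 if index.get(aa) == k else 0 for k in range(len(ALPHABET))]
        for aa in window
    ]


if __name__ == "__main__":
    for pos, win in candidate_windows("MKSAT", half_width=2):
        print(pos, win, len(one_hot(win)))
```

Each window becomes a (window length) x 21 matrix under these assumptions; a real pipeline would choose the window size and alphabet to match the trained model.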
- Published
- 2017