Prediction of hot spots towards drug discovery by protein sequence embedding with 1D convolutional neural network.

Authors :
Zhang, Youzhi
Yao, Sijie
Chen, Peng
Source :
PLoS ONE. 9/18/2023, Vol. 18 Issue 9, p1-16. 16p.
Publication Year :
2023

Abstract

Protein hotspot residues are key sites that mediate protein-protein interactions. Accurate identification of these residues is essential for understanding how proteins carry out their functions and for designing drug targets. Most current research uses machine learning to predict hot spots from known interface residues, which requires manually extracting features of amino acid residues from sequence, structure, evolution, energy, and other information in order to train and test the models. This feature extraction is cumbersome and time-consuming. This paper proposes Embed-1dCNN, a method that combines a pre-trained protein sequence embedding model with a one-dimensional convolutional neural network to predict protein hotspot residues. To obtain a larger data sample, the work integrates and extracts data from the ASEdb, BID, SKEMPI and dbMPIKT datasets to generate a new dataset, and uses the SMOTE algorithm to expand the positive samples when forming the training set. Experimental results show that the method achieves an F1 score of 0.82 on the test set. Compared with other hot spot prediction methods, the model achieved better prediction performance. [ABSTRACT FROM AUTHOR]
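
The abstract names two standard building blocks without giving details: SMOTE oversampling of the positive class and evaluation by F1 score. The following is a minimal illustrative sketch of both ideas in plain Python, not the authors' implementation. The function names, the neighbour count k, the random seed, and the toy feature vectors are all assumptions made for illustration; the record does not state which SMOTE implementation, parameters, or embedding features the paper used.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic samples by interpolating between minority samples.

    Hypothetical helper: each synthetic point lies on the line segment between
    a randomly chosen minority sample and one of its k nearest minority
    neighbours, which is the core idea of SMOTE.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to base.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def f1_score(y_true, y_pred):
    """F1 for the positive class (label 1): harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy positive-class feature vectors (assumed values, not data from the paper).
positives = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
augmented = positives + smote_oversample(positives, n_new=6)
print(len(augmented), "positive samples after oversampling")
```

In practice, oversampling of this kind is applied only to the training split, so that the test-set F1 is measured on real samples alone.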

Details

Language :
English
ISSN :
19326203
Volume :
18
Issue :
9
Database :
Academic Search Index
Journal :
PLoS ONE
Publication Type :
Academic Journal
Accession number :
172004779
Full Text :
https://doi.org/10.1371/journal.pone.0290899