Identifying Neuropeptides via Evolutionary and Sequential Based Multi-Perspective Descriptors by Incorporation With Ensemble Classification Strategy

Authors :
Shahid Akbar
Heba G. Mohamed
Hashim Ali
Aamir Saeed
Aftab Ahmed Khan
Sarah Gul
Ashfaq Ahmad
Farman Ali
Yazeed Yasin Ghadi
Muhammad Assam
Source :
IEEE Access, Vol 11, Pp 49024-49034 (2023)
Publication Year :
2023
Publisher :
IEEE, 2023.

Abstract

Neuropeptides (NPs) are neuromodulators/neurotransmitters that act as signaling molecules in the central nervous system and play major roles in physiological and hormone regulation. Recently, machine learning-based methods for identifying therapeutic agents have gained the attention of researchers due to their high and reliable prediction results. However, the existing predictors remain unsatisfactory because of their high execution cost and low predictive performance. Therefore, a reliable predictor is needed so that scientists can identify NPs effectively. In this study, we present an automatic and computationally efficient model for identifying NPs. The evolutionary information is formulated using a bigram position-specific scoring matrix (Bi-PSSM) and K-spaced bigram (KSB). Moreover, for noise reduction, a discrete wavelet transform (DWT) is applied to form highly discriminative Bi-PSSM_DWT and KSB_DWT vectors. In addition, one-hot encoding is employed to collect sequential features from peptide samples. Finally, a multi-perspective feature set of sequential and embedded evolutionary information is formed. The optimal features are selected via Shapley Additive exPlanations (SHAP) by evaluating the contribution of each extracted feature. The selected features are used to train six classification models, i.e., XGB, ETC, SVM, ADA, FKNN, and LGBM. The predicted labels of these learners are then provided to a genetic algorithm to form an ensemble classifier. Our model achieved predictive accuracies of 94.47% and 92.55% on training sequences and independent sequences, respectively, which is approximately 3% higher than existing methods. We suggest that the presented tool will be beneficial and may play a substantial role in drug development and academic research.
The source code and all datasets are publicly available at https://github.com/shahidawkum/Target-ensC_NP.
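
The evolutionary descriptors named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the single-level Haar wavelet, the choice to keep only approximation coefficients, and the spacing k=2 are all assumptions made for illustration, since the abstract does not state these details. The bigram formula used here is the common definition B[m, n] = sum over i of P[i, m] * P[i + k, n] for an (L, 20) PSSM P.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed column order, for illustration only


def k_spaced_bigram(pssm, k=1):
    """Flattened k-spaced bigram of an (L, n_cols) PSSM; k=1 is the plain bigram."""
    pssm = np.asarray(pssm, dtype=float)
    if pssm.ndim != 2 or k < 1 or pssm.shape[0] <= k:
        raise ValueError("PSSM must be 2-D with more than k rows, and k >= 1")
    # B[m, n] = sum_i P[i, m] * P[i + k, n]
    return (pssm[:-k].T @ pssm[k:]).ravel()


def haar_dwt(signal):
    """Single-level Haar DWT; returns (approximation, detail) coefficients."""
    x = np.asarray(signal, dtype=float)
    if x.size % 2:  # pad odd-length input by repeating the last value
        x = np.append(x, x[-1])
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return approx, detail


def one_hot(sequence, alphabet=AMINO_ACIDS):
    """(L, len(alphabet)) one-hot matrix; unknown residues become all-zero rows."""
    index = {aa: i for i, aa in enumerate(alphabet)}
    encoded = np.zeros((len(sequence), len(alphabet)))
    for row, residue in enumerate(sequence):
        col = index.get(residue)
        if col is not None:
            encoded[row, col] = 1.0
    return encoded


def evolutionary_features(pssm, k=2):
    """Concatenate DWT approximation coefficients of the bigram and k-spaced bigram.

    Keeping only the low-frequency (approximation) half is an assumed
    denoising choice, not a detail taken from the paper.
    """
    bi_dwt, _ = haar_dwt(k_spaced_bigram(pssm, k=1))
    ksb_dwt, _ = haar_dwt(k_spaced_bigram(pssm, k=k))
    return np.concatenate([bi_dwt, ksb_dwt])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    toy_pssm = rng.normal(size=(30, 20))  # stand-in for a real PSSM
    print(evolutionary_features(toy_pssm).shape)
    print(one_hot("ACDKW").shape)
```

In this sketch each 400-value bigram vector is halved by the wavelet step, so a peptide of any length maps to a fixed 400-value evolutionary vector. Feature selection with SHAP and the genetic-algorithm ensemble described in the abstract are not covered here.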

Details

Language :
English
ISSN :
21693536
Volume :
11
Database :
Directory of Open Access Journals
Journal :
IEEE Access
Publication Type :
Academic Journal
Accession number :
edsdoj.343e0b4c5a0e45fb9cf62fbe13f70545
Document Type :
article
Full Text :
https://doi.org/10.1109/ACCESS.2023.3274601