PROTA: A Robust Tool for Protamine Prediction Using a Hybrid Approach of Machine Learning and Deep Learning

Authors :
Jorge G. Farias
Lisandra Herrera-Belén
Luis Jimenez
Jorge F. Beltrán
Source :
International Journal of Molecular Sciences, Vol 25, Iss 19, p 10267 (2024)
Publication Year :
2024
Publisher :
MDPI AG, 2024.

Abstract

Protamines play a critical role in DNA compaction and stabilization in sperm cells, significantly influencing male fertility and various biotechnological applications. Identifying these proteins has traditionally been challenging and time-consuming because of their species-specific variability and complexity. Leveraging advancements in computational biology, we present PROTA, a novel tool that combines machine learning (ML) and deep learning (DL) techniques to predict protamines with high accuracy. For the first time, we integrate Generative Adversarial Networks (GANs) with supervised learning methods to enhance the accuracy and generalizability of protamine prediction. Our methodology evaluated multiple ML models, including Light Gradient-Boosting Machine (LIGHTGBM), Multilayer Perceptron (MLP), Random Forest (RF), eXtreme Gradient Boosting (XGBOOST), k-Nearest Neighbors (KNN), Logistic Regression (LR), Naive Bayes (NB), and Radial Basis Function-Support Vector Machine (RBF-SVM). During ten-fold cross-validation on our training dataset, the MLP model trained with GAN-augmented data performed best, with 0.997 accuracy, 0.997 F1 score, 0.998 precision, 0.997 sensitivity, and 1.0 AUC. In the independent testing phase, this model achieved 0.999 accuracy, 0.999 F1 score, 1.0 precision, 0.999 sensitivity, and 1.0 AUC. These results establish PROTA, accessible via a user-friendly web application, as a robust tool for protamine prediction. We anticipate that PROTA will be a crucial resource for researchers, enabling the rapid and reliable prediction of protamines and thereby advancing our understanding of their roles in reproductive biology, biotechnology, and medicine.
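
For readers less familiar with the reported metrics, the following minimal Python sketch shows how accuracy, precision, sensitivity (recall), and F1 score are conventionally derived from binary predictions. It is an illustration only and not the authors' code: the names `BinaryMetrics` and `binary_metrics` are hypothetical, the labels are invented toy data rather than results from the paper, and AUC is omitted because it requires predicted scores instead of hard labels.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    sensitivity: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute standard binary metrics from paired true and predicted labels.

    Hypothetical helper for illustration; `positive` marks the protamine class.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + sensitivity
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return BinaryMetrics(accuracy, precision, sensitivity, f1)


# Toy labels (1 = protamine, 0 = non-protamine); invented for this example.
toy_true = [1, 1, 1, 1, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 0, 1]
toy = binary_metrics(toy_true, toy_pred)
print(toy)
```

In a ten-fold cross-validation setting, such metrics would typically be computed on each held-out fold and then averaged; how the authors aggregated their fold results is not stated in this record.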

Details

Language :
English
ISSN :
1422-0067 and 1661-6596
Volume :
25
Issue :
19
Database :
Directory of Open Access Journals
Journal :
International Journal of Molecular Sciences
Publication Type :
Academic Journal
Accession number :
edsdoj.347d3ba57af9410ca0f1aab116708b39
Document Type :
article
Full Text :
https://doi.org/10.3390/ijms251910267