
Predicting Drug Resistance in Mycobacterium tuberculosis: A Machine Learning Approach to Genomic Mutation Analysis.

Authors :
Paredes-Gutierrez, Guillermo
Perea-Jacobo, Ricardo
Acosta-Mesa, Héctor-Gabriel
Mezura-Montes, Efren
Morales Reyes, José Luis
Zenteno-Cuevas, Roberto
Guerrero-Chevannier, Miguel-Ángel
Muñiz-Salazar, Raquel
Flores, Dora-Luz
Source :
Diagnostics (2075-4418); Feb2025, Vol. 15 Issue 3, p279, 16p
Publication Year :
2025

Abstract

Background/Objectives: Tuberculosis (TB), caused by Mycobacterium tuberculosis (M. tuberculosis), remains a leading cause of death from infectious diseases globally. The treatment of active TB relies on first- and second-line drugs; however, the emergence of drug resistance poses a significant challenge to global TB control efforts. Recent advances in whole-genome sequencing combined with machine learning have shown promise in predicting drug resistance. This study aimed to evaluate the performance of four machine learning models in classifying resistance to ethambutol, isoniazid, and rifampicin in M. tuberculosis isolates.

Methods: Four machine learning models, namely an Extreme Gradient Boosting Classifier (XGBC), a Logistic Gradient Boosting Classifier (LGBC), a Gradient Boosting Classifier (GBC), and an Artificial Neural Network (ANN), were trained using a Variant Call Format (VCF) dataset preprocessed by the CRyPTIC consortium. Three datasets were used: the original dataset, a principal component analysis (PCA)-reduced dataset, and a dataset prioritizing significant mutations identified by the XGBC model. The models were trained and tested across these datasets, and their performance was compared using sensitivity, specificity, precision, F1-score, and accuracy.

Results: All models were applied to the PCA-reduced dataset, while the XGBC model was also evaluated using the mutation-prioritized dataset. The XGBC model trained on the original dataset outperformed the others, achieving sensitivity values of 0.97, 0.90, and 0.94; specificity values of 0.97, 0.99, and 0.96; and F1-scores of 0.93, 0.94, and 0.92 for ethambutol, isoniazid, and rifampicin, respectively. These results demonstrate the superior accuracy of the XGBC model in classifying drug resistance.

Conclusions: The study highlights the effectiveness of using a binary representation of mutations to train the XGBC model for predicting resistance and susceptibility to key TB drugs. The XGBC model trained on the original dataset demonstrated the highest performance among the evaluated models, suggesting its potential for clinical application in combating drug-resistant tuberculosis. Further research is needed to validate and expand these findings for broader implementation in TB diagnostics. [ABSTRACT FROM AUTHOR]
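
The abstract mentions a binary representation of mutations and compares models by sensitivity, specificity, precision, F1-score, and accuracy. The following is a minimal sketch of those two ideas in plain Python, not the authors' pipeline. The function names `binary_mutation_matrix` and `classification_metrics`, the isolate identifiers, the mutation labels, and the toy labels are all illustrative assumptions; none of the values come from the study or from the CRyPTIC dataset.

```python
from typing import Dict, List, Sequence, Set, Tuple


def binary_mutation_matrix(
    isolate_variants: Dict[str, Set[str]],
) -> Tuple[List[str], List[str], List[List[int]]]:
    """Encode each isolate as a 0/1 row: 1 if the mutation is present, else 0.

    Hypothetical helper: the input maps an isolate ID to the set of mutation
    labels called for it (for example, labels parsed from a VCF file).
    """
    isolates = sorted(isolate_variants)
    # Columns are the sorted union of every mutation seen in any isolate.
    mutations = sorted(set().union(*isolate_variants.values()))
    matrix = [
        [1 if mutation in isolate_variants[isolate] else 0 for mutation in mutations]
        for isolate in isolates
    ]
    return isolates, mutations, matrix


def classification_metrics(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Dict[str, float]:
    """Compute the five metrics from binary labels (1 = resistant, 0 = susceptible)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(numerator: int, denominator: int) -> float:
        # Return NaN instead of raising when a metric is undefined.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),   # true positive rate (recall)
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "precision": ratio(tp, tp + fp),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
        "accuracy": ratio(tp + tn, len(pairs)),
    }


if __name__ == "__main__":
    # Toy input only: isolate IDs and mutation labels are illustrative.
    toy_variants = {
        "isolate_1": {"rpoB_S450L", "katG_S315T"},
        "isolate_2": {"embB_M306V"},
        "isolate_3": set(),
    }
    isolates, mutations, matrix = binary_mutation_matrix(toy_variants)
    print(mutations)
    for isolate, row in zip(isolates, matrix):
        print(isolate, row)

    # Toy labels only: these are not predictions from any real model.
    print(classification_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1]))
```

In a workflow like the one the abstract describes, a matrix of this shape would be the feature input to each classifier, and the metrics would be computed separately for each drug.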

Details

Language :
English
ISSN :
20754418
Volume :
15
Issue :
3
Database :
Complementary Index
Journal :
Diagnostics (2075-4418)
Publication Type :
Academic Journal
Accession number :
182985663
Full Text :
https://doi.org/10.3390/diagnostics15030279