1. A Comparative Analysis of Ensemble Based Machine Learning Techniques for Diabetes Identification
- Author
Ishrak Mahmud, Nahid Hossain Taz, and Abrar Islam
- Subjects
business.industry, Computer science, media_common.quotation_subject, education, Machine learning, computer.software_genre, Random forest, Identification (information), Robustness (computer science), Voting, Classifier (linguistics), Artificial intelligence, AdaBoost, business, F1 score, Pima indian diabetes, computer, media_common
- Abstract
This manuscript develops a framework for predicting diabetes using ensemble-based machine learning techniques on the Pima Indian Diabetes Dataset (PIDD). We evaluate four machine learning classifiers individually (LightGBM, XGBoost, AdaBoost, and Random Forest) and then combine them in a soft voting classifier to improve the robustness of the performance metrics for diabetes identification. Each model is assessed by its average classification accuracy, precision, recall, F1 score, and ROC AUC. The results show that LightGBM is the best individual classifier, with an accuracy of 94% and a ROC AUC of 95%, and that the soft voting classifier improves on it further, reaching an accuracy of 95% and a ROC AUC of 96%.
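The soft voting step described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names `soft_vote` and `binary_metrics`, the equal default weights, and every probability and label in the toy data are assumptions made up for illustration. In practice the per-model probabilities would come from fitted LightGBM, XGBoost, AdaBoost, and Random Forest models; scikit-learn's `VotingClassifier` with `voting="soft"` provides the same averaging.

```python
from typing import Dict, List, Optional, Sequence


def soft_vote(
    probas: Sequence[Sequence[Sequence[float]]],
    weights: Optional[Sequence[float]] = None,
) -> List[int]:
    """Soft voting: average class probabilities across models, then take argmax.

    probas[m][i][c] is the probability model m assigns to class c for sample i.
    Ties go to the lowest class index.
    """
    n_models = len(probas)
    if weights is None:
        weights = [1.0] * n_models  # assumption: all models weighted equally
    total = sum(weights)
    n_samples = len(probas[0])
    n_classes = len(probas[0][0])
    labels = []
    for i in range(n_samples):
        avg = [
            sum(w * probas[m][i][c] for m, w in enumerate(weights)) / total
            for c in range(n_classes)
        ]
        labels.append(max(range(n_classes), key=avg.__getitem__))
    return labels


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": correct / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up [P(no diabetes), P(diabetes)] outputs of four hypothetical models
# on four samples; these numbers are illustrative, not results from the paper.
toy_probas = [
    [[0.90, 0.10], [0.40, 0.60], [0.45, 0.55], [0.10, 0.90]],
    [[0.80, 0.20], [0.30, 0.70], [0.60, 0.40], [0.45, 0.55]],
    [[0.70, 0.30], [0.55, 0.45], [0.70, 0.30], [0.60, 0.40]],
    [[0.60, 0.40], [0.20, 0.80], [0.55, 0.45], [0.30, 0.70]],
]
toy_labels = [0, 1, 1, 1]

ensemble_pred = soft_vote(toy_probas)
print(ensemble_pred)
print(binary_metrics(toy_labels, ensemble_pred))
```

On this toy data the third model alone would label the second and fourth samples negative, while the averaged probabilities label both positive, which is the kind of smoothing over individual model errors that soft voting is meant to provide.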
- Published
- 2021