Evaluating and comparing bagging and boosting of hybrid learning for breast cancer screening

Authors :
Asma Zizaan
Ali Idri
Source :
Scientific African, Vol 23, Pp e01989 (2024)
Publication Year :
2024
Publisher :
Elsevier, 2024.

Abstract

Purpose: Ensemble methods are widely used in many fields, including bioinformatics, for prediction tasks such as classification and regression because they can improve the accuracy and robustness of machine learning models. Given that breast cancer is the most prevalent cancer and the primary cause of death for women, researchers have started using ensemble approaches to advance their studies in this area. In practice, ensemble methods can be used to develop clinical decision support systems that help healthcare providers make more accurate judgements. Methods: In the present study, we construct and evaluate bagging and boosting ensemble methods for the binary classification of breast cancer screening images from the CBIS-DDSM dataset. To build the bagging ensembles, we chose a hybrid architecture consisting of three feature extractors (Inception V3, MobileNet V2, DenseNet 201) and four classifiers (K-nearest neighbors, multilayer perceptron, support vector machine, decision trees). The boosting ensembles combine the same three feature extractors with a decision tree-based classifier and four boosting methods (AdaBoost, GBM, XGBoost, LightGBM). Results: Both ensemble methods were evaluated on the mammography dataset using four performance metrics (accuracy, sensitivity, F1-score, and recall), together with the Scott-Knott statistical test and the Borda count voting system to rank all the possible model combinations. The results showed that bagging and boosting ensembles generally outperformed both their base learners and single models, and that the best-performing ensemble overall was gradient boosting with DenseNet 201 as feature extractor and 200 trees, which achieved an accuracy of 85.50 %. Conclusions: The purpose of this research was to use hybrid learning-based ensembles to make the most of the combined performance of multiple models. It illustrated the potential of ML approaches for the binary classification of mammography data used for breast cancer screening.
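
The Borda count ranking mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and may differ from their exact procedure. The function name `borda_rank`, the model names, and all metric values below are hypothetical placeholders, not results from the paper. For each metric, every model earns one point per model it beats, and the points are summed across metrics.

```python
from collections import defaultdict


def borda_rank(scores):
    """Rank models by Borda count.

    scores maps a model name to a dict of metric -> value (higher is better).
    Returns (model, points) pairs, best first. Ties inside one metric are
    broken by input order, which is a simplification.
    """
    models = list(scores)
    metrics = sorted({metric for values in scores.values() for metric in values})
    points = defaultdict(int)
    for metric in metrics:
        # Best model first for this metric.
        ordered = sorted(models, key=lambda name: scores[name][metric], reverse=True)
        for position, name in enumerate(ordered):
            # The top model gets n-1 points, the last one gets 0.
            points[name] += len(models) - 1 - position
    return sorted(points.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical metric values, for illustration only.
example_scores = {
    "model_a": {"accuracy": 0.80, "f1": 0.78, "recall": 0.75},
    "model_b": {"accuracy": 0.85, "f1": 0.80, "recall": 0.70},
    "model_c": {"accuracy": 0.70, "f1": 0.79, "recall": 0.90},
}

ranking = borda_rank(example_scores)
for name, total in ranking:
    print(name, total)
```

With these placeholder values, `model_b` ranks first because it leads on two of the three metrics, even though it is last on recall. This shows how a Borda count rewards consistently good performance across metrics instead of a single best score.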

Details

Language :
English
ISSN :
24682276
Volume :
23
Issue :
e01989
Database :
Directory of Open Access Journals
Journal :
Scientific African
Publication Type :
Academic Journal
Accession number :
edsdoj.bc562ee48de4d8a9dc761f187dd5ea3
Document Type :
article
Full Text :
https://doi.org/10.1016/j.sciaf.2023.e01989