Comparison of Feature Selection Methods in Breast Cancer Microarray Data.

Authors :
Agraz, Melih
Source :
Medical Records; 2023, Vol. 5 Issue 2, p284-289, 6p
Publication Year :
2023

Abstract

Aim: We aim to predict metastasis in breast cancer patients with conventional tree-based machine learning algorithms and to determine which feature selection method is most effective at reducing the number of features in microarray breast cancer data. Material and Methods: The feature selection methods least absolute shrinkage and selection operator (LASSO), Boruta, and maximum relevance-minimum redundancy (MRMR), together with statistical preprocessing steps, were applied before conventional tree-based machine learning methods such as Decision Tree, Extremely Randomized Trees (Extra-trees), and Gradient Boosting Tree were applied to the microarray breast cancer data. Results: Microarray data with 54675 features from 202 breast cancer patients (101 with and 101 without metastases) were first reduced to 235 features; the feature selection algorithms were then applied, and the most important features were identified with the tree-based machine learning algorithms. The highest recall and F-measure values were obtained with the XGBoost method, and the highest precision value with the Extra-trees method. The 10 arrays out of 54675 with the highest variable importance were listed. Conclusion: The most accurate results were obtained from the statistically preprocessed data with the XGBoost and Extra-trees machine learning algorithms. Statistical and microarray preprocessing steps would be sufficient for machine learning analysis of microarray data in breast cancer metastasis prediction. [ABSTRACT FROM AUTHOR]
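
The abstract compares classifiers by precision, recall, and F-measure. As a reminder of how these three quantities relate, here is a minimal sketch in plain Python. It is not the author's code: the function name `precision_recall_f1` and the toy labels are hypothetical, "metastasis" is assumed to be the positive class (coded 1), and the F-measure is assumed to be the usual F1 score (the harmonic mean of precision and recall).

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for a binary classification task.

    `positive` is the label treated as the positive class
    (assumed here to mean "metastasis").
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Toy labels invented for illustration only; they are not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

A model can lead on recall while another leads on precision, as reported for XGBoost and Extra-trees respectively, because recall penalizes missed positive cases and precision penalizes false alarms.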

Details

Language :
English
ISSN :
26874555
Volume :
5
Issue :
2
Database :
Complementary Index
Journal :
Medical Records
Publication Type :
Academic Journal
Accession number :
164123498
Full Text :
https://doi.org/10.37990/medr.1202671