An efficient Incremental Wrapper-based Information Gain Gene Subset Selection (IG based on IWSSr) method for Tumor Discernment.

Authors :
Fatima, Alia
Nazir, Tahira
Nazir, Aiman Khan
Din, Atta Mohyi U.
Source :
Multimedia Tools & Applications; Jul 2024, Vol. 83 Issue 24, p64741-64766, 26p
Publication Year :
2024

Abstract

Tumors are among the deadliest diseases, and the number of cases is increasing rapidly. Researchers worldwide are working extensively on tumor diagnosis and discernment, applying machine learning algorithms to observations stored as datasets. Tumor-related datasets are high-dimensional and contain many genes, most of which are not prognostic; some are irrelevant and others redundant. Here, we propose a methodology named IG based on IWSSr-Random Forest (RF). It ranks genes by Information Gain, then selects genes incrementally in a wrapper, using RF to evaluate the importance of each candidate gene, so that only the most relevant prognostic genes are kept. RF is also used for the final classification. Experiments are performed on nine publicly available tumor-related datasets, with accuracy, confusion matrix, precision, F-measure, and recall as performance measures. The genes selected and the accuracy obtained on each dataset are: Colon, 3 of 2000 genes, 88.71%; Central Nervous System, 5 of 7129 genes, 71.67%; Leukemia, 3 of 7129 genes, 98.61%; Breast Cancer, 5 of 24,481 genes, 79.38%; Lung Cancer, 7 of 12,601 genes, 93.60%; Ovarian Cancer, 5 of 15,154 genes, 99.60%; Lymphoma, 2 of 4026 genes, 92.42%; MLL, 5 of 12,582 genes, 95.83%; and SRBCT, 4 of 2308 genes, 92.77%. The experimental results show that IG based on IWSSr(RF) performs well compared with state-of-the-art algorithms such as RF, Naïve Bayes, KNN, and Decision Tree. IG based on IWSSr(RF) also has nominal time complexity compared with the above-mentioned classification algorithms. [ABSTRACT FROM AUTHOR]
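The pipeline described in the abstract (rank genes by Information Gain, then grow a subset incrementally inside a wrapper) can be illustrated with a minimal sketch. This is not the authors' implementation, and several details are assumptions: the median split used to discretise expression values, the replacement step (a candidate gene may also be swapped in for an already selected gene, which is the usual reading of IWSSr), the function names, and the toy data. To keep the sketch dependency-free, the wrapper is scored with a leave-one-out 1-nearest-neighbour stand-in rather than the Random Forest used in the paper; any evaluator with the same signature could be passed in instead.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels):
    """IG of one gene after a median split (the split rule is an assumption)."""
    cut = sorted(values)[len(values) // 2]
    parts = {}
    for v, y in zip(values, labels):
        parts.setdefault(v >= cut, []).append(y)
    conditional = sum(len(p) / len(labels) * entropy(p) for p in parts.values())
    return entropy(labels) - conditional

def loo_accuracy(X, y, genes):
    """Stand-in evaluator (NOT Random Forest): leave-one-out 1-NN accuracy."""
    hits = 0
    for i in range(len(X)):
        nearest = min((j for j in range(len(X)) if j != i),
                      key=lambda j: sum((X[i][g] - X[j][g]) ** 2 for g in genes))
        hits += y[nearest] == y[i]
    return hits / len(X)

def ig_iwssr(X, y, evaluate=loo_accuracy):
    """Rank genes by IG, then add or swap them in one at a time if the score improves."""
    ranked = sorted(range(len(X[0])),
                    key=lambda g: information_gain([row[g] for row in X], y),
                    reverse=True)
    selected, best = [], 0.0
    for g in ranked:
        # Candidate subsets: append g, or replace each already selected gene with g.
        candidates = [selected + [g]]
        candidates += [selected[:k] + [g] + selected[k + 1:]
                       for k in range(len(selected))]
        score, subset = max(((evaluate(X, y, c), c) for c in candidates),
                            key=lambda pair: pair[0])
        if score > best:  # keep the change only on strict improvement
            selected, best = subset, score
    return selected, best

# Toy data (hypothetical): 8 samples, 4 genes; only gene index 2 separates the classes.
X = [[0.1, 5.0, 1.0, 3.0], [0.9, 4.0, 1.2, 1.0], [0.4, 6.0, 0.9, 2.0],
     [0.7, 5.5, 1.1, 4.0], [0.2, 4.5, 3.0, 3.5], [0.8, 5.2, 3.2, 1.5],
     [0.5, 6.1, 2.9, 2.5], [0.6, 4.1, 3.1, 0.5]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

selected, score = ig_iwssr(X, y)
print("selected gene indices:", selected, "wrapper score:", score)
```

Because a gene is kept only when it strictly improves the wrapper score, the selected subset stays small, which is consistent with the handful of genes per dataset reported in the abstract.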

Details

Language :
English
ISSN :
13807501
Volume :
83
Issue :
24
Database :
Complementary Index
Journal :
Multimedia Tools & Applications
Publication Type :
Academic Journal
Accession number :
178996665
Full Text :
https://doi.org/10.1007/s11042-023-18046-2