1. Diagnosis of gastric cancer based on hybrid genes selection approach.
- Author
Liu, Jie; Cheng, Zhong; Zhang, Jiamin; Liu, Kejun; Liu, Mengjie
- Abstract
Gastric cancer (GC) is the third leading cause of cancer death worldwide. In medicine, machine learning is widely used for genetic data mining and for building diagnostic models. This study proposes an intelligent model, DERFS-XGBoost, for rapid and accurate diagnosis of GC from gene expression data. First, GC data were collected and preprocessed. Second, ANOVA, the t-test, and fold change (FC) were used to select significantly differentially expressed genes (DEGs), random forest (RF) was used to calculate their importance, and sequential forward selection (SFS) was then used to obtain the optimal feature subset. Finally, XGBoost was used for classification after the synthetic minority oversampling technique (SMOTE) balanced the tumor and normal samples. To evaluate the results objectively, 10-fold cross-validation with 10 repeated experiments was used, and the average of each evaluation metric was used to assess classification performance. In these experiments, the DERFS-XGBoost model achieved an accuracy of 97.6%, a precision of 100%, a recall of 97.3%, an F1 score of 99%, and an area under the receiver operating characteristic curve (AUC) of 98.7%. The DERFS-XGBoost model differs from existing diagnostic models and, in comparison tests, achieved high classification performance with a small number of genes, providing a new method and basis for the diagnosis of GC. [ABSTRACT FROM AUTHOR]
- Published
- 2024
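The pipeline described in the abstract (DEG filtering, RF importance ranking, SFS, SMOTE balancing, then a boosted-tree classifier) can be sketched roughly as follows. This is an illustrative sketch under several assumptions, not the authors' implementation: the data are synthetic, every function name and threshold is hypothetical, only a Welch t-test plus a log2 fold-change cutoff stands in for the ANOVA/t-test/FC filter, SMOTE is a minimal hand-written interpolation, and scikit-learn's `GradientBoostingClassifier` is swapped in for XGBoost so that the example needs only NumPy, SciPy, and scikit-learn.

```python
import numpy as np
from scipy import stats
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score


def select_degs(X, y, p_max=0.05, min_abs_log2fc=1.0):
    """Indices of genes passing both a Welch t-test and a fold-change cutoff.

    X is assumed to hold log2 expression, so a difference of means is a log2 FC.
    """
    tumor, normal = X[y == 1], X[y == 0]
    _, p = stats.ttest_ind(tumor, normal, axis=0, equal_var=False)
    log2fc = tumor.mean(axis=0) - normal.mean(axis=0)
    return np.flatnonzero((p < p_max) & (np.abs(log2fc) >= min_abs_log2fc))


def rank_by_rf(X, y, seed=0):
    """Column indices ordered from most to least important by a random forest."""
    rf = RandomForestClassifier(n_estimators=200, random_state=seed).fit(X, y)
    return np.argsort(rf.feature_importances_)[::-1]


def forward_select(X, y, candidates, max_genes=3, seed=0):
    """Greedy SFS: add the candidate that most improves CV accuracy, else stop."""
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    chosen, best = [], 0.0
    remaining = [int(j) for j in candidates]
    while remaining and len(chosen) < max_genes:
        scores = []
        for j in remaining:
            clf = GradientBoostingClassifier(n_estimators=50, random_state=seed)
            scores.append(cross_val_score(clf, X[:, chosen + [j]], y, cv=cv).mean())
        top = int(np.argmax(scores))
        if scores[top] <= best:  # no candidate helps any more
            break
        best = scores[top]
        chosen.append(remaining.pop(top))
    return chosen, best


def smote(X_min, n_new, k=3, seed=0):
    """Minimal SMOTE: interpolate between a minority sample and a near neighbour."""
    rng = np.random.default_rng(seed)
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a sample is not its own neighbour
    neighbours = np.argsort(d, axis=1)[:, :k]
    base = rng.integers(0, len(X_min), size=n_new)
    mate = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[mate] - X_min[base])


# Synthetic stand-in data (hypothetical): 45 tumor vs 15 normal samples,
# 200 genes, of which only the first 5 are truly shifted between the groups.
rng = np.random.default_rng(0)
y = np.array([1] * 45 + [0] * 15)
X = rng.normal(size=(60, 200))
X[y == 1, :5] += 4.0

degs = select_degs(X, y)
ranked = degs[rank_by_rf(X[:, degs], y)][:6]  # top candidates, original indexing
genes, cv_acc = forward_select(X, y, ranked)

# Balance the classes with synthetic normal samples, then fit the classifier.
X_sel = X[:, genes]
synthetic = smote(X_sel[y == 0], n_new=30)
X_bal = np.vstack([X_sel, synthetic])
y_bal = np.concatenate([y, np.zeros(30, dtype=int)])
model = GradientBoostingClassifier(n_estimators=50, random_state=0).fit(X_bal, y_bal)
print("selected genes:", genes, "CV accuracy during selection:", round(cv_acc, 3))
```

For an honest performance estimate, the oversampling step would normally be applied inside each training fold rather than to the full dataset, so that synthetic samples never leak information from held-out data. The sketch fits one final model on balanced data only to show where SMOTE sits in the sequence.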