A novel method for feature selection based on molecular interactive effect network.
- Source :
- Journal of Pharmaceutical & Biomedical Analysis. Sep2022, Vol. 218, pN.PAG-N.PAG. 1p.
- Publication Year :
- 2022
Abstract
- Analyzing biological data while accounting for molecular interactions may lead to more accurate identification of disease-related biomarkers. In this study, a novel feature selection method based on a molecule (feature) interactive effect network is proposed, denoted Distance Correlation Gain-Network (DCG-Net). In DCG-Net, the distance correlation gain (DCG) is defined to measure the interactive effect between a pair of features with respect to the process of physiological and pathological change, and it is used to infer the molecule interactive effect network. The DCG index is suitable for both discrete and continuous random variables. A greedy search strategy is then developed to find informative modules of interacting features that have high statistical dependence on disease outcome. To evaluate its performance, DCG-Net was compared with eight representative feature selection techniques (t-test, ReliefF, SVM-RFE, mRMR, IG-RFE, INDEED, MN-PCC and Dcor-SFS) on ten public datasets. The experimental results showed superior performance of DCG-Net in classification accuracy, sensitivity, and specificity for three different classifiers. Subsequently, DCG-Net was used to analyze a lung adenocarcinoma metabolomics dataset; the selected metabolites were involved in important pathways and showed better discrimination ability. The experiments demonstrate that DCG can effectively detect molecular interactions, and that incorporating these interactions helps identify informative biomarkers reflecting the occurrence and development of complex diseases.
• A new method is proposed to extract important information based on feature interactions.
• The distance correlation gain, suitable for continuous and discrete random variables, is defined.
• A greedy search strategy is developed to find the informative modules in the interactive effect network.
• Experiments on public datasets and an application to metabolomics data showed the validity of the method.
• The distance correlation gain is used to explore the interactions between features and construct the molecular network.
[ABSTRACT FROM AUTHOR]
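The abstract builds on distance correlation but does not state the DCG formula. As a rough illustration only, the sketch below computes the standard sample distance correlation (Székely et al.) in plain Python. The function `interaction_gain` is a hypothetical stand-in, not the paper's DCG definition: it assumes that an interactive effect can be approximated by how much a feature pair taken jointly exceeds the better of the two single features in dependence on the outcome. All function names are assumptions made for this example.

```python
import math


def _as_vector(value):
    """Treat a scalar observation as a 1-D point; keep tuples/lists as points."""
    if isinstance(value, (list, tuple)):
        return tuple(float(v) for v in value)
    return (float(value),)


def _centered_distances(samples):
    """Pairwise Euclidean distance matrix, double-centered."""
    points = [_as_vector(s) for s in samples]
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    row_mean = [sum(row) / n for row in dist]
    grand_mean = sum(row_mean) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [
        [dist[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation between two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)

    def mean_product(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / (n * n)

    dcov2 = mean_product(a, b)      # squared distance covariance
    dvar_x = mean_product(a, a)     # squared distance variance of x
    dvar_y = mean_product(b, b)     # squared distance variance of y
    denom = math.sqrt(dvar_x * dvar_y)
    if denom == 0.0:
        return 0.0                  # a constant sample carries no dependence
    return math.sqrt(max(dcov2, 0.0) / denom)


def interaction_gain(xi, xj, y):
    """HYPOTHETICAL illustration, not the paper's DCG: joint dependence of
    (xi, xj) on y minus the larger of the two single-feature dependences."""
    joint = list(zip(xi, xj))
    best_single = max(distance_correlation(xi, y), distance_correlation(xj, y))
    return distance_correlation(joint, y) - best_single
```

Because the computation only needs pairwise distances between observations, the same code accepts discrete labels encoded as numbers and continuous measurements alike. An XOR-style example such as `interaction_gain([0, 0, 1, 1], [0, 1, 0, 1], [0, 1, 1, 0])` returns a positive value, since neither feature alone is associated with the outcome while the pair is.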
Details
- Language :
- English
- ISSN :
- 07317085
- Volume :
- 218
- Database :
- Academic Search Index
- Journal :
- Journal of Pharmaceutical & Biomedical Analysis
- Publication Type :
- Academic Journal
- Accession number :
- 157498907
- Full Text :
- https://doi.org/10.1016/j.jpba.2022.114873