
Feature selection via minimizing global redundancy for imbalanced data

Authors :
Chuan Luo
Hongmei Chen
Hao Chen
Tianrui Li
Shuhao Huang
Source :
Applied Intelligence. 52:8685-8707
Publication Year :
2021
Publisher :
Springer Science and Business Media LLC, 2021.

Abstract

Mining knowledge from imbalanced data is challenging due to the uneven distribution of classes and the increasing dimensionality of data accumulated from real-life applications. Selecting informative features from imbalanced data is especially important for building an effective learning method. The global redundancy and the effect of the imbalanced distribution need to be considered simultaneously. In this study, a feature selection method that considers the imbalanced distribution of classes in data is investigated by embedding the weighted constraint on the majority class into the global redundancy minimization (GRM) framework. Global redundancy minimization is achieved through an objective function that contains a feature redundancy matrix and feature scores. A new form of regularization to a within-class scatter matrix is first presented, which emphasizes the minority class and replaces the redundancy measurement approach. Then, after employing this new form of a within-class scatter matrix in GRM and taking the between-class distance as the GRM input score, a GRM-based discriminant feature selection algorithm (GRM-DFS) is proposed. Comparison studies on a within-class scatter matrix with different forms of regularization indicate that the proposed form of a within-class scatter matrix is effective when dealing with imbalanced data. Experiments on public imbalanced datasets indicate that GRM-DFS is effective.
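
The two ingredients named in the abstract, a per-feature between-class score and a within-class scatter matrix that emphasizes the minority class, can be illustrated with a minimal sketch. This is not the paper's GRM-DFS formulation: the function names are hypothetical, and the inverse-class-frequency weight is an assumed stand-in for the regularization the authors propose, which the abstract does not specify.

```python
import numpy as np

def between_class_scores(X, y):
    """Per-feature between-class scatter (the diagonal of S_b)."""
    overall_mean = X.mean(axis=0)
    scores = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        scores += Xc.shape[0] * (Xc.mean(axis=0) - overall_mean) ** 2
    return scores

def weighted_within_scatter(X, y):
    """Within-class scatter with an ASSUMED inverse-frequency class weight.

    The weight n / (k * n_c) makes a minority class count for more than
    its raw sample count; it is an illustrative choice, not the paper's.
    """
    n, d = X.shape
    classes, counts = np.unique(y, return_counts=True)
    Sw = np.zeros((d, d))
    for c, n_c in zip(classes, counts):
        centered = X[y == c] - X[y == c].mean(axis=0)
        weight = n / (len(classes) * n_c)
        Sw += weight * centered.T @ centered
    return Sw

# Synthetic imbalanced data: 90 majority samples, 10 minority samples.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (90, 4)), rng.normal(1, 1, (10, 4))])
y = np.array([0] * 90 + [1] * 10)

scores = between_class_scores(X, y)   # candidate input scores, one per feature
Sw = weighted_within_scatter(X, y)    # candidate feature-by-feature matrix
```

In a GRM-style pipeline, a vector like `scores` would play the role of the input feature scores and a matrix like `Sw` the role of the redundancy term; how the two are combined in the objective function is described in the full text, not in this record.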

Details

ISSN :
1573-7497 and 0924-669X
Volume :
52
Database :
OpenAIRE
Journal :
Applied Intelligence
Accession number :
edsair.doi...........80d9277c14e64328ca67b22f26ede876
Full Text :
https://doi.org/10.1007/s10489-021-02855-9