Feature selection via minimizing global redundancy for imbalanced data
- Source :
- Applied Intelligence. 52:8685-8707
- Publication Year :
- 2021
- Publisher :
- Springer Science and Business Media LLC, 2021.
Abstract
- Mining knowledge from imbalanced data is challenging because of the uneven distribution of classes and the increasing dimensionality of data accumulated from real-life applications. Selecting informative features from imbalanced data is especially important for building an effective learning method, and global redundancy and the effect of the imbalanced distribution need to be considered simultaneously. In this study, a feature selection method that accounts for the imbalanced class distribution is investigated by embedding a weighted constraint on the majority class into the global redundancy minimization (GRM) framework. Global redundancy minimization is achieved through an objective function that contains a feature redundancy matrix and feature scores. First, a new form of regularization of the within-class scatter matrix is presented, which emphasizes the minority class and replaces the redundancy measurement approach. Then, by employing this new form of the within-class scatter matrix in GRM and taking the between-class distance as the GRM input score, a GRM-based discriminant feature selection algorithm (GRM-DFS) is proposed. Comparison studies on within-class scatter matrices with different forms of regularization indicate that the proposed form is effective when dealing with imbalanced data. Experiments on public imbalanced datasets indicate that GRM-DFS is effective.
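The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact objective, the weighting applied to the majority class, or the solver, so everything below is an assumption made for illustration: the objective is taken to be the quadratic form z^T A z - lam * s^T z over the probability simplex, the within-class scatter is weighted by 1/n_c per class so that the minority class is not swamped, the score is the squared distance of class means from the global mean, and the problem is solved by projected gradient descent. All function names are hypothetical, and this is not the GRM-DFS algorithm itself.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto {z : z >= 0, sum(z) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1.0)
    return np.maximum(v - theta, 0.0)


def weighted_within_scatter(X, y):
    """Within-class scatter with each class scaled by 1/n_c.

    The 1/n_c weighting is an assumed stand-in for the paper's
    regularization; it keeps the majority class from dominating.
    """
    d = X.shape[1]
    S = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        centered = Xc - Xc.mean(axis=0)
        S += centered.T @ centered / len(Xc)
    return S


def between_class_scores(X, y):
    """Per-feature squared distance of class means from the global mean
    (an assumed form of the 'between-class distance' score)."""
    mu = X.mean(axis=0)
    s = np.zeros(X.shape[1])
    for c in np.unique(y):
        s += (X[y == c].mean(axis=0) - mu) ** 2
    return s


def grm_style_weights(A, s, lam=1.0, steps=500):
    """Minimize z^T A z - lam * s^T z over the simplex by projected gradient.

    A plays the role of the redundancy matrix and s the feature scores.
    """
    A = (A + A.T) / 2.0
    step_bound = 2.0 * np.linalg.norm(A, 2) + 1e-12  # gradient Lipschitz constant
    z = np.full(s.size, 1.0 / s.size)
    for _ in range(steps):
        grad = 2.0 * A @ z - lam * s
        z = project_to_simplex(z - grad / step_bound)
    return z
```

Under these assumptions, features would be ranked by the returned weights, for example `np.argsort(-grm_style_weights(weighted_within_scatter(X, y), between_class_scores(X, y)))`, with `X` standardized beforehand so that scatter and scores are on comparable scales.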
Details
- ISSN :
- 1573-7497 and 0924-669X
- Volume :
- 52
- Database :
- OpenAIRE
- Journal :
- Applied Intelligence
- Accession number :
- edsair.doi...........80d9277c14e64328ca67b22f26ede876
- Full Text :
- https://doi.org/10.1007/s10489-021-02855-9