1. Adaptive Data Structure Regularized Multiclass Discriminative Feature Selection
- Author
- Nannan Gu, Jie Hu, Dacheng Tao, Mingyu Fan, and Xiaoqin Zhang
- Subjects
Computer Networks and Communications, Computer science, business.industry, Dimensionality reduction, Pattern recognition, Feature selection, 02 engineering and technology, Data structure, Data type, Regularization (mathematics), Computer Science Applications, Discriminative model, Artificial Intelligence, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Pairwise comparison, Artificial intelligence, business, Cluster analysis, Software
- Abstract
Feature selection (FS), which aims to identify the most informative subset of input features, is an important approach to dimensionality reduction. In this article, a novel FS framework is proposed for both unsupervised and semisupervised scenarios. To make efficient use of the data distribution when evaluating features, the framework combines data structure learning (also referred to as data distribution modeling) and FS in a unified formulation such that the data structure learning improves the results of FS and vice versa. Moreover, two types of data structures, namely the soft and hard data structures, are learned and used in the proposed FS framework. The soft data structure refers to the pairwise weights among data samples, and the hard data structure refers to the estimated labels obtained from clustering or semisupervised classification. Both of these data structures are naturally formulated as regularization terms in the proposed framework. In the optimization process, the soft and hard data structures are learned from data represented by the selected features, and then the most informative features are reselected by referring to the data structures. In this way, the framework uses the interactions between data structure learning and FS to select the most discriminative and informative features. Following the proposed framework, a new semisupervised FS (SSFS) method is derived and studied in depth. Experiments on real-world data sets demonstrate the effectiveness of the proposed method.
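The alternating idea in the abstract (learn the soft and hard data structures from the currently selected features, then reselect features against those structures) can be illustrated with a small unsupervised sketch. This is not the authors' formulation: the Gaussian-kernel weights, the plain k-means, the heuristic score, and every function and parameter name below (`pairwise_weights`, `kmeans_labels`, `score_features`, `alternate_select`, `sigma`, `rounds`) are assumptions chosen only to make the loop concrete.

```python
import numpy as np

def pairwise_weights(Z, sigma=1.0):
    """Soft structure (assumed form): Gaussian-kernel weights between sample pairs."""
    sq = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-similarity
    return W

def kmeans_labels(Z, k, iters=20, seed=0):
    """Hard structure (assumed form): labels from a plain k-means."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(iters):
        d = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = Z[labels == c].mean(axis=0)
    return labels

def score_features(X, W, labels):
    """Heuristic score, higher is better: separates the estimated labels
    (hard structure) and varies smoothly over the weighted graph (soft structure)."""
    Xc = X - X.mean(axis=0)
    L = np.diag(W.sum(axis=1)) - W               # graph Laplacian of W
    smooth = (Xc * (L @ Xc)).sum(axis=0)         # x_j^T L x_j for each feature j
    total = (Xc ** 2).sum(axis=0) + 1e-12
    within = np.zeros(X.shape[1])
    for c in np.unique(labels):
        G = X[labels == c]
        within += ((G - G.mean(axis=0)) ** 2).sum(axis=0)
    between = total - within
    return between / (within + 1e-12) - smooth / total

def alternate_select(X, k, m, rounds=5):
    """Alternate between structure learning and feature reselection."""
    selected = np.arange(X.shape[1])             # start from all features
    for _ in range(rounds):
        Z = X[:, selected]                       # data as seen through current features
        W = pairwise_weights(Z)                  # learn soft structure
        labels = kmeans_labels(Z, k)             # learn hard structure
        scores = score_features(X, W, labels)    # re-evaluate every feature
        new = np.sort(np.argsort(scores)[::-1][:m])
        if np.array_equal(new, selected):        # selection has stabilized
            break
        selected = new
    return selected
```

Calling `alternate_select(X, k, m)` on an `(n_samples, n_features)` array returns the sorted indices of the `m` retained features. The paper instead expresses both structures as regularization terms in a single objective; the two-term score here is only a stand-in for that joint formulation.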
- Published
- 2022