CSCIM_FS: Cosine similarity coefficient and information measurement criterion-based feature selection method for high-dimensional data.
- Source :
- Neurocomputing. Oct2023, Vol. 552, pN.PAG-N.PAG. 1p.
- Publication Year :
- 2023
Abstract
- A cosine transform-based similarity calculation method is proposed to cope with the information loss caused by data discretization when mutual information is used for feature selection. Feature selection (FS) based on mutual information (MI) metrics requires the data to be discretized in preprocessing, which is a convenient way to identify correlation between features. However, discretization often loses information. To address this problem, this paper proposes an FS algorithm based on the cosine similarity coefficient and an information measurement criterion (CSCIM_FS). First, the MI between each feature and the class labels is calculated, and the features are ranked by their MI. Then, a feature matrix is constructed to transform each one-dimensional feature sequence into a two-dimensional square matrix. Next, a cosine transform is applied to obtain the high-frequency components of the feature matrix, and these are sampled to derive a hash fingerprint of the feature matrix. After that, the similarity between every pair of features is calculated from their hash fingerprints. Finally, a weight is calculated for each feature from the class labels, the MI, and the similarity between features, and the resulting key feature subset is used for feature selection. Experimental results on several UCI public datasets show that the CSCIM_FS algorithm selects a feature subset with high accuracy and performs better than MIM, CMIM, mRMR and other algorithms. [ABSTRACT FROM AUTHOR]
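The pipeline described in the abstract can be sketched roughly as follows. This is an illustrative reading only, not the authors' implementation: the abstract does not give the matrix size, the sampling rule, the similarity formula, or the weight formula, so the histogram MI estimate, the 8x8 matrix, the 4x4 high-frequency block, the median threshold, the Hamming-style similarity, and the weight `MI * (1 - redundancy)` are all assumptions, and every function name below is hypothetical.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Histogram-based MI estimate (nats) between feature x and class labels y."""
    edges = np.histogram_bin_edges(x, bins=bins)
    x_idx = np.digitize(x, edges[1:-1])            # bin index in 0..bins-1
    classes, y_idx = np.unique(y, return_inverse=True)
    joint = np.zeros((bins, classes.size))
    np.add.at(joint, (x_idx, y_idx), 1)            # joint count table
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] /= np.sqrt(2.0)
    return m

def fingerprint(x, side=8, block=4):
    """Boolean hash fingerprint of one feature column (sizes are assumptions)."""
    # Assumed: resample the column to side*side values to fill a square matrix.
    pos = np.linspace(0, x.size - 1, side * side)
    square = np.interp(pos, np.arange(x.size), x).reshape(side, side)
    d = dct_matrix(side)
    coeffs = d @ square @ d.T                      # 2-D cosine transform
    # Assumed: the high-frequency components are the bottom-right block.
    sampled = coeffs[-block:, -block:].ravel()
    return sampled > np.median(sampled)

def similarity(fp_a, fp_b):
    """Fraction of matching fingerprint bits (1.0 means identical)."""
    return float(np.mean(fp_a == fp_b))

def select_features(X, y, k, bins=8):
    """Greedy selection of k columns; the weight formula is an assumption."""
    n_features = X.shape[1]
    mi = np.array([mutual_information(X[:, j], y, bins) for j in range(n_features)])
    prints = [fingerprint(X[:, j]) for j in range(n_features)]
    selected = [int(np.argmax(mi))]                # start from the top-ranked feature
    while len(selected) < k:
        best, best_w = None, -np.inf
        for j in range(n_features):
            if j in selected:
                continue
            redundancy = max(similarity(prints[j], prints[s]) for s in selected)
            w = mi[j] * (1.0 - redundancy)         # relevance discounted by similarity
            if w > best_w:
                best, best_w = j, w
        selected.append(best)
    return selected

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = rng.integers(0, 2, size=200)
    informative = y + 0.1 * rng.standard_normal(200)
    X = np.column_stack([informative, informative, rng.standard_normal(200)])
    print(select_features(X, y, k=2))
```

In this sketch MI is still estimated on discretized values, while redundancy between features is measured on fingerprints computed from the undiscretized columns, which is one way to read the abstract's motivation. In practice, `scipy.fft.dctn` and scikit-learn's `mutual_info_classif` could replace the hand-written transform and MI estimate.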
- Subjects :
- *FEATURE selection
*INFORMATION measurement
*COSINE transforms
Details
- Language :
- English
- ISSN :
- 09252312
- Volume :
- 552
- Database :
- Academic Search Index
- Journal :
- Neurocomputing
- Publication Type :
- Academic Journal
- Accession number :
- 169921857
- Full Text :
- https://doi.org/10.1016/j.neucom.2023.126564