CSCIM_FS: Cosine similarity coefficient and information measurement criterion-based feature selection method for high-dimensional data.

Authors :
Yuan, Gaoteng
Zhai, Yi
Tang, Jiansong
Zhou, Xiaofeng
Source :
Neurocomputing. Oct2023, Vol. 552, pN.PAG-N.PAG. 1p.
Publication Year :
2023

Abstract

A cosine transform-based similarity calculation method is proposed to cope with the information loss that data discretization causes when mutual information is used for feature selection. Feature selection (FS) based on mutual information (MI) metrics requires the data to be discretized in preprocessing, which makes it convenient to identify correlation between features. However, discretization often loses information. To address this problem, this paper proposes an FS algorithm based on the cosine similarity coefficient and an information measurement criterion (CSCIM_FS). First, the MI between each feature and the labels (tags) is calculated, and the features are ranked by that MI. Then, a feature matrix is constructed by transforming each one-dimensional feature sequence into a two-dimensional square matrix. Next, a cosine transform is applied to obtain the high-frequency components of the feature matrix, and these are sampled to derive a hash fingerprint of the matrix. After that, the similarity between every pair of features is calculated from their hash fingerprints. Finally, a weight is calculated for each feature from the labels, the MI, and the similarity between features, and the resulting key feature subset is used to perform feature selection on the data. Experimental results on several UCI public datasets show that the CSCIM_FS algorithm selects a feature subset with high accuracy and performs better than MIM, CMIM, mRMR and other algorithms. [ABSTRACT FROM AUTHOR]
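The steps in the abstract can be sketched roughly as follows. This is an illustrative reading, not the authors' implementation: the abstract gives no formulas, so the equal-width binning, the zero-padding to a square matrix, the choice of the bottom-right coefficient block as the "high-frequency" sample, the median threshold, the Hamming-based similarity, and the weight `mi * (1 - redundancy)` are all assumptions, and every function name here is hypothetical.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Empirical MI (nats) between feature x (equal-width bins) and labels y."""
    edges = np.histogram_bin_edges(x, bins=bins)
    x_disc = np.digitize(x, edges[1:-1])            # bin indices 0..bins-1
    classes, y_idx = np.unique(y, return_inverse=True)
    joint = np.zeros((bins, classes.size))
    np.add.at(joint, (x_disc, y_idx), 1.0)          # contingency counts
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] /= np.sqrt(2.0)
    return m

def fingerprint(feature, block=4):
    """Binary hash of a feature: square matrix -> 2-D DCT -> sampled block."""
    side = int(np.ceil(np.sqrt(feature.size)))
    padded = np.zeros(side * side)                  # assumption: zero-pad
    padded[:feature.size] = feature
    d = dct_matrix(side)
    coeffs = d @ padded.reshape(side, side) @ d.T   # 2-D cosine transform
    sub = coeffs[-block:, -block:]                  # assumed high-frequency sample
    return (sub > np.median(sub)).ravel()

def similarity(fp_a, fp_b):
    """Fraction of matching fingerprint bits (1 - normalized Hamming distance)."""
    return float(np.mean(fp_a == fp_b))

def select_features(X, y, k):
    """Greedy selection with an assumed weight: MI * (1 - mean similarity)."""
    n_features = X.shape[1]
    mi = np.array([mutual_information(X[:, j], y) for j in range(n_features)])
    fps = [fingerprint(X[:, j]) for j in range(n_features)]
    selected = [int(np.argmax(mi))]                 # start from the top-ranked feature
    while len(selected) < min(k, n_features):
        best, best_score = None, -np.inf
        for j in range(n_features):
            if j in selected:
                continue
            redundancy = np.mean([similarity(fps[j], fps[s]) for s in selected])
            score = mi[j] * (1.0 - redundancy)      # hypothetical weighting
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected
```

Calling `select_features(X, y, k)` on an `(n_samples, n_features)` array returns `k` column indices. Because the fingerprints are computed from the raw feature values, the redundancy term does not depend on the discretization, which is the point the abstract makes about avoiding information loss.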

Details

Language :
English
ISSN :
09252312
Volume :
552
Database :
Academic Search Index
Journal :
Neurocomputing
Publication Type :
Academic Journal
Accession number :
169921857
Full Text :
https://doi.org/10.1016/j.neucom.2023.126564