
MUSE: Minimum Uncertainty and Sample Elimination Based Binary Feature Selection.

Authors :
Zhang, Zisheng
Parhi, Keshab K.
Source :
IEEE Transactions on Knowledge & Data Engineering. Sep 2019, Vol. 31 Issue 9, p1750-1764. 15p.
Publication Year :
2019

Abstract

This paper presents a novel incremental feature selection method based on minimum uncertainty and feature sample elimination (referred to as MUSE). Feature selection is an important step in machine learning. Past incremental feature selection approaches have attempted to increase class relevance while simultaneously minimizing redundancy with previously selected features. One example of such an approach is the minimum Redundancy Maximum Relevance (mRMR) feature selection method. The proposed approach differs from the prior mRMR approach in how the redundancy of the current feature with previously selected features is reduced. In the proposed approach, the feature samples are divided into a pre-specified number of bins; this step is referred to as feature quantization. A novel uncertainty score for each feature is computed by summing the conditional entropies of the bins, and the feature with the lowest uncertainty score is selected. For each bin, its impurity is computed by taking the minimum of the probability of Class 1 and of Class 2. The feature samples corresponding to the bins with impurities below a threshold are discarded and are not used for selection of the subsequent features. The significance of the MUSE feature selection method is demonstrated using two datasets, arrhythmia and handwritten digit recognition (Gisette), as well as seizure prediction datasets from five dogs and two humans. It is shown that the proposed method outperforms the prior mRMR feature selection method in most cases. For the arrhythmia dataset, the proposed method achieves 30 percent higher sensitivity at the expense of a 7 percent loss of specificity. For the Gisette dataset, the proposed method achieves 15 percent higher accuracy for Class 2, at the expense of 3 percent lower accuracy for Class 1. For seizure prediction among the five dogs and two humans, the proposed method achieves a higher area under the curve (AUC) for all subjects. [ABSTRACT FROM AUTHOR]
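
The selection loop described in the abstract (quantize, score, select, eliminate) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made here because the abstract does not specify them: equal-width binning, weighting each bin's entropy by the fraction of samples that fall in it, class labels encoded as 0 and 1 (standing in for Class 1 and Class 2), and the default values of n_bins and threshold. All function names (quantize, bin_counts, uncertainty, muse_select) are hypothetical.

```python
import math

def quantize(values, n_bins):
    """Map each value to an equal-width bin index (assumed binning scheme)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant features
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def bin_counts(bins, labels, n_bins):
    """Return, for each bin, [count of label 0, count of label 1]."""
    counts = [[0, 0] for _ in range(n_bins)]
    for b, y in zip(bins, labels):
        counts[b][y] += 1
    return counts

def uncertainty(counts):
    """Sum over bins of the class entropy within the bin, weighted by
    the bin's share of samples (the weighting is an assumption)."""
    total = sum(c0 + c1 for c0, c1 in counts)
    score = 0.0
    for c0, c1 in counts:
        n = c0 + c1
        for c in (c0, c1):
            if c:
                score -= (n / total) * (c / n) * math.log2(c / n)
    return score

def muse_select(X, y, n_select, n_bins=4, threshold=0.1):
    """Greedy selection: pick the lowest-uncertainty feature, then drop
    samples that fall in bins whose impurity is below the threshold."""
    active = list(range(len(y)))        # indices of samples still in play
    remaining = list(range(len(X[0])))  # indices of unselected features
    selected = []
    while remaining and active and len(selected) < n_select:
        labels = [y[i] for i in active]
        best = None
        for j in remaining:
            bins = quantize([X[i][j] for i in active], n_bins)
            counts = bin_counts(bins, labels, n_bins)
            score = uncertainty(counts)
            if best is None or score < best[0]:
                best = (score, j, bins, counts)
        _, j, bins, counts = best
        selected.append(j)
        remaining.remove(j)
        # Impurity of a bin = min(P(label 0 | bin), P(label 1 | bin)).
        # Samples in nearly pure bins are removed before the next round.
        active = [
            i for i, b in zip(active, bins)
            if min(counts[b]) / sum(counts[b]) >= threshold
        ]
    return selected

# Toy data: 8 samples, 3 features; feature 2 is constant and uninformative.
X = [
    [0.0, 0.0, 0.5], [0.0, 1.0, 0.5], [1.0, 0.0, 0.5], [1.0, 0.0, 0.5],
    [1.0, 1.0, 0.5], [1.0, 1.0, 0.5], [1.0, 1.0, 0.5], [1.0, 0.0, 0.5],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]
print(muse_select(X, y, n_select=2, n_bins=2))
```

In this sketch redundancy is handled only through sample elimination: once a selected feature has placed some samples in nearly pure bins, those samples no longer influence the scores of later features, so a later feature is rewarded for separating what is still ambiguous.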

Details

Language :
English
ISSN :
10414347
Volume :
31
Issue :
9
Database :
Academic Search Index
Journal :
IEEE Transactions on Knowledge & Data Engineering
Publication Type :
Academic Journal
Accession number :
137987893
Full Text :
https://doi.org/10.1109/TKDE.2018.2865778