Deep-FS: A feature selection algorithm for Deep Boltzmann Machines
- Source :
- Neurocomputing. 322:22-37
- Publication Year :
- 2018
- Publisher :
- Elsevier BV, 2018.
Abstract
- A Deep Boltzmann Machine is a Deep Neural Network model formed from multiple layers of neurons with nonlinear activation functions. Its structure enables it to learn very complex relationships between features and to learn high-level feature representations more effectively than conventional Artificial Neural Networks. Feature selection at the input level of Deep Neural Networks has not been well studied, despite its importance in reducing the number of input features processed by the deep learning model, which in turn facilitates understanding of the data. This paper proposes a novel algorithm, Deep Feature Selection (Deep-FS), which removes irrelevant features from large datasets in order to reduce the number of inputs that are modelled during the learning process. The proposed Deep-FS algorithm utilizes a Deep Boltzmann Machine and uses knowledge acquired during training to remove features at the beginning of the learning process. Reducing inputs is important because it prevents the network from learning associations between irrelevant features, which negatively affect the network's acquired knowledge of the overall distribution of the data. The Deep-FS method embeds feature selection in a Restricted Boltzmann Machine, which is used for training a Deep Boltzmann Machine. The generative property of the Restricted Boltzmann Machine is used to reconstruct eliminated features and to calculate reconstruction errors, in order to evaluate the impact of eliminating features. The performance of the proposed approach was evaluated with experiments conducted using the MNIST, MIR-Flickr, GISETTE, MADELON and PANCAN datasets. On the MIR-Flickr dataset, Deep-FS removed 775 input features without any reduction in performance.
On the MNIST dataset, Deep-FS reduced the number of input features by more than 45%, reduced the network error from 0.97% to 0.90%, and reduced processing and classification time by more than 5.5%. Additionally, Deep-FS returned higher accuracy than classical feature selection methods. The experimental results on GISETTE, MADELON and PANCAN showed that Deep-FS reduced the number of input features by 81%, 57% and 77%, respectively. Moreover, the proposed feature selection method reduced the classifier training time by 82%, 70% and 85% on the GISETTE, MADELON and PANCAN datasets, respectively. Experiments with various datasets, comprising a large number of features and samples, revealed that the proposed Deep-FS algorithm overcomes the main limitations of classical feature selection algorithms. More specifically, most classical methods require the number of features to retain to be specified in advance, whereas Deep-FS identifies this number automatically. Deep-FS also performs the feature selection task faster than classical feature selection algorithms, which makes it suitable for deep learning tasks. In addition, Deep-FS is suitable for finding features in large datasets, which are normally stored in data batches for faster and more efficient processing.
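The abstract does not give the selection rule itself, so the following is only a rough sketch of the general idea it describes: train a Restricted Boltzmann Machine, eliminate one feature at a time, reconstruct the input, and compare reconstruction errors. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names (`train_rbm`, `reconstruct`, `select_features`), the use of one-step contrastive divergence, zeroing a feature to "eliminate" it, and the tolerance `tol`.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def train_rbm(data, n_hidden=8, epochs=50, lr=0.1, seed=0):
    """Train a small binary RBM with one-step contrastive divergence (CD-1)."""
    rng = np.random.default_rng(seed)
    n_samples, n_visible = data.shape
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_v = np.zeros(n_visible)
    b_h = np.zeros(n_hidden)
    for _ in range(epochs):
        # Positive phase: hidden probabilities given the data.
        h_prob = sigmoid(data @ W + b_h)
        h_sample = (rng.random(h_prob.shape) < h_prob).astype(float)
        # Negative phase: one reconstruction step.
        v_recon = sigmoid(h_sample @ W.T + b_v)
        h_recon = sigmoid(v_recon @ W + b_h)
        W += lr * (data.T @ h_prob - v_recon.T @ h_recon) / n_samples
        b_v += lr * (data - v_recon).mean(axis=0)
        b_h += lr * (h_prob - h_recon).mean(axis=0)
    return W, b_v, b_h


def reconstruct(v, W, b_v, b_h):
    """Deterministic up-down pass through the RBM."""
    return sigmoid(sigmoid(v @ W + b_h) @ W.T + b_v)


def select_features(data, W, b_v, b_h, tol=0.0):
    """Return a boolean mask of features to keep.

    Assumed criterion (illustrative only): a feature is dropped when zeroing
    it does not raise the mean squared reconstruction error above the
    baseline by more than `tol`.
    """
    base_err = np.mean((data - reconstruct(data, W, b_v, b_h)) ** 2)
    keep = np.ones(data.shape[1], dtype=bool)
    for j in range(data.shape[1]):
        masked = data.copy()
        masked[:, j] = 0.0  # "eliminate" feature j
        err = np.mean((data - reconstruct(masked, W, b_v, b_h)) ** 2)
        if err <= base_err + tol:
            keep[j] = False
    return keep


# Toy usage on random binary data (hypothetical, not one of the paper's datasets).
rng = np.random.default_rng(42)
X = (rng.random((200, 6)) < 0.5).astype(float)
W, b_v, b_h = train_rbm(X)
print(select_features(X, W, b_v, b_h))
```

Note that in this sketch the number of retained features falls out of the error comparison rather than being fixed in advance, which mirrors the property the abstract highlights; the threshold `tol` is the only tuning knob assumed here.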
- Subjects :
- 0301 basic medicine
Restricted Boltzmann machine
Artificial neural network
business.industry
Computer science
Cognitive Neuroscience
Deep learning
Boltzmann machine
Feature selection
02 engineering and technology
Computer Science Applications
03 medical and health sciences
symbols.namesake
Nonlinear system
030104 developmental biology
Artificial Intelligence
Boltzmann constant
0202 electrical engineering, electronic engineering, information engineering
symbols
020201 artificial intelligence & image processing
Artificial intelligence
business
Algorithm
Classifier (UML)
MNIST database
Details
- ISSN :
- 0925-2312
- Volume :
- 322
- Database :
- OpenAIRE
- Journal :
- Neurocomputing
- Accession number :
- edsair.doi.dedup.....ecb3220551a9b0ccace7af599261047f