1. A unified pipeline for online feature selection and classification.
- Author
- Bolon-Canedo, Veronica; Fernández-Francos, Diego; Peteiro-Barral, Diego; Alonso-Betanzos, Amparo; Guijarro-Berdiñas, Bertha; Sánchez-Maroño, Noelia
- Subjects
- *CATEGORIES (Mathematics), *BIG data, *DATA mining, *ALGORITHMS, *MATHEMATICAL analysis
- Abstract
- With the advent of Big Data, data is being collected at an unprecedented fast pace, and it needs to be processed in a short time. To deal with data streams that flow continuously, classical batch learning algorithms cannot be applied and it is necessary to employ online approaches. Online learning consists of continuously revising and refining a model by incorporating new data as they arrive, and it allows important problems such as concept drift or management of extremely high-dimensional datasets to be solved. In this paper, we present a unified pipeline for online learning which covers online discretization, feature selection and classification. Three classical methods (the k-means discretizer, the χ² filter and a one-layer artificial neural network) have been reimplemented to be able to tackle online data, showing promising results on both synthetic and real datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2016