On the Mining of the Minimal Set of Time Series Data Shapelets
- Author
- Soukaina Filali Boubrahimi, Ruizhe Ma, Rafal A. Angryk, and Shah Muhammad Hamdi
- Subjects
- Artificial neural network, Computer science, Big data, Machine learning, Convolutional neural network, Set (abstract data type), Discriminative model, Classifier, Pruning (decision trees), Artificial intelligence, Time series
- Abstract
Shapelets, also known as motifs, are time series subsequences that discriminate between time series classes. Lately, shapelet studies have gained momentum due to their interpretable nature: as opposed to traditional time series classifiers, shapelet-based learners provide a visual representation of the pattern that triggers the classification decision. One of the most challenging issues of shapelet-based classifiers is that they generate a large number of shapelets. To the best of our knowledge, this is the first effort to address the high numerosity of mined shapelets by mining the minimal set of discriminative shapelets for time series data. We propose a new shapelet mining learner, 1DCNN, that learns shapelets of different lengths using a black-box neural network model. 1DCNN optimizes the entire classification schema by learning the shapes of the representative patterns. Our proposed model uses network pruning to sparsify the network and keep only the most discriminative shapelets without compromising classification accuracy. We validated our model using 59 real-world time series datasets from the UCR repository. Our experimental results show the effectiveness and efficiency of our approach in comparison with competing baselines. For fairness, we did not compare 1DCNN with ensemble-based approaches that encapsulate many learners. Our results show that our model outperforms all other baselines in the shapelet-based classifier category, with up to 95% fewer floating-point operations (FLOPs) required by the network.
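The abstract rests on two ingredients: the classic shapelet distance (the minimum Euclidean distance between a candidate shapelet and any equal-length subsequence of a series) and pruning to retain only a small set of discriminative filters. The NumPy sketch below illustrates both ideas under stated assumptions; it is not the authors' 1DCNN implementation, and the function names, the magnitude-based pruning criterion, and the `keep_ratio` parameter are illustrative choices.

```python
import numpy as np

def shapelet_distance(series, shapelet):
    """Minimum Euclidean distance between a shapelet and any
    equal-length subsequence of the series (the standard
    shapelet-to-series distance)."""
    m = len(shapelet)
    # All sliding windows of length m, as rows of a 2-D view.
    windows = np.lib.stride_tricks.sliding_window_view(series, m)
    return np.min(np.linalg.norm(windows - shapelet, axis=1))

def prune_filters(filters, keep_ratio=0.05):
    """Illustrative magnitude-based pruning: keep only the filters
    (candidate shapelets) with the largest L2 norms. This mirrors
    the idea of sparsifying the network down to a minimal set of
    discriminative shapelets; the actual criterion in the paper
    may differ."""
    norms = np.linalg.norm(filters, axis=1)
    k = max(1, int(len(filters) * keep_ratio))
    keep = np.argsort(norms)[-k:]          # indices of the k largest norms
    return filters[keep]

# A series containing the bump [1, 2, 1] matches that shapelet exactly,
# so its shapelet distance is zero.
series = np.array([0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0])
bump = np.array([1.0, 2.0, 1.0])
d = shapelet_distance(series, bump)

# Pruning 100 candidate filters at keep_ratio=0.05 retains 5 of them.
rng = np.random.RandomState(0)
candidates = rng.randn(100, 8)
kept = prune_filters(candidates, keep_ratio=0.05)
```

A distance of zero signals a perfect occurrence of the pattern; at classification time, such per-shapelet minimum distances act as interpretable features, which is why pruning the filter bank directly shrinks the explanation the model offers.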
- Published
- 2020