1. DLCSS: A new similarity measure for time series data mining.
- Author
- Soleimani, Gholamreza and Abessi, Masoud
- Subjects
- *DATA mining, *SCIENTIFIC computing, *DATABASES, *TIME series analysis, *NEAREST neighbor analysis (Statistics)
- Abstract
The Longest Common Subsequence (LCSS) is a classic problem in computer science. In most studies on time series data mining, LCSS has been cited as the best and most widely usable similarity measure. The results of time series data mining under LCSS depend strongly on the similarity threshold, because LCSS measures similarity with a zero-one approach. Since no prior knowledge about the data is available and the right value of the similarity threshold is very difficult to determine, using LCSS can lead to poor results. In this research, a new similarity measure based on LCSS, named Developed Longest Common Subsequence (DLCSS), is proposed for time series data mining. DLCSS eliminates this shortcoming of LCSS by defining two similarity thresholds and determining their values. The performance of DLCSS was compared with that of LCSS and Dynamic Time Warping (DTW) using 1-nearest neighbor (1-NN) classification and k-medoids clustering. The evaluation was carried out on 63 time series datasets from the UCR collection. Based on the results, it can be claimed that 1-NN accuracy and clustering accuracy under DLCSS are better than under LCSS and DTW with at least 99.5% and 99% confidence, respectively. DLCSS also predicts the correct number of clusters more effectively than LCSS and DTW. In addition, DLCSS determines better cluster representatives than LCSS and DTW with at least 99.95% confidence. [ABSTRACT FROM AUTHOR]
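To make the threshold dependence described in the abstract concrete, the following is a minimal sketch of the classic single-threshold LCSS similarity for numeric series. It is not the DLCSS method, whose two-threshold definition is not given in this record. The function names, the parameter name `eps`, and the normalization by the shorter series length are assumptions chosen for illustration.

```python
def lcss_length(a, b, eps):
    """Length of the longest common subsequence of two numeric series.

    Two points match (the zero-one decision) when they differ by at most
    eps. `eps` is a hypothetical name for the similarity threshold.
    """
    n, m = len(a), len(b)
    # table[i][j] holds the LCSS length of the prefixes a[:i] and b[:j].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if abs(a[i - 1] - b[j - 1]) <= eps:
                # Match: extend the common subsequence by one point.
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                # No match: carry over the best result from either prefix.
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m]


def lcss_similarity(a, b, eps):
    """LCSS length normalized by the shorter series (an assumed convention)."""
    if not a or not b:
        raise ValueError("both series must be non-empty")
    return lcss_length(a, b, eps) / min(len(a), len(b))


# Two series with the same shape, offset by 0.2.
a = [1.0, 2.0, 3.0, 4.0]
b = [1.2, 2.2, 3.2, 4.2]

print(lcss_similarity(a, b, eps=0.1))  # threshold below the offset
print(lcss_similarity(a, b, eps=0.3))  # threshold above the offset
```

In this toy example the similarity jumps from no matches to a full match as `eps` crosses the offset between the two series, which is the kind of sensitivity to the threshold that the abstract attributes to the zero-one approach.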
- Published
- 2020