A MapReduce solution for incremental mining of sequential patterns from big data.

Authors :
Saleti, Sumalatha
Subramanyam, R.B.V.
Source :
Expert Systems with Applications. Nov 2019, Vol. 133, p109-125. 17p.
Publication Year :
2019

Abstract

• A two-phase MapReduce algorithm is proposed for incremental mining of sequential patterns.
• Backward mining makes use of the knowledge obtained during the previous mining process.
• The Co-occurrence Reverse Map data structure efficiently generates the candidate sequences.
• Candidate generation rules avoid the generation of too many false candidates.
• Three novel early prune properties are introduced based on the study of item co-occurrences.

Sequential Pattern Mining (SPM) is a popular data mining task with broad applications. With the advent of big data, traditional SPM algorithms are not scalable. Hence, many researchers have migrated to big data frameworks such as MapReduce and proposed distributed algorithms. However, the existing MapReduce algorithms assume that the data are static and do not handle incremental database updates. Moreover, they re-mine the updated database whenever new sequences are inserted. In this paper, we propose an efficient distributed algorithm for incremental sequential pattern mining (MR-INCSPM) using the MapReduce framework that can handle big data. The proposed algorithm incorporates the backward mining approach, which efficiently makes use of the knowledge obtained during the previous mining process. Also, based on the study of item co-occurrences, we propose the Co-occurrence Reverse Map (CRMAP) data structure. The issue of combinatorial explosion of candidate sequences is dealt with using the proposed CRMAP data structure. In addition, novel candidate generation and early prune mechanisms are designed using CRMAP to speed up the mining process. The proposed algorithm is evaluated on both real and synthetic datasets. The experimental results prove the efficacy of MR-INCSPM with respect to processing time, memory and pruning efficiency. [ABSTRACT FROM AUTHOR]
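
The contrast the abstract draws between re-mining the updated database and reusing earlier results can be illustrated with a small, generic sketch. This is not MR-INCSPM, and it does not model CRMAP, backward mining, or the pruning properties, none of which are specified in this record. It only assumes that a sequence is an ordered list of itemsets, that support is counted once per sequence, and that updates consist of newly inserted sequences; all function names are hypothetical. Under those assumptions, single-item supports from the previous run can be merged with counts computed over the inserted sequences alone.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple

# Assumption: a sequence is an ordered list of itemsets (sets of item labels).
Sequence = List[Set[str]]


def map_items(sequence: Sequence) -> List[Tuple[str, int]]:
    """Map step (hypothetical): emit (item, 1) once per distinct item in a sequence."""
    seen: Set[str] = set()
    for itemset in sequence:
        seen |= itemset
    return [(item, 1) for item in sorted(seen)]


def reduce_counts(pairs: Iterable[Tuple[str, int]]) -> Counter:
    """Reduce step (hypothetical): sum the emitted counts per item."""
    counts: Counter = Counter()
    for item, n in pairs:
        counts[item] += n
    return counts


def mine_support(db: Iterable[Sequence]) -> Counter:
    """Run the map and reduce steps over a whole database, in-process."""
    return reduce_counts(pair for seq in db for pair in map_items(seq))


def incremental_update(old_counts: Counter, new_sequences: Iterable[Sequence]) -> Counter:
    """Merge stored supports with counts from the inserted sequences only."""
    return old_counts + mine_support(new_sequences)


def frequent(counts: Counter, min_support: int) -> Set[str]:
    """Items whose support meets the absolute threshold."""
    return {item for item, n in counts.items() if n >= min_support}


original = [[{"a"}, {"b"}], [{"a", "c"}], [{"b"}, {"a"}]]
increment = [[{"c"}, {"a"}], [{"c"}]]

stored = mine_support(original)                   # knowledge from the previous run
updated = incremental_update(stored, increment)   # touches only the new sequences
```

In this toy setting the merged counts equal those obtained by mining the full updated database from scratch, while only the inserted sequences are scanned. Longer patterns need more than count merging, which is the part the paper's own data structures and pruning rules address.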

Details

Language :
English
ISSN :
0957-4174
Volume :
133
Database :
Academic Search Index
Journal :
Expert Systems with Applications
Publication Type :
Academic Journal
Accession number :
136911830
Full Text :
https://doi.org/10.1016/j.eswa.2019.05.013