
Effective database transformation and efficient support computation for mining sequential patterns

Authors :
Arbee L. P. Chen
Yi-Hung Wu
Chung-Wen Cho
Source :
Journal of Intelligent Information Systems. 32:23-51
Publication Year :
2007
Publisher :
Springer Science and Business Media LLC, 2007.

Abstract

In this paper, we propose a novel algorithm for mining frequent sequences from transaction databases. The transactions of each customer form a customer sequence. A sequence (an ordered list of itemsets) is frequent if the number of customer sequences containing it satisfies the user-specified threshold. The 1-sequence is a special type of sequence because it consists of only a single itemset instead of an ordered list, while the k-sequence is a sequence composed of k itemsets. Compared with the cost of mining frequent k-sequences (k ≥ 2), the cost of mining frequent 1-sequences is negligible. We adopt a two-phase architecture that finds the two types of frequent sequences separately, so that the discovery of frequent k-sequences can be designed and optimized on its own. For efficient frequent k-sequence mining, every frequent 1-sequence is encoded as a unique symbol and the database is transformed into one made up of these symbols. We find that it is unnecessary to encode all the frequent 1-sequences, and we make full use of the discovered frequent 1-sequences to transform the database into a smaller one. For every k ≥ 2, the customer sequences in the transformed database are scanned to find all the frequent k-sequences. We devise a compact representation for a customer sequence and describe a method for enumerating all distinct subsequences of a customer sequence without redundant scans. The soundness of the proposed approach is verified and a number of experiments are performed. The results show that our approach outperforms previous work in both scalability and execution time.
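The definitions in the abstract (sequence containment, support, and symbol encoding of frequent 1-sequences) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: it encodes every frequent 1-sequence, whereas the paper states that encoding all of them is unnecessary, and it enumerates itemsets by brute force. The function names (`contains`, `support`, `frequent_1_sequences`, `transform`), the integer symbols, and the toy database are all assumptions made for illustration.

```python
from itertools import combinations


def contains(customer_seq, pattern):
    """Return True if `pattern` (a list of itemsets) is contained in
    `customer_seq`: each pattern itemset must be a subset of a distinct
    transaction, and the matched transactions must appear in order."""
    pos = 0
    for itemset in pattern:
        # Greedily take the earliest transaction that covers this itemset.
        while pos < len(customer_seq) and not itemset <= customer_seq[pos]:
            pos += 1
        if pos == len(customer_seq):
            return False
        pos += 1
    return True


def support(database, pattern):
    """Number of customer sequences that contain `pattern`."""
    return sum(1 for seq in database if contains(seq, pattern))


def frequent_1_sequences(database, min_support):
    """Brute-force phase 1 (illustrative only): itemsets that occur inside
    some transaction of at least `min_support` customers."""
    counts = {}
    for seq in database:
        seen = set()  # count each itemset at most once per customer
        for transaction in seq:
            items = sorted(transaction)
            for size in range(1, len(items) + 1):
                for combo in combinations(items, size):
                    seen.add(frozenset(combo))
        for itemset in seen:
            counts[itemset] = counts.get(itemset, 0) + 1
    return {s for s, c in counts.items() if c >= min_support}


def transform(database, frequent):
    """Encode each frequent 1-sequence as an integer symbol and rewrite each
    transaction as the set of symbols it contains. Transactions and customers
    left with no symbols are dropped, which shrinks the database."""
    ordered = sorted(frequent, key=lambda s: (len(s), sorted(s)))
    symbol = {itemset: i for i, itemset in enumerate(ordered)}
    transformed = []
    for seq in database:
        new_seq = []
        for transaction in seq:
            symbols = frozenset(symbol[s] for s in frequent if s <= transaction)
            if symbols:
                new_seq.append(symbols)
        if new_seq:
            transformed.append(new_seq)
    return symbol, transformed


# Hypothetical toy database: three customers, each a list of transactions.
db = [
    [{"a"}, {"b", "c"}],
    [{"a", "b"}, {"c"}],
    [{"a"}, {"c"}],
]
frequent = frequent_1_sequences(db, min_support=2)
symbol, encoded_db = transform(db, frequent)
```

With a threshold of 2 on this toy database, only the single-item itemsets are frequent, so `{"b", "c"}` is rewritten as a set of two symbols. The 2-sequence consisting of `{"a"}` followed by `{"c"}` is contained in all three customer sequences, while `{"a"}` followed by `{"b"}` is contained only in the first.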

Details

ISSN :
1573-7675 and 0925-9902
Volume :
32
Database :
OpenAIRE
Journal :
Journal of Intelligent Information Systems
Accession number :
edsair.doi...........9d6fb0105b49a6e570eb4b95bd0a47df
Full Text :
https://doi.org/10.1007/s10844-007-0047-y