Efficient Pattern-Based Aggregation on Sequence Data.

Authors :
He, Zhian
Wong, Petrie
Kao, Ben
Lo, Eric
Cheng, Reynold
Feng, Ziqiang
Source :
IEEE Transactions on Knowledge & Data Engineering; Feb2017, Vol. 29 Issue 2, p286-299, 14p
Publication Year :
2017

Abstract

A Sequence OLAP (S-OLAP) system provides a platform on which pattern-based aggregate (PBA) queries on a sequence database are evaluated. In its simplest form, a PBA query consists of a pattern template $T$ and an aggregate function $F$. A pattern template is a sequence of variables, each defined over a domain. Each variable is instantiated with all possible values in its corresponding domain to derive all possible patterns of the template. Sequences are grouped based on the patterns they possess. The answer to a PBA query is a sequence cuboid (s-cuboid), which is a multidimensional array of cells. Each cell is associated with a pattern instantiated from the query's pattern template. The value of each s-cuboid cell is obtained by applying the aggregate function $F$ to the set of data sequences that belong to that cell. Since a pattern template can involve many variables and can be arbitrarily long, the induced s-cuboid for a PBA query can be huge. For most analytical tasks, however, only iceberg cells with very large aggregate values are of interest. This paper proposes an efficient approach to identifying and evaluating iceberg cells of s-cuboids. Experimental results show that our algorithms are orders of magnitude faster than existing approaches. [ABSTRACT FROM AUTHOR]
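
Illustration (not part of the published abstract): the query model described above can be sketched as a brute-force baseline in Python. This is not the paper's algorithm; it simply enumerates every cell, which is the cost the paper aims to avoid. All names here (instantiate, contains, s_cuboid, iceberg_cells) are hypothetical, and two things are assumed rather than taken from the abstract: a sequence "possesses" a pattern when the pattern occurs as a contiguous run of events, and an iceberg cell is one whose aggregate value meets a user-chosen threshold.

```python
from itertools import product


def instantiate(template, domains):
    """Yield every pattern of the template.

    A variable that appears more than once in the template is bound
    to the same value at each position.
    """
    variables = sorted(set(template))
    for values in product(*(domains[v] for v in variables)):
        binding = dict(zip(variables, values))
        yield tuple(binding[v] for v in template)


def contains(sequence, pattern):
    """Assumed matching rule: the pattern occurs as a contiguous run."""
    n = len(pattern)
    return any(
        tuple(sequence[i:i + n]) == pattern
        for i in range(len(sequence) - n + 1)
    )


def s_cuboid(sequences, template, domains, aggregate=len):
    """Map each instantiated pattern (cell) to F applied to its sequences.

    The default aggregate is COUNT (len of the matching group).
    """
    cells = {}
    for pattern in instantiate(template, domains):
        members = [s for s in sequences if contains(s, pattern)]
        cells[pattern] = aggregate(members)
    return cells


def iceberg_cells(cuboid, threshold):
    """Keep only cells whose aggregate value reaches the threshold."""
    return {p: v for p, v in cuboid.items() if v >= threshold}


# Toy data: three event sequences over the alphabet {a, b}.
sequences = [["a", "b", "a"], ["a", "b", "b"], ["b", "a", "b", "a"]]
domains = {"X": ["a", "b"], "Y": ["a", "b"]}

cuboid = s_cuboid(sequences, ("X", "Y"), domains)
frequent = iceberg_cells(cuboid, threshold=2)
```

The number of cells grows as the product of the domain sizes of the distinct variables, so this enumeration becomes impractical for long templates or large domains, which is the motivation the abstract gives for targeting iceberg cells directly.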

Details

Language :
English
ISSN :
10414347
Volume :
29
Issue :
2
Database :
Complementary Index
Journal :
IEEE Transactions on Knowledge & Data Engineering
Publication Type :
Academic Journal
Accession number :
120763969
Full Text :
https://doi.org/10.1109/TKDE.2016.2618856