1. Partitioned parallel association rule mining algorithm based on improved storage (基于存储改进的分区并行关联规则挖掘算法).
- Author
- 王永贵, 谢南, and 曲海成
- Subjects
- *ASSOCIATION rule mining, *BIG data, *SCALABILITY, *SPEED
- Abstract
- To further speed up frequent itemset mining in association rule mining and improve the algorithm's execution performance, this paper proposes an association rule mining algorithm based on an improved memory structure. Built on the Spark distributed framework, the algorithm mines frequent itemsets in parallel. It uses a Bloom filter to store items during mining and simplifies operations on the transaction set and the candidate set, which speeds up frequent itemset mining and saves computing resources. Compared with the YAFIM and MR-Apriori algorithms, the proposed algorithm mines frequent itemsets significantly more efficiently while occupying less memory. It not only improves mining speed and reduces memory pressure but also scales well, so it can be applied to larger data sets and clusters. [ABSTRACT FROM AUTHOR]
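The abstract's central idea, keeping items in a Bloom filter so that transactions can be checked against them cheaply, can be illustrated with a minimal single-machine sketch. This is not the authors' implementation: the Spark parallelism is omitted, and the names `BloomFilter`, `frequent_items`, and `prune_transactions`, as well as the bit-array size and hash count, are assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter over strings (hypothetical sizes, for illustration only)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # May return a false positive, never a false negative.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def frequent_items(transactions, min_support):
    """Return the items that occur in at least min_support transactions."""
    counts = {}
    for transaction in transactions:
        for item in set(transaction):
            counts[item] = counts.get(item, 0) + 1
    return {item for item, count in counts.items() if count >= min_support}


def prune_transactions(transactions, bloom):
    """Drop items the filter has not seen, shrinking each transaction."""
    return [[item for item in t if item in bloom] for t in transactions]
```

In this sketch the filter would be filled with the frequent 1-items and then used to shrink every transaction before larger candidates are counted. Because a Bloom filter can report false positives, an infrequent item may occasionally survive pruning, so exact support counts are still needed afterwards.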
- Published
- 2020