Optimizing data transmission and access of the incremental clustering algorithm using CUDA: A case study.

Authors :: Chen, Chunlei
Wang, Chengduan
Hou, Jinkui
Zhang, Peng
Zhang, Yonghui
Wang, Lei
Dai, Jiangyan
Source :: Journal of Computational Methods in Sciences & Engineering. 2018, Vol. 18 Issue 4, p989-1005. 17p.
Publication Year :: 2018
Abstract: Incremental clustering algorithms are widely applicable to real-time streaming data processing and massive data analysis. Such algorithms must load data continuously, so data transmission and access can incur non-negligible time overhead. We have also proposed two algorithms that exploit high data parallelism for incremental clustering on CUDA-enabled GPGPUs: the Top-down (TD) algorithm and the Moderate-granularity (MG) algorithm. In this paper, we adopt the TD and MG algorithms as a case study in optimizing data transmission and access on CUDA. First, we reinterpret the two algorithms from the viewpoint of overlapping read/write and computing operations at the CUDA warp level. Second, we adjust the flow of the TD and MG algorithms to enhance data locality, so that shared memory can be fully utilized. Third, we reorder input data points to raise the data rate of global memory through coalesced memory access. Fourth, we hide part of the data transmission latency by running multiple CUDA streams. Experimental results validate the efficiency of our optimizations. [ABSTRACT FROM AUTHOR]
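
Illustrative note (not from the paper): the abstract does not say how the input points are reordered, so the following is only a minimal host-side sketch of one common way to prepare data for coalesced global memory access. It assumes the points arrive in point-major order (all coordinates of point 0, then point 1, and so on) and converts them to dimension-major order, so that consecutive threads reading the same coordinate of consecutive points touch adjacent addresses. The function names `to_dimension_major`, `index_point_major`, and `index_dimension_major` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, assumed for illustration only.
// Input layout  (point-major):     x0 y0 z0 x1 y1 z1 ...
// Output layout (dimension-major): x0 x1 ... y0 y1 ... z0 z1 ...
std::vector<float> to_dimension_major(const std::vector<float>& point_major,
                                      std::size_t num_points,
                                      std::size_t dim) {
    assert(point_major.size() == num_points * dim);
    std::vector<float> out(point_major.size());
    for (std::size_t p = 0; p < num_points; ++p) {
        for (std::size_t d = 0; d < dim; ++d) {
            out[d * num_points + p] = point_major[p * dim + d];
        }
    }
    return out;
}

// Flat index of coordinate d of point p in each layout. If thread t handles
// point t, neighbouring threads are `dim` elements apart in the point-major
// layout but exactly one element apart in the dimension-major layout.
std::size_t index_point_major(std::size_t p, std::size_t d, std::size_t dim) {
    return p * dim + d;
}

std::size_t index_dimension_major(std::size_t p, std::size_t d,
                                  std::size_t num_points) {
    return d * num_points + p;
}
```

With two 3-dimensional points stored as `{1, 2, 3, 4, 5, 6}`, the reordered buffer is `{1, 4, 2, 5, 3, 6}`. A kernel that assigns one point per thread could then read coordinate `d` at `index_dimension_major(threadIdx, d, num_points)`, which lets the hardware combine a warp's loads into fewer memory transactions. Whether this matches the reordering used by the TD and MG algorithms is an assumption.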

Subjects :: *DATA transmission systems
*MATHEMATICAL optimization
*CUDA (Computer architecture)
*CLUSTER analysis (Statistics)
*COMPUTER algorithms
