Efficient utilization of launched threads on GPUs: The spherical harmonic transform as a case study.

Authors :
Lu, Feng-shun
Song, Jun-qiang
Lin, Wang-qun
Pang, Yu-fei
Ren, Kai-jun
Shi, Pei-chang
Source :
Computer Physics Communications. Nov 2013, Vol. 184, Issue 11, p2494-2502. 9p.
Publication Year :
2013

Abstract

Maximum utilization of hardware resources is crucial to leveraging the enormous computational power of graphics processing units (GPUs). However, there is no effective metric that indicates whether the launched threads are kept busy. To address this issue, we propose a metric called ETU to describe the efficiency of thread utilization. First, we execute several CUDA-SDK sample codes, with and without double-precision arithmetic, on two generations of GPUs to perform a preliminary validation of the ETU metric. Taking the spherical harmonic transform as an example, we then give two GPU implementations of Legendre transforms and examine the relationship between ETU and application performance. Experimental results show that applications with larger ETU usually achieve better performance, and that ETU is a more accurate indicator of performance than the occupancy metric proposed by NVIDIA. Finally, we select the better-performing GPU implementations to accelerate Legendre transforms in STSWM, a spectral transform shallow water model. [Copyright Elsevier]

Details

Language :
English
ISSN :
0010-4655
Volume :
184
Issue :
11
Database :
Academic Search Index
Journal :
Computer Physics Communications
Publication Type :
Periodical
Accession number :
89998380
Full Text :
https://doi.org/10.1016/j.cpc.2013.06.019