Optimizing small channel 3D convolution on GPU with tensor core.
- Source :
- Parallel Computing, Oct. 2022, Vol. 113.
- Publication Year :
- 2022
Abstract
- In many scenarios, particularly scientific AI applications, algorithm engineers widely adopt more complex convolutions, e.g. 3D CNNs, to improve accuracy. Scientific AI applications with 3D CNNs, which tend to train on volumetric datasets, substantially increase the size of the input, which in turn restricts the channel sizes (e.g. fewer than 64) under the constraints of limited device memory capacity. Since existing convolution implementations tend to split and parallelize small-channel convolutions along the channel dimension, they usually cannot fully exploit the performance of GPU accelerators, in particular those equipped with the emerging tensor cores. In this work, we target enhancing the performance of small-channel 3D convolution on GPU platforms equipped with tensor cores. Our analysis shows that the channel size of a convolution has a great effect on the performance of existing convolution implementations, which are memory-bound on tensor cores. By leveraging the memory hierarchy characteristics and the WMMA API of the tensor cores, we propose and implement holistic optimizations that both promote data access efficiency and intensify the utilization of the computing units. Experiments show that our implementation obtains 1.1x–5.4x speedups compared to cuDNN's implementations of 3D convolution on different GPU platforms. We also evaluate our implementations on two practical scientific AI applications and observe up to 1.7x and 2.0x overall speedups compared with cuDNN on a V100 GPU. [ABSTRACT FROM AUTHOR]
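For context, the WMMA API named in the abstract is CUDA's warp-level interface to tensor cores. The following is a minimal illustrative sketch of that API, not the paper's optimized implementation; the kernel name, pointers, and leading dimensions are assumptions for illustration only:

```cuda
#include <mma.h>
using namespace nvcuda;

// Illustrative sketch: one warp multiplies a 16x16 half-precision tile of A
// (row-major) by a 16x16 tile of B (col-major), accumulating into C in float
// on the tensor cores. Leading dimensions of 16 are assumed for this sketch.
__global__ void wmma_tile_16x16x16(const half *a, const half *b, float *c) {
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> c_frag;

    wmma::fill_fragment(c_frag, 0.0f);               // zero the accumulator
    wmma::load_matrix_sync(a_frag, a, 16);           // load A tile, ld = 16
    wmma::load_matrix_sync(b_frag, b, 16);           // load B tile, ld = 16
    wmma::mma_sync(c_frag, a_frag, b_frag, c_frag);  // tensor-core MMA
    wmma::store_matrix_sync(c, c_frag, 16, wmma::mem_row_major);
}
```

How such tiles are fed from the memory hierarchy for small-channel 3D convolution is precisely the subject of the paper's optimizations.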
- Subjects :
- *GRAPHICS processing units
*COMPUTER storage devices
Details
- Language :
- English
- ISSN :
- 0167-8191
- Volume :
- 113
- Database :
- Academic Search Index
- Journal :
- Parallel Computing
- Publication Type :
- Academic Journal
- Accession number :
- 159329191
- Full Text :
- https://doi.org/10.1016/j.parco.2022.102954