Back to Search Start Over

Thread-Level Locking for SIMT Architectures

Authors :
Yunlong Xu
Zhibin Yu
Zhongzhi Luan
Rui Wang
Depei Qian
Lan Gao
Source :
IEEE Transactions on Parallel and Distributed Systems. 31:1121-1136
Publication Year :
2020
Publisher :
Institute of Electrical and Electronics Engineers (IEEE), 2020.

Abstract

As more emerging applications are moving to GPUs, thread-level synchronization has become a requirement. However, GPUs only provide warp-level and thread-block-level rather than thread-level synchronization. Moreover, it is highly possible to cause live-locks by using CPU synchronization mechanisms to implement thread-level synchronization for GPUs. In this article, we first propose a software-based thread-level synchronization mechanism called lock stealing for GPUs to avoid live-locks. We then describe how to implement our lock stealing algorithm in mutual exclusive locks and readers-writer locks with high performance. Finally, by putting it all together, we develop a thread-level locking library (TLLL) for commercial GPUs. To evaluate TLLL and show its general applicability, we use it to implement six widely used programs. We compare TLLL against the state-of-the-art ad-hoc GPU synchronization, GPU software transactional memory (STM), and CPU hardware transactional memory (HTM), respectively. The results show that, compared with the ad-hoc GPU synchronization for Delaunay mesh refinement (DMR), TLLL improves the performance by 22 percent on average on a GTX970 GPU, and shows up to 11 percent of performance improvement on a Volta V100 GPU. Moreover, it significantly reduces the required memory size. Such low memory consumption enables DMR to successfully run on the GTX970 GPU with the 10-million mesh size, and the V100 GPU with the 40-million mesh size, with which the ad-hoc synchronization can not run successfully. In addition, TLLL outperforms the GPU STM by 65 percent, and the CPU HTM (running on a Xeon E5-2620 v4 CPU with 16 hardware threads) by 43 percent on average.

Details

ISSN :
21619883 and 10459219
Volume :
31
Database :
OpenAIRE
Journal :
IEEE Transactions on Parallel and Distributed Systems
Accession number :
edsair.doi...........f705f914de02d1a2d16fc11018a07b58
Full Text :
https://doi.org/10.1109/tpds.2019.2955705