Thread-Level Locking for SIMT Architectures
- Source :
- IEEE Transactions on Parallel and Distributed Systems. 31:1121-1136
- Publication Year :
- 2020
- Publisher :
- Institute of Electrical and Electronics Engineers (IEEE), 2020.
Abstract
- As more emerging applications move to GPUs, thread-level synchronization has become a requirement. However, GPUs provide only warp-level and thread-block-level synchronization, not thread-level synchronization. Moreover, using CPU synchronization mechanisms to implement thread-level synchronization on GPUs is highly likely to cause live-locks. In this article, we first propose a software-based thread-level synchronization mechanism for GPUs, called lock stealing, that avoids live-locks. We then describe how to implement our lock stealing algorithm in mutual exclusion locks and readers-writer locks with high performance. Finally, putting it all together, we develop a thread-level locking library (TLLL) for commercial GPUs. To evaluate TLLL and show its general applicability, we use it to implement six widely used programs. We compare TLLL against state-of-the-art ad-hoc GPU synchronization, GPU software transactional memory (STM), and CPU hardware transactional memory (HTM). The results show that, compared with the ad-hoc GPU synchronization for Delaunay mesh refinement (DMR), TLLL improves performance by 22 percent on average on a GTX970 GPU and by up to 11 percent on a Volta V100 GPU. Moreover, it significantly reduces the required memory size. This low memory consumption enables DMR to run successfully on the GTX970 GPU with a 10-million mesh size and on the V100 GPU with a 40-million mesh size, sizes at which the ad-hoc synchronization cannot run successfully. In addition, TLLL outperforms the GPU STM by 65 percent and the CPU HTM (running on a Xeon E5-2620 v4 CPU with 16 hardware threads) by 43 percent on average.
- Subjects :
- Xeon
Computer science
business.industry
Transactional memory
Thread (computing)
Parallel computing
Software_PROGRAMMINGTECHNIQUES
Lock (computer science)
Synchronization
Instruction set
Software
Computational Theory and Mathematics
Hardware and Architecture
Signal Processing
Software transactional memory
Central processing unit
business
ComputingMethodologies_COMPUTERGRAPHICS
Details
- ISSN :
- 2161-9883 and 1045-9219
- Volume :
- 31
- Database :
- OpenAIRE
- Journal :
- IEEE Transactions on Parallel and Distributed Systems
- Accession number :
- edsair.doi...........f705f914de02d1a2d16fc11018a07b58
- Full Text :
- https://doi.org/10.1109/tpds.2019.2955705