Independent Forward Progress of Work-groups
- Source :
- ISCA
- Publication Year :
- 2020
- Publisher :
- IEEE, 2020.
Abstract
- GPUs have evolved from providing highly constrained programmability for a single kernel to using pre-emption to ensure independent forward progress for concurrently executing kernels. However, modern GPUs do not ensure independent forward progress for kernels that use fine-grained synchronization to coordinate inter-work-group execution. Enabling independent forward progress among work-groups (WGs) is challenging because pre-empted kernels may be rescheduled with fewer hardware resources. This can lead to oversubscribed execution scenarios that deadlock current hardware even for correctly written code. Prior work addresses this problem by requiring programmers to specify resource requirements and assuming static resource allocation, which adds scheduling constraints and reduces portability. We propose a family of novel hardware approaches (trading off hardware complexity for performance) that provide independent forward progress in the presence of fine-grained inter-WG synchronization and dynamic resource allocation. Additionally, we propose new waiting atomic instructions compatible with proposed C++20 extensions. Our final design, Autonomous WorkGroups (AWG), uses hints from regular and waiting atomics to cooperatively schedule WGs within a kernel, improving efficiency and virtualizing hardware resources. In non-oversubscribed scenarios, AWG outperforms a busy-waiting baseline (which deadlocks in oversubscribed scenarios) by 12x on average for benchmarks that use different mutexes and barriers for fine-grained, WG-granularity synchronization. Furthermore, AWG outperforms other solutions that do not deadlock in the oversubscribed case, such as fixed-interval round-robin context switching or naively extending monitor/mwait to GPUs, by 2.6x and 2.2x, respectively.
- Subjects :
- 010302 applied physics
- Computer science
- Distributed computing
- 02 engineering and technology
- Deadlock
- Virtualization
- computer.software_genre
- 01 natural sciences
- Synchronization
- 020202 computer hardware & architecture
- Scheduling (computing)
- Software portability
- Kernel (image processing)
- 0103 physical sciences
- 0202 electrical engineering, electronic engineering, information engineering
- Granularity
- computer
- Context switch
Details
- Database :
- OpenAIRE
- Journal :
- 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA)
- Accession number :
- edsair.doi...........6d50181b272a8f0d66b99a5e02666428
- Full Text :
- https://doi.org/10.1109/isca45697.2020.00087