Design and performance evaluation of UCX for the Tofu Interconnect D on Fugaku towards efficient multithreaded communication.

Authors :
Watanabe, Yutaka
Tsuji, Miwako
Murai, Hitoshi
Boku, Taisuke
Sato, Mitsuhisa
Source :
Journal of Supercomputing. Sep 2024, Vol. 80, Issue 14, p20715-20742. 28p.
Publication Year :
2024

Abstract

The trend toward manycore processors makes multithreaded communication increasingly important for avoiding costly global synchronization among cores. One representative approach that requires multithreaded communication is the global task-based programming model. In this model, a program is divided into tasks that each node executes asynchronously, and independent thread-to-thread communication is expected. However, the Message Passing Interface (MPI) based approach is not efficient because of design issues. In this research, we design and implement the utofu transport layer in Unified Communication X (UCX), an abstracted communication library, for efficient remote direct memory access (RDMA) based multithreaded communication on Tofu Interconnect D. The evaluation results on Fugaku show that UCX can significantly improve multithreaded performance over MPI while maintaining portability between systems. UCX shows about 32.8 times lower latency than Fujitsu MPI with 24 threads in the multithreaded pingpong benchmark and about 37.8 times higher update rate than Fujitsu MPI with 24 threads on 256 nodes in the multithreaded GUPS benchmark. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
0920-8542
Volume :
80
Issue :
14
Database :
Academic Search Index
Journal :
Journal of Supercomputing
Publication Type :
Academic Journal
Accession number :
178806505
Full Text :
https://doi.org/10.1007/s11227-024-06201-x