Characterizing Performance of Graph Neighborhood Communication Patterns.

Authors :
Ghosh, Sayan
Tallent, Nathan R.
Halappanavar, Mahantesh
Source :
IEEE Transactions on Parallel & Distributed Systems. Apr 2022, Vol. 33 Issue 4, p915-928. 14p.
Publication Year :
2022

Abstract

Distributed-memory graph algorithms are fundamental enablers in scientific computing and analytics workflows. A majority of graph algorithms rely on the graph neighborhood communication pattern, i.e., repeated asynchronous communication between a vertex and its neighbors in the graph. The pattern is adversarial for communication software and hardware due to high message injection rates and input-dependent, many-to-one traffic with variable destinations and volumes. We present benchmarks and performance analysis of graph neighborhood communication on modern large-scale network interconnects from four supercomputers: ALCF Theta, NERSC Cori, OLCF Summit, and R-CCS Fugaku. Our benchmarks characterize communication from the perspectives of latency and throughput. Benchmark parameters make it possible to mimic the behaviors of complex applications on real-world or synthetic graphs by varying work distribution, remote edges, message volume, and per-vertex work. We find that minor changes in the input graph can substantially increase latencies, and that contention can develop in memory caches and network stacks before contention in the network itself. Further, latencies and contention vary significantly for different graph neighborhoods, motivating the need for exploring asynchronous algorithms in greater detail. When adding work, load imbalance on real-world graphs can be pronounced: latencies for the 99th percentile were 8–128× the corresponding average latencies. Our results help analysts and developers understand the performance implications of this important pattern, especially for the impending exascale platforms. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
1045-9219
Volume :
33
Issue :
4
Database :
Academic Search Index
Journal :
IEEE Transactions on Parallel & Distributed Systems
Publication Type :
Academic Journal
Accession number :
153880618
Full Text :
https://doi.org/10.1109/TPDS.2021.3101425