1. Efficient implementation of MPI-3 RMA over OpenFabrics interfaces
- Author
- Sayantan Sur, Erik Paulson, Hajime Fujita, María Jesús Garzarán, Charles J. Archer, and Chongxiao Cao
- Subjects
- MPICH, ComputerSystemsOrganization_COMPUTERSYSTEMIMPLEMENTATION, Computer Networks and Communications, Computer science, business.industry, Message Passing Interface, 010103 numerical & computational mathematics, Software_PROGRAMMINGTECHNIQUES, 01 natural sciences, Computer Graphics and Computer-Aided Design, Theoretical Computer Science, 010101 applied mathematics, Software, Artificial Intelligence, Hardware and Architecture, Embedded system, Programming paradigm, 0101 mathematics, business
- Abstract
- The Message Passing Interface (MPI) standard supports Remote Memory Access (RMA) operations, in which a process can read or write the memory of another process without requiring the target process to take part in the communication. This enables new, more efficient programming models. This paper describes the RMA design and implementation in MPICH-OFI, an MPICH-based open source implementation of the MPI standard that communicates with the underlying network fabric through the OpenFabrics Interfaces (OFI), a lightweight communication framework that supports modern high-speed interconnects. MPICH-OFI is based on a new communication layer called CH4, which was designed to achieve high performance by minimizing runtime software overhead and by exposing an internal API that is well aligned with MPI functions. Thanks to CH4 and OFI, MPICH-OFI achieves low latency and high bandwidth for RMA operations. Our experimental results using microbenchmarks show that, for small messages (≤ 4 KB) on Intel® Omni-Path Architecture, MPICH-OFI achieves more than 3x better put/get latency and bandwidth than MPICH CH3, 10% better latency than Open MPI and MVAPICH2, and more than 1.7x the bandwidth of MVAPICH2.
- Published
- 2019