A lightweight idempotent messaging protocol for faulty networks

Authors :
Jeffrey P. Grossman
Thomas F. Knight
Jeremy Brown
Source :
SPAA
Publication Year :
2002
Publisher :
ACM, 2002.

Abstract

As parallel machines scale to one million nodes and beyond, it becomes increasingly difficult to build a reliable network that is able to guarantee packet delivery. Eventually large systems will need to employ fault-tolerant messaging protocols that afford correct execution in the presence of a lossy network. In this paper we present a lightweight protocol that preserves message idempotence and is easy to implement in hardware. We identify the requirements for a correct implementation of the protocol. Experiments are performed in simulation to determine implementation parameters that optimize performance. We find that an aggressive implementation on a fat tree network results in a slowdown of less than 2x compared to buffered wormhole routing on a fault-free network.

Details

Database :
OpenAIRE
Journal :
Proceedings of the Fourteenth Annual ACM Symposium on Parallel Algorithms and Architectures
Accession number :
edsair.doi...........de6f896ad4e586083efa414626d3b11a
Full Text :
https://doi.org/10.1145/564870.564912