Fault-Tolerant Distributed Shared Memory on a Broadcast-Based Architecture.

Authors :
Katsinis, Constantine
Hecht, Diana
Source :
IEEE Transactions on Parallel & Distributed Systems. Dec 2004, Vol. 15, Issue 12, pp. 1082-1092. 11 pp.
Publication Year :
2004

Abstract

Due to advances in fiber-optics and VLSI technology, interconnection networks that allow multiple simultaneous broadcasts are becoming feasible. Distributed-shared-memory implementations on such networks promise high performance even for applications with small granularity. This paper presents the architecture of one such implementation, called the Simultaneous Optical Multiprocessor Exchange Bus, and examines the performance of augmented DSM protocols that exploit the natural duplication of data to maintain a recovery memory in each processing node and provide basic fault tolerance. Simulation results show that the additional data duplication necessary to create fault-tolerant DSM causes no reduction in system performance during normal operation and eliminates most of the overhead at checkpoint creation. Under certain conditions, data blocks that are duplicated to maintain the recovery memory are utilized by the underlying DSM protocol, reducing network traffic, and increasing the processor utilization significantly. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
1045-9219
Volume :
15
Issue :
12
Database :
Academic Search Index
Journal :
IEEE Transactions on Parallel & Distributed Systems
Publication Type :
Academic Journal
Accession number :
15302654
Full Text :
https://doi.org/10.1109/TPDS.2004.83