Back to Search Start Over

Repair Tree: Fast Repair for Single Failure in Erasure-Coded Distributed Storage Systems.

Authors :
Zhang, Huayu
Li, Hui
Li, Shuo-Yen Robert
Source :
IEEE Transactions on Parallel & Distributed Systems. Jun2017, Vol. 28 Issue 6, p1728-1739. 12p.
Publication Year :
2017

Abstract

In order to guarantee data reliability, erasure codes have been used in distributed storage systems. Nevertheless, this mechanism suffers from the repair problem that excess data are needed to repair a single failure, causing both high bandwidth consuming for the network and heavy computing load on the replacement node. To reduce repair traffic, researchers pointed out the tradeoff between storage and repair traffic and proposed regenerating codes by combining network coding. However, the combination only focuses on the storage terminal and the construction of the codes is quite complicated. Therefore, this paper further combines network coding with network structure and proposes a repair tree model based on general erasure codes to simplify the repair procedure. By decomposing repair computing and distributing it among the tree nodes, our model can mitigate the computing tension. The performance of repair tree is analyzed and evaluated by preliminary emulation. The result shows it can make about three times faster computing than conventional measure and the repair throughput is doubled if there are network bottlenecks. For proper topology, it can significantly reduce the repair traffic. We present algorithms to generate trees across the network topology. At last, we present the idea of extending repair tree to repair multiple failures. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
10459219
Volume :
28
Issue :
6
Database :
Academic Search Index
Journal :
IEEE Transactions on Parallel & Distributed Systems
Publication Type :
Academic Journal
Accession number :
123209380
Full Text :
https://doi.org/10.1109/TPDS.2016.2628024