EAFR: An Energy-Efficient Adaptive File Replication System in Data-Intensive Clusters.

Authors :
Lin, Yuhua
Shen, Haiying
Source :
IEEE Transactions on Parallel & Distributed Systems. Apr2017, Vol. 28 Issue 4, p1017-1030. 14p.
Publication Year :
2017

Abstract

In data-intensive clusters, a large number of files are stored, processed, and transferred simultaneously. To increase data availability, some file systems create and store three replicas of each file on randomly selected servers across different racks. However, they neglect file heterogeneity and server heterogeneity, which can be leveraged to further enhance data availability and file system efficiency. Because files have heterogeneous popularities, a rigid number of three replicas may not provide an immediate response to an excessive number of read requests for hot files, and it wastes resources (including energy) on replicas of cold files that have few read requests. Also, because servers are heterogeneous in network bandwidth, hardware configuration, and capacity (i.e., the maximal number of service requests that can be supported simultaneously), it is crucial to select replica servers carefully to ensure low replication delay and request response delay. In this paper, we propose an Energy-Efficient Adaptive File Replication System (EAFR), which incorporates three components. It is adaptive to time-varying file popularities to achieve a good tradeoff between data availability and efficiency: higher popularity of a file leads to more replicas, and vice versa. Also, to achieve energy efficiency, servers are classified into hot servers and cold servers with different energy consumption, and cold files are stored on cold servers. EAFR then selects a server with sufficient resources (including network bandwidth and capacity) to hold a replica. To further improve the performance of EAFR, we propose a dynamic transmission rate adjustment strategy to prevent potential incast congestion when replicating a file to a server, a network-aware data node selection strategy to reduce file read latency, and a load-aware replica maintenance strategy to quickly create file replicas under replica node failures. Experimental results on a real-world cluster show the effectiveness of EAFR and the proposed strategies in reducing file read latency, replication time, and power consumption in large clusters. [ABSTRACT FROM PUBLISHER]
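
The three ideas named in the abstract (popularity-driven replica counts, hot/cold server classes, and capacity-aware server selection) can be illustrated with a minimal sketch. This is not the algorithm from the paper: the thresholds, the replica bounds, the `Server` fields, and the function names `replica_count`, `is_cold_file`, and `pick_server` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Server:
    name: str
    is_cold: bool          # hypothetical flag: cold servers run in a lower-power mode
    free_bandwidth: float  # spare network bandwidth, arbitrary units (assumed metric)
    free_capacity: int     # additional concurrent requests the server can accept


def is_cold_file(reads_per_hour: float, cold_threshold: float = 5.0) -> bool:
    """Treat a file as cold when its read rate is at or below an assumed threshold."""
    return reads_per_hour <= cold_threshold


def replica_count(reads_per_hour: float,
                  hot_threshold: float = 100.0,
                  cold_threshold: float = 5.0,
                  base: int = 3,
                  min_replicas: int = 2,
                  max_replicas: int = 6) -> int:
    """More popular files get more replicas, less popular files get fewer.

    All thresholds and bounds are illustrative assumptions, not values from the paper.
    """
    if is_cold_file(reads_per_hour, cold_threshold):
        return min_replicas
    if reads_per_hour >= hot_threshold:
        extra = int(reads_per_hour // hot_threshold)
        return min(base + extra, max_replicas)
    return base


def pick_server(servers: List[Server],
                file_is_cold: bool,
                need_bandwidth: float,
                need_capacity: int) -> Optional[Server]:
    """Pick a server of the matching class (hot or cold) with enough spare resources.

    Among eligible servers, prefer the one with the most spare capacity,
    breaking ties by spare bandwidth. Returns None if no server qualifies.
    """
    candidates = [
        s for s in servers
        if s.is_cold == file_is_cold
        and s.free_bandwidth >= need_bandwidth
        and s.free_capacity >= need_capacity
    ]
    return max(candidates,
               key=lambda s: (s.free_capacity, s.free_bandwidth),
               default=None)
```

In this sketch, a file read 250 times per hour would get five replicas placed on hot servers, while a file read twice per hour would get two replicas placed on cold servers. A real system would also need to track popularity over time and adjust replica counts as it changes, which the sketch leaves out.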

Details

Language :
English
ISSN :
10459219
Volume :
28
Issue :
4
Database :
Academic Search Index
Journal :
IEEE Transactions on Parallel & Distributed Systems
Publication Type :
Academic Journal
Accession number :
121854152
Full Text :
https://doi.org/10.1109/TPDS.2016.2613989