Improving MapReduce Performance with Partial Speculative Execution.

Authors :
Wang, Yaoguang
Lu, Weiming
Lou, Renjie
Wei, Baogang
Source :
Journal of Grid Computing; Dec 2015, Vol. 13 Issue 4, p587-604, 18p
Publication Year :
2015

Abstract

The MapReduce framework has become the de facto standard for big data processing due to its attractive features and abilities. One such feature is that it automatically parallelizes a job into multiple tasks and transparently handles task execution on a large cluster of commodity machines. The increasing heterogeneity of distributed environments may result in a few straggling tasks, which prolong job completion. Speculative execution has been proposed to mitigate stragglers. However, the existing speculative execution mechanism does not work efficiently, as many speculative tasks are still slower than their original tasks. In this paper, we explore an approach to increase the efficiency of speculative execution and thereby further improve MapReduce performance. We propose the Partial Speculative Execution (PSE) strategy, which makes speculative tasks start from a checkpoint. By leveraging the checkpoints of the original tasks, PSE can eliminate the costs of re-reading, re-copying, and re-computing the already processed data. We implement PSE in Hadoop and evaluate its performance in terms of job completion time and the efficiency of speculative execution under several kinds of classical workloads. Experimental results show that, in heterogeneous environments with stragglers, PSE completes jobs 56% faster than with no speculation and 12% faster than with LATE, an improved speculative execution algorithm. In addition, on average PSE improves the efficiency of speculative execution by 24% compared to LATE. [ABSTRACT FROM AUTHOR]
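
The idea of starting a speculative task from a checkpoint can be illustrated with a toy word-count task. This is only a minimal sketch under stated assumptions, not the authors' Hadoop implementation: the `Checkpoint` class, the `run_map_task` function, and the `stop_after` parameter are hypothetical names introduced here for illustration, and a real checkpoint would also have to cover spilled and copied intermediate data.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Checkpoint:
    """Hypothetical checkpoint: input offset reached plus partial output so far."""
    offset: int = 0
    partial: Counter = field(default_factory=Counter)


def run_map_task(records, checkpoint=None, stop_after=None):
    """Count words in `records`, resuming from `checkpoint` if one is given.

    `stop_after` simulates a straggler that stalls after processing that many
    records in this run. Returns (checkpoint, records_processed_in_this_run).
    """
    start = checkpoint.offset if checkpoint else 0
    # Copy the partial counts so the original checkpoint is not mutated.
    counts = Counter(checkpoint.partial) if checkpoint else Counter()
    processed = 0
    for i in range(start, len(records)):
        if stop_after is not None and processed >= stop_after:
            return Checkpoint(i, counts), processed
        counts.update(records[i].split())
        processed += 1
    return Checkpoint(len(records), counts), processed


records = ["a b", "b c", "c d", "d a"]

# The original task stalls after 3 of 4 records and leaves a checkpoint behind.
ckpt, _ = run_map_task(records, stop_after=3)

# Conventional speculation: the backup task restarts from the first record.
full, full_work = run_map_task(records)

# Partial speculation (assumed behaviour): the backup task resumes at the checkpoint.
partial, partial_work = run_map_task(records, checkpoint=ckpt)
```

In this toy run both backup tasks produce the same word counts, but the checkpoint-based one processes only the single remaining record instead of all four, which is the kind of saving in re-reading and re-computing that the abstract describes.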

Details

Language :
English
ISSN :
1570-7873
Volume :
13
Issue :
4
Database :
Complementary Index
Journal :
Journal of Grid Computing
Publication Type :
Academic Journal
Accession number :
112357122
Full Text :
https://doi.org/10.1007/s10723-015-9350-y