Extending MapReduce across Clouds with BStream
- Source :
- IEEE Transactions on Cloud Computing. 2:362-376
- Publication Year :
- 2014
- Publisher :
- Institute of Electrical and Electronics Engineers (IEEE), 2014.
Abstract
- Today, batch processing frameworks like Hadoop MapReduce are difficult to scale to multiple clouds due to latencies involved in inter-cloud data transfer and synchronization overheads during shuffle-phase. This inhibits the MapReduce framework from guaranteeing performance at variable load surges without over-provisioning in the internal cloud (IC). We propose BStream, a cloud bursting framework for MapReduce that couples stream-processing in the external cloud (EC) with Hadoop in the internal cloud (IC). Stream processing in EC enables pipelined uploading, processing and downloading of data to minimize network latencies. We use this framework to meet job deadlines. BStream uses an analytical model to minimize the usage of EC. We propose different checkpointing strategies that overlap output transfer with input transfer/processing and simultaneously reduce the computation involved in merging the results from EC and IC. Checkpointing further reduces job completion time. We experimentally compare BStream with other related works and illustrate performance benefits due to stream processing and checkpointing strategies in EC. Lastly, we characterize the operational regime of BStream. © 2013 IEEE.
- Subjects :
- Computer Networks and Communications
Computer science
Distributed computing
Inter clouds
Map-reduce
Stream processing
Cloud computing
Parallel computing
Computer Science Applications
Upload
Hardware and Architecture
Transfer (computing)
Synchronization (computer science)
Batch processing
business
Software
Information Systems
Details
- ISSN :
- 21687161
- Volume :
- 2
- Database :
- OpenAIRE
- Journal :
- IEEE Transactions on Cloud Computing
- Accession number :
- edsair.doi.dedup.....79acc98eb890c90011dc97030e90edbe