Cooper: Expedite Batch Data Dissemination in Computer Clusters with Coded Gossips.

Authors :
Liu, Yan
Niu, Di
Khabbazian, Majid
Source :
IEEE Transactions on Parallel & Distributed Systems; Aug 2017, Vol. 28, Issue 8, p2204-2217, 14p
Publication Year :
2017

Abstract

Data transfers happen frequently in server clusters for software and application deployment, and in parallel computing clusters to transmit intermediate results in batches among servers between computation stages. This paper presents Cooper, an optimized prototype system to speed up multi-batch data transfers among a cluster of servers, leveraging a theoretically proven optimal algorithm called “coded permutation gossip,” which employs a simple random topology control scheme to best utilize bandwidth and decentralized random linear network coding to maximize the useful information transmitted. On a process-level coding-transfer pipeline, we investigate the best block division, batch division and inter-batch scheduling strategies to minimize the broadcast finish time in a realistic setting. For batch-based transfers, we propose a scheduling algorithm with low overhead that overlaps the transfers of consecutive batches and temporarily prioritizes later batches, to further reduce the broadcast finish time. We describe an asynchronous and distributed implementation of Cooper and have deployed it on Amazon EC2 for evaluation. Based on results from real experiments, we show that Cooper can almost double the speed of data transfers in computing clusters, as compared to state-of-the-art content distribution tools like BitTorrent, at a low CPU overhead. [ABSTRACT FROM AUTHOR]
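
The abstract names two ingredients: a random topology in which peers are paired by a permutation, and random linear network coding of the blocks being sent. The Python sketch below is one plausible toy reading of that combination and is not the authors' implementation. It assumes synchronous rounds, coding over GF(2), a single source (node 0) and one batch, and it tracks only coefficient vectors, not payloads. The function names `insert_vector`, `random_combination` and `coded_permutation_gossip` are hypothetical.

```python
import random


def insert_vector(basis, vec):
    """Reduce vec against a GF(2) basis keyed by leading bit.

    Returns True and stores the reduced vector if it is innovative
    (linearly independent of what the node already holds).
    """
    while vec:
        lead = vec.bit_length() - 1
        if lead not in basis:
            basis[lead] = vec
            return True
        vec ^= basis[lead]  # cancel the leading bit and keep reducing
    return False


def random_combination(basis, rng):
    """XOR a uniformly random subset of held vectors (random GF(2) coefficients)."""
    out = 0
    for vec in basis.values():
        if rng.random() < 0.5:
            out ^= vec
    return out


def coded_permutation_gossip(num_nodes, num_blocks, seed=0):
    """Simulate rounds until every node can decode all blocks.

    Assumption: in each round a fresh random permutation decides the
    receiver of every sender, so each node sends and receives at most once.
    """
    rng = random.Random(seed)
    bases = [dict() for _ in range(num_nodes)]
    for b in range(num_blocks):  # node 0 starts with every original block
        insert_vector(bases[0], 1 << b)

    rounds = 0
    while any(len(basis) < num_blocks for basis in bases):
        perm = list(range(num_nodes))
        rng.shuffle(perm)
        # Build all messages from the state at the start of the round.
        messages = [random_combination(bases[i], rng) for i in range(num_nodes)]
        for sender, receiver in enumerate(perm):
            if sender != receiver:
                insert_vector(bases[receiver], messages[sender])
        rounds += 1
    return rounds, bases


if __name__ == "__main__":
    rounds, _ = coded_permutation_gossip(num_nodes=8, num_blocks=16, seed=1)
    print("rounds until all nodes can decode:", rounds)
```

Because each node receives at most one coded block per round in this model, the number of rounds can never be smaller than the number of blocks. A larger field than GF(2) would make each received combination more likely to be innovative, at a higher coding cost.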

Details

Language :
English
ISSN :
1045-9219
Volume :
28
Issue :
8
Database :
Complementary Index
Journal :
IEEE Transactions on Parallel & Distributed Systems
Publication Type :
Academic Journal
Accession number :
124148103
Full Text :
https://doi.org/10.1109/TPDS.2017.2654242