Back to Search
Start Over
Coded Computing for Distributed Graph Analytics
- Source :
- ISIT
- Publication Year :
- 2018
- Publisher :
- arXiv, 2018.
-
Abstract
- Performance of distributed graph processing systems significantly suffers from 'communication bottleneck' as a large number of messages are exchanged among servers at each step of the computation. Motivated by graph based MapReduce, we propose a coded computing framework that leverages computation redundancy to alleviate the communication bottleneck in distributed graph processing. We develop a novel 'coding' scheme that systematically injects structured redundancy in computation phase to enable 'coded' multicasting opportunities during message exchange between servers, reducing communication load substantially in large-scale graph processing. For theoretical analysis, we consider random graph models, and prove that our proposed scheme enables an (asymptotically) inverse-linear trade-off between 'computation load' and 'average communication load' for two popular random graph models -- Erdos-Renyi model, and power law model. Particularly, for a given computation load r, (i.e. when each graph vertex is carefully stored at r servers), the proposed scheme slashes the average communication load by (nearly) a multiplicative factor of r. For the Erdos-Renyi model, our proposed scheme is optimal asymptotically as the graph size increases by providing an information-theoretic converse. To illustrate the benefits of our scheme in practice, we implement PageRank over Amazon EC2, using artificial as well as real-world datasets, demonstrating significant gains over conventional PageRank. We also specialize our scheme and extend our theoretical results to two other random graph models -- random bi-partite model, and stochastic block model. They asymptotically enable inverse-linear trade-offs between computation and communication loads in distributed graph processing for these popular random graph models as well. We complement the achievability results with converse bounds for both of these models.<br />Comment: Accepted for publication in the IEEE Transactions on Information Theory
- Subjects :
- Vertex (graph theory)
FOS: Computer and information sciences
Theoretical computer science
Computer science
Computer Science - Information Theory
Computation
02 engineering and technology
Library and Information Sciences
Bottleneck
law.invention
PageRank
Stochastic block model
law
020204 information systems
Server
0202 electrical engineering, electronic engineering, information engineering
Connectivity
Random graph
Graph analytics
Numerical analysis
Information Theory (cs.IT)
020206 networking & telecommunications
Graph
Computer Science Applications
Vertex (geometry)
Computer Science - Distributed, Parallel, and Cluster Computing
Graph (abstract data type)
Distributed, Parallel, and Cluster Computing (cs.DC)
Information Systems
MathematicsofComputing_DISCRETEMATHEMATICS
Subjects
Details
- Database :
- OpenAIRE
- Journal :
- ISIT
- Accession number :
- edsair.doi.dedup.....44f74eca3e14f32f33ddd09f726236ef
- Full Text :
- https://doi.org/10.48550/arxiv.1801.05522