1. HGraph: I/O-Efficient Distributed and Iterative Graph Computing by Hybrid Pushing/Pulling.
- Author
Wang, Zhigang; Gu, Yu; Bao, Yubin; Yu, Ge; Yu, Jeffrey Xu; and Wei, Zhiqiang
- Subjects
GRAPH algorithms, SCALABILITY, FAULT tolerance (Engineering), TASK analysis, BIG data
- Abstract
In the big data era, distributed computation is becoming a preferred solution for iterative graph analysis. However, graphs are rapidly growing in size and more importantly, there exist a lot of messages across iterations. For better scalability, many distributed systems keep graph data and message data on disk. Now these systems solely employ either pushing or pulling mode to manage data, but neither can always work well during the entire computation. This is mainly because I/O access patterns are dynamic and complex. This article proposes a hybrid solution. It achieves the optimal performance in different scenarios by dynamically and adaptively switching modes between pushing and pulling. Specifically, we first devise a new block-centric pulling technique. It pulls messages much more I/O-efficiently than the existing vertex-centric pulling mode. We then combine pushing and pulling. For general-purpose, we categorize graph algorithms and accordingly present two seamless switching frameworks. We also design performance prediction components specialized to the two frameworks, to decide how and when we can switch modes. Some optimization strategies are also given to further enhance performance, such as priority scheduling and lightweight fault-tolerance. Extensive experiments against state-of-the-art solutions confirm the effectiveness of our proposals. [ABSTRACT FROM AUTHOR]
- Published
2021