Back to Search Start Over

Placing big graph into cloud for parallel processing with a two-phase community-aware approach

Authors :
Kekun Hu
Guosun Zeng
Source :
Future Generation Computer Systems. 101:1187-1200
Publication Year :
2019
Publisher :
Elsevier BV, 2019.

Abstract

Big graphs are so large that their analysis often rely on the cloud for parallel processing. Data placement, as a key pre-processing step, has a profound impact on the performance of parallel processing. Traditional placement methods fail to preserve graph topologies, leading to poor performance. As the community is the most common structure of big graphs, in this work, we present a two-phase community-aware placement algorithm to place big graphs into the cloud for parallel processing. It can obtain a placement scheme that preserves the community structure well by maximizing the modularity density of the scheme under memory capacity constraints of computational nodes of the cloud in two phases. In the first phase, we design a streaming partitioning heuristic to detect communities based on partial and incomplete graph information. They form an initial placement scheme with relatively high modularity density. To improve it further, in the second phase, we put forward a scale-constrained kernel k-means algorithm. It takes as input the initial placement scheme and iteratively redistributes graph vertices across computational nodes under scale constraints until the modularity density cannot be improved any further. Finally, experiments show that our algorithm can preserve graph topologies well and greatly support parallel processing of big graphs in the cloud.

Details

ISSN :
0167739X
Volume :
101
Database :
OpenAIRE
Journal :
Future Generation Computer Systems
Accession number :
edsair.doi...........472e64fa921a4fb18d995ef086a39d30
Full Text :
https://doi.org/10.1016/j.future.2019.07.014