Back to Search
Start Over
Placing big graph into cloud for parallel processing with a two-phase community-aware approach
- Source :
- Future Generation Computer Systems. 101:1187-1200
- Publication Year :
- 2019
- Publisher :
- Elsevier BV, 2019.
-
Abstract
- Big graphs are so large that their analysis often rely on the cloud for parallel processing. Data placement, as a key pre-processing step, has a profound impact on the performance of parallel processing. Traditional placement methods fail to preserve graph topologies, leading to poor performance. As the community is the most common structure of big graphs, in this work, we present a two-phase community-aware placement algorithm to place big graphs into the cloud for parallel processing. It can obtain a placement scheme that preserves the community structure well by maximizing the modularity density of the scheme under memory capacity constraints of computational nodes of the cloud in two phases. In the first phase, we design a streaming partitioning heuristic to detect communities based on partial and incomplete graph information. They form an initial placement scheme with relatively high modularity density. To improve it further, in the second phase, we put forward a scale-constrained kernel k-means algorithm. It takes as input the initial placement scheme and iteratively redistributes graph vertices across computational nodes under scale constraints until the modularity density cannot be improved any further. Finally, experiments show that our algorithm can preserve graph topologies well and greatly support parallel processing of big graphs in the cloud.
- Subjects :
- Theoretical computer science
Computer Networks and Communications
business.industry
Heuristic
Computer science
Big graph
Community structure
020206 networking & telecommunications
Cloud computing
02 engineering and technology
Network topology
Graph
Vertex (geometry)
Hardware and Architecture
0202 electrical engineering, electronic engineering, information engineering
020201 artificial intelligence & image processing
business
Software
Subjects
Details
- ISSN :
- 0167739X
- Volume :
- 101
- Database :
- OpenAIRE
- Journal :
- Future Generation Computer Systems
- Accession number :
- edsair.doi...........472e64fa921a4fb18d995ef086a39d30
- Full Text :
- https://doi.org/10.1016/j.future.2019.07.014