1. A Parallel Algorithm for Tracking Dynamic Communities based on Apache Flink
- Author
- Georgios Kechagias, George Paliouras, Dimitrios Vogiatzis, and Grigorios Tzortzis
- Subjects
Data processing, Jaccard index, Social network, Computer science, Parallel processing, Parallel algorithm, Apache Flink, Execution time, Community tracking, Rendering (computer graphics), Social media, Data mining, Undirected graph, Social network analysis
- Abstract
Real-world social networks are highly dynamic environments consisting of numerous users and communities, which makes tracking their evolution a challenging problem. In this work, we propose a parallel algorithm for tracking dynamic communities between consecutive timeframes of a social network, where communities are represented as undirected graphs. Our method compares communities using the widely adopted Jaccard similarity measure and is implemented on top of Apache Flink, a novel framework for parallel and distributed data processing. We evaluate the benefits, in terms of execution time, that parallel processing brings to community tracking on datasets with different quantitative characteristics, derived from two popular social media platforms: Twitter and Mathematics Stack Exchange Q&A. Experiments show that our parallel method can calculate the similarity of communities within seconds, even for large social networks consisting of more than 600 communities per timeframe.
Presented at: 10th Hellenic Conference on Artificial Intelligence
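As a rough illustration of the Jaccard-based comparison mentioned in the abstract, the following is a minimal single-machine sketch in Python. It is not the authors' Flink implementation. It assumes each community can be reduced to its set of member IDs, and the function names, the `threshold` parameter, and the sample data are all hypothetical.

```python
from itertools import product


def jaccard(a: set, b: set) -> float:
    """Jaccard index |a ∩ b| / |a ∪ b|; treated as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_communities(prev, curr, threshold=0.3):
    """Compare every community of one timeframe with every community of the next.

    Returns (index_in_prev, index_in_curr, similarity) for each pair whose
    similarity reaches the (hypothetical) threshold.
    """
    matches = []
    for (i, a), (j, b) in product(enumerate(prev), enumerate(curr)):
        similarity = jaccard(a, b)
        if similarity >= threshold:
            matches.append((i, j, similarity))
    return matches


# Illustrative data only: two consecutive timeframes, communities as member sets.
timeframe_t = [{"u1", "u2", "u3"}, {"u4", "u5"}]
timeframe_t1 = [{"u1", "u2", "u3", "u6"}, {"u7", "u8"}]

print(match_communities(timeframe_t, timeframe_t1))
```

Each pairwise comparison in this sketch is independent of the others, which is the kind of workload that can be distributed across parallel workers.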
- Published
- 2018