G2P: A Partitioning Approach for Processing DBSCAN with MapReduce
- Source :
- Web and Wireless Geographical Information Systems ISBN: 9783319182506, W2GIS
- Publication Year :
- 2015
- Publisher :
- Springer International Publishing, 2015.
Abstract
- When processing large data sets, one of the most important considerations is distributing and parallelizing the analysis algorithms. A distributed system performs well if the workload is properly balanced, since the overall computing time is expected to be directly related to the processing time on the node that takes longest. This paper proposes a data partitioning strategy that takes partition balance into account and is generic for spatial data. Our solution builds a grid model of the data and transforms it into a graph partitioning problem, from which the partitions are finally computed. We apply the approach to the distributed DBSCAN algorithm, which finds dense areas in a large data set, using MapReduce. We call our approach G2P (Grid and Graph Partitioning) and show through extensive experiments that G2P yields high-quality data partitioning for the distributed DBSCAN algorithm compared with competing approaches. We believe that G2P is suitable not only for the DBSCAN algorithm but also for spatial join operations and distance-based range queries, to name a few.
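The grid-then-graph idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the cell size, the edge definition, or the graph partitioning algorithm, so the function names (`build_grid`, `build_graph`, `bisect`), the 8-neighbour adjacency, and the greedy breadth-first bisection below are all assumptions chosen only to show how point counts per cell can drive a balanced split.

```python
from collections import Counter, deque


def build_grid(points, cell_size):
    """Count points per grid cell; a cell is keyed by its integer (col, row)."""
    return Counter((int(x // cell_size), int(y // cell_size)) for x, y in points)


def build_graph(cells):
    """Link every non-empty cell to its non-empty 8-neighbours (an assumption)."""
    graph = {cell: [] for cell in cells}
    for cx, cy in cells:
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                neighbour = (cx + dx, cy + dy)
                if neighbour != (cx, cy) and neighbour in cells:
                    graph[(cx, cy)].append(neighbour)
    return graph


def bisect(cells, graph):
    """Greedy stand-in for a graph partitioner: grow one region by BFS
    until it holds at least half of all points; the rest is the other part."""
    target = sum(cells.values()) / 2
    part_a, load = set(), 0
    seen, queue = set(), deque()
    while load < target:
        if not queue:
            # Start, or restart after exhausting a disconnected component.
            seed = next(c for c in sorted(cells) if c not in seen)
            seen.add(seed)
            queue.append(seed)
        cell = queue.popleft()
        part_a.add(cell)
        load += cells[cell]
        for neighbour in sorted(graph[cell]):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return part_a, set(cells) - part_a


if __name__ == "__main__":
    # Hypothetical sample data: two small clusters far apart.
    pts = [(0.5, 0.5), (0.6, 0.4), (1.5, 0.5), (5.5, 5.5), (5.6, 5.4), (6.5, 5.5)]
    grid = build_grid(pts, cell_size=1.0)
    a, b = bisect(grid, build_graph(grid))
    print(sorted(a), sorted(b))
```

In a MapReduce setting each resulting part would be handed to a separate worker running a local DBSCAN; a production partitioner would also weigh the edges it cuts, which this sketch ignores.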
Details
- ISBN :
- 978-3-319-18250-6
- ISBNs :
- 9783319182506
- Database :
- OpenAIRE
- Journal :
- Web and Wireless Geographical Information Systems ISBN: 9783319182506, W2GIS
- Accession number :
- edsair.doi...........78e33a27029acc56bc88bf9f3ccc0e8f