G2P: A Partitioning Approach for Processing DBSCAN with MapReduce

Authors :
Javam C. Machado
José Antônio Fernandes de Macêdo
Victor A. E. de Farias
Ticiana L. Coelho da Silva
Antonio Cavalcante Araujo Neto
Source :
Web and Wireless Geographical Information Systems ISBN: 9783319182506, W2GIS
Publication Year :
2015
Publisher :
Springer International Publishing, 2015.

Abstract

One of the most important aspects to consider when processing large data sets is how to distribute and parallelize the analysis algorithms. A distributed system performs well only if the workload is properly balanced, because the overall computing time is expected to be determined by the node on which processing takes longest. This paper proposes a data partitioning strategy that takes partition balance into account and is generic for spatial data. The proposed solution is based on a grid data structure that is transformed into a graph partitioning problem, from which the partitions are finally computed. We apply the approach to the distributed DBSCAN algorithm, which finds dense areas in a large data set, using MapReduce. We call our approach G2P (Grid and Graph Partitioning), and we show through extensive experiments that G2P produces high-quality data partitioning for the distributed DBSCAN algorithm compared with competing approaches. We believe that G2P is suitable not only for the DBSCAN algorithm but also for spatial join operations and distance-based range queries, to name a few.
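The grid-then-graph idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the `cell_size` parameter, the 8-neighbour adjacency, and the greedy breadth-first assignment (used here as a simple stand-in for a real graph partitioner) are all assumptions made for illustration.

```python
from collections import Counter, deque
from math import floor


def build_grid(points, cell_size):
    """Count points per grid cell; a cell is keyed by its integer (col, row)."""
    return Counter((floor(x / cell_size), floor(y / cell_size)) for x, y in points)


def build_graph(cells):
    """Link every non-empty cell to its non-empty 8-neighbours (assumed adjacency)."""
    graph = {cell: [] for cell in cells}
    for cx, cy in cells:
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                neighbour = (cx + dx, cy + dy)
                if neighbour != (cx, cy) and neighbour in cells:
                    graph[(cx, cy)].append(neighbour)
    return graph


def greedy_partition(cells, graph, k):
    """Assign cells to k partitions in BFS order, opening a new partition
    once the current one holds at least total/k points (hypothetical heuristic)."""
    target = sum(cells.values()) / k
    assignment = {}
    part, load = 0, 0
    for seed in sorted(cells):  # sorted for a deterministic result
        if seed in assignment:
            continue
        queue = deque([seed])
        while queue:
            cell = queue.popleft()
            if cell in assignment:
                continue
            if load >= target and part < k - 1:
                part, load = part + 1, 0
            assignment[cell] = part
            load += cells[cell]
            queue.extend(n for n in graph[cell] if n not in assignment)
    return assignment


if __name__ == "__main__":
    pts = [(0.1, 0.1), (0.2, 0.3), (1.1, 0.2), (1.5, 0.5),
           (5.2, 5.1), (5.4, 5.3), (6.1, 5.2), (6.3, 5.8)]
    cells = build_grid(pts, cell_size=1.0)
    print(greedy_partition(cells, build_graph(cells), k=2))
```

In a MapReduce setting, each resulting partition would presumably be handed to a separate worker that runs DBSCAN locally. A production system would likely replace the greedy step with a dedicated graph partitioner that also minimizes the weight of edges cut between partitions.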

Details

ISBN :
978-3-319-18250-6
ISBNs :
9783319182506
Database :
OpenAIRE
Accession number :
edsair.doi...........78e33a27029acc56bc88bf9f3ccc0e8f