1. OSMWatchman: Learning How to Detect Vandalized Contributions in OSM Using a Random Forest Classifier
- Author
Quy Thy Truong, Guillaume Touya, Cyril de Runz, Laboratoire des Sciences et Technologies de l'Information Géographique (LaSTIG), École nationale des sciences géographiques (ENSG), Institut National de l'Information Géographique et Forestière (IGN), Laboratoire Instrumentation, Simulation et Informatique Scientifique (IFSTTAR/COSYS/LISIS), Institut Français des Sciences et Technologies des Transports, de l'Aménagement et des Réseaux (IFSTTAR), Communauté Université Paris-Est, Centre de Recherche en Sciences et Technologies de l'Information et de la Communication - EA 3804 (CRESTIC), Université de Reims Champagne-Ardenne (URCA), Bases de données et traitement des langues naturelles (BDTLN), Laboratoire d'Informatique Fondamentale et Appliquée de Tours (LIFAT), Centre National de la Recherche Scientifique (CNRS), Université de Tours (UT), Institut National des Sciences Appliquées - Centre Val de Loire (INSA CVL), and Institut National des Sciences Appliquées (INSA)
- Subjects
vandalism, quality, volunteered geographic information, lcsh:G1-922, [INFO]Computer Science [cs], supervised machine learning, OpenStreetMap, VGI, lcsh:Geography (General), ComputingMilieux_MISCELLANEOUS, random forest
- Abstract
Although Volunteered Geographic Information (VGI) has the advantage of providing free and open spatial data, it is prone to vandalism, which may severely degrade the quality of these data. Detecting vandalism in VGI may therefore be a first step in assessing the data in order to improve their quality. This article explores the ability of supervised machine learning approaches to detect vandalism in OpenStreetMap (OSM) automatically. Because no OSM vandalism corpus is available so far, our work includes the construction of a corpus of vandalism data. We then investigate the ability of random forest methods to detect vandalism in the created corpus. Experimental results show that random forest classifiers perform well at detecting vandalism in the same geographical regions that were used for training the model, but have more difficulty detecting vandalism in “unfamiliar regions”.
- Published
- 2020