Predictive modelling benchmark of nitrate Vulnerable Zones at a regional scale based on Machine learning and remote sensing.

Authors :
Cardenas-Martinez, Aaron
Rodriguez-Galiano, Victor
Luque-Espinar, Juan Antonio
Mendes, Maria Paula
Source :
Journal of Hydrology. Dec2021:Part C, Vol. 603, pN.PAG-N.PAG. 1p.
Publication Year :
2021

Abstract

• Predictive modelling benchmark of Nitrate Vulnerable Zones using Random Forest.
• Extrinsic driving forces to groundwater were selected as environmental predictors.
• Phenological features derived from remote sensing were included as novel features.
• Feature selection methods revealed good performance predicting nitrate pollution.
• Phenology and manure production from livestock farms as most important features.

Nitrate leaching losses from arable lands into groundwater were a main driver in designating Nitrate Vulnerable Zones (NVZs) according to the Nitrates Directive, with a view to enhancing their water quality. Despite this, developing common strategies for effective water quality control in these areas remains a challenge in the European Union. This paper evaluates the performance of the Random Forest (RF) machine learning algorithm combined with Feature Selection (FS) techniques in predicting nitrate pollution in NVZs groundwater bodies in different periods and using updated environmental features in Andalusia, Spain.

A set of forty-four features extrinsic to groundwater bodies were used as environmental predictors, with an aim to make this methodology exportable to other regions. Phenological features obtained through remote-sensing techniques were included to measure the dynamics of agricultural activity. In addition, other dynamic features derived from weather and livestock effluents were included to analyse seasonal and interannual changes in nitrate pollution. Three feature stacks and two nitrate databases were used in the predictive modelling: Period 1 (2009), with 321 nitrate samples for training; Period 2 (2010), with 282 nitrate samples for validation and initial spatial prediction; and Period 3 (2017), to assess the changes in the probability of groundwater nitrate content exceeding 50 mg/L.

Random Forest as a wrapper with four sequential search methods was considered: sequential backward selection (SBS), sequential forward selection (SFS), sequential forward floating selection (SFFS) and sequential backward floating selection (SBFS). From among all the Feature Selection methods applied, Random Forest with SFS had the best performance (overall accuracy = 0.891 and six predictor features) and linked the highest probability of nitrate pollution with three dynamic features: the Normalized Difference Vegetation Index (NDVI) base level, NDVI value for the end of the growing season and accumulated manure production of livestock farms; and three static features: slope, sediment depositional areas and valley depth. [ABSTRACT FROM AUTHOR]
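
The sequential forward selection (SFS) search named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function `sequential_forward_selection`, the feature names and the `toy_score` function are hypothetical stand-ins. In the study, the score of a candidate subset would come from a Random Forest evaluated on the nitrate samples; here a hand-written scoring function takes its place so that the example runs with the Python standard library alone.

```python
from typing import Callable, List, Sequence, Tuple


def sequential_forward_selection(
    features: Sequence[str],
    score: Callable[[Tuple[str, ...]], float],
    max_features: int,
) -> Tuple[List[str], float]:
    """Greedy SFS: add, one at a time, the feature that most improves the score."""
    selected: List[str] = []
    remaining = list(features)
    best_score = float("-inf")
    while remaining and len(selected) < max_features:
        # Score every subset formed by adding one remaining feature.
        candidates = [(score(tuple(selected + [f])), f) for f in remaining]
        top_score, top_feature = max(candidates, key=lambda pair: pair[0])
        if top_score <= best_score:
            break  # no single addition improves the score, so stop
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected, best_score


# Hypothetical per-feature gains (assumption, not data from the paper).
TOY_GAIN = {
    "ndvi_base": 0.30,
    "manure": 0.25,
    "slope": 0.20,
    "noise_a": 0.01,
    "noise_b": 0.00,
}


def toy_score(subset: Tuple[str, ...]) -> float:
    """Stand-in for a wrapper score: summed gains minus a small size penalty."""
    return sum(TOY_GAIN[f] for f in subset) - 0.02 * len(subset)


demo_selected, demo_score = sequential_forward_selection(
    list(TOY_GAIN), toy_score, max_features=6
)
```

With these toy gains the search keeps `ndvi_base`, `manure` and `slope` and then stops, because adding either noise feature lowers the score. Backward selection (SBS) would run the same loop in reverse, starting from all features and removing one at a time; the floating variants (SFFS, SBFS) additionally allow a step in the opposite direction after each move.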

Details

Language :
English
ISSN :
00221694
Volume :
603
Database :
Academic Search Index
Journal :
Journal of Hydrology
Publication Type :
Academic Journal
Accession number :
154011354
Full Text :
https://doi.org/10.1016/j.jhydrol.2021.127092