Using land-use machine learning models to estimate daily NO2 concentration variations in Taiwan.

Authors :
Wong, Pei-Yi
Su, Huey-Jen
Lee, Hsiao-Yun
Chen, Yu-Cheng
Hsiao, Ya-Ping
Huang, Jen-Wei
Teo, Tee-Ann
Wu, Chih-Da
Spengler, John D.
Source :
Journal of Cleaner Production. Oct2021, Vol. 317, pN.PAG-N.PAG. 1p.
Publication Year :
2021

Abstract

It is likely that exposure surrogates from monitoring stations with various limitations are not sufficient for epidemiological studies covering large areas. Moreover, the spatiotemporal resolution of air pollution modelling approaches must be improved in order to achieve more accurate estimates. If not, the exposure assessments will not be applicable in future health risk assessments. To deal with this challenge, this study featured Land-Use Regression (LUR) models that use machine learning to assess the spatial-temporal variability of Nitrogen Dioxide (NO2). Daily average NO2 data was collected from 70 fixed air quality monitoring stations, belonging to the Taiwanese EPA, on the main island of Taiwan. Around 0.41 million observations from 2000 to 2016 were used for the analysis. Several datasets were employed to determine spatial predictor variables, including the EPA environmental resources dataset, the meteorological dataset, the land-use inventory, the landmark dataset, the digital road network map, the digital terrain model, MODIS Normalized Difference Vegetation Index database, and the power plant distribution dataset. Regarding analyses, conventional LUR and Hybrid Kriging-LUR were performed first to identify important predictor variables. A Deep Neural Network, Random Forest, and XGBoost algorithms were then used to fit the prediction model based on the variables selected by the LUR models. Lastly, data splitting, 10-fold cross validation, external data verification, and seasonal-based and county-based validation methods were applied to verify the robustness of the developed models. The results demonstrated that the proposed conventional LUR and Hybrid Kriging-LUR models captured 65% and 78%, respectively, of NO2 variation. When the XGBoost algorithm was further incorporated in LUR and hybrid-LUR, the explanatory power increased to 84% and 91%, respectively. The Hybrid Kriging-LUR with XGBoost algorithm outperformed all other integrated methods.
This study demonstrates the value of combining Hybrid Kriging-LUR model and an XGBoost algorithm to estimate the spatial-temporal variability of NO2 exposure. For practical application, the associations of specific land-use/land cover types selected in the final model can be applied in land-use management and in planning emission reduction strategies.

• Estimating long-term daily NO2 concentration with machine learning models.
• Land-use patterns were included in machine learning models by using land-use regression.
• The most contributed predictors were identified by stepwise variable selection.
• Explanatory power of daily NO2 concentration was increased from 0.65 to 0.91.
• XGBoost outperformed RF and DNN machine learning algorithms.

Capsule: The explanatory power of Hybrid Kriging-LUR coupled with XGBoost algorithm on daily NO2 variations reached 91% and outperformed all other integrated methods.

[ABSTRACT FROM AUTHOR]
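
The abstract reports explanatory power (R^2) checked by 10-fold cross validation. As a rough illustration of how such a figure can be computed, the sketch below pools the held-out predictions from ten folds and scores them with R^2. It is not the authors' pipeline: the one-variable least-squares fit stands in for the LUR and XGBoost models, the data are synthetic, and every name (fit_line, k_fold_r2, road_density) is hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (a stand-in for the real model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(y_true, y_pred):
    """Share of variance explained: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_r2(xs, ys, k=10, seed=0):
    """Fit on k-1 folds, predict the held-out fold, score pooled predictions."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k disjoint index groups
    y_true, y_pred = [], []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            y_true.append(ys[i])
            y_pred.append(a + b * xs[i])
    return r_squared(y_true, y_pred)


# Synthetic example: one made-up predictor and a noisy linear response.
rng = random.Random(42)
road_density = [rng.uniform(0, 10) for _ in range(500)]
no2 = [5.0 + 2.0 * x + rng.gauss(0, 3.0) for x in road_density]

cv_r2 = k_fold_r2(road_density, no2, k=10)
print(f"10-fold cross-validated R^2: {cv_r2:.2f}")
```

Pooling held-out predictions before scoring is one common convention; averaging the per-fold R^2 values is another, and the record does not say which the study used.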

Details

Language :
English
ISSN :
0959-6526
Volume :
317
Database :
Academic Search Index
Journal :
Journal of Cleaner Production
Publication Type :
Academic Journal
Accession number :
152292885
Full Text :
https://doi.org/10.1016/j.jclepro.2021.128411