High Performance Machine Learning Models of Large Scale Air Pollution Data in Urban Area

Authors :
Gocheva-Ilieva Snezhana G.
Ivanov Atanas V.
Livieris Ioannis E.
Source :
Cybernetics and Information Technologies, Vol 20, Iss 6, Pp 49-60 (2020)
Publication Year :
2020
Publisher :
Sciendo, 2020.

Abstract

Preserving air quality in urban areas is crucial for the health of the population and for the environment. The availability of large volumes of measurement data on air pollutant concentrations makes it possible to analyse and model them, to establish trends and dependencies, and to forecast and prevent future pollution. This study proposes a new approach for modelling air pollutant data that combines the machine learning method Random Forest (RF) with the Auto-Regressive Integrated Moving Average (ARIMA) methodology. First, an RF model of the pollutant is built and analysed in relation to the meteorological variables. This model is then corrected by modelling its residuals with a univariate ARIMA model. The approach is demonstrated on hourly data for seven air pollutants (O₃, NOₓ, NO, NO₂, CO, SO₂, PM₁₀) in the town of Dimitrovgrad, Bulgaria, covering 9 years and 3 months. Six meteorological and three time variables are used as predictors. The resulting high-performance models explain the data with R² = 90%-98%.

Details

Language :
English
ISSN :
1314-4081
Volume :
20
Issue :
6
Database :
Directory of Open Access Journals
Journal :
Cybernetics and Information Technologies
Publication Type :
Academic Journal
Accession number :
edsdoj.68cda8a927e1472aaa1a47e2b3102b4d
Document Type :
article
Full Text :
https://doi.org/10.2478/cait-2020-0060