High Performance Machine Learning Models of Large Scale Air Pollution Data in Urban Area
- Source :
- Cybernetics and Information Technologies, Vol 20, Iss 6, Pp 49-60 (2020)
- Publication Year :
- 2020
- Publisher :
- Sciendo, 2020.
Abstract
- Preserving air quality in urban areas is crucial for the health of the population as well as for the environment. The availability of large volumes of measurement data on air pollutant concentrations enables analysis and modelling that establish trends and dependencies, in order to forecast and prevent future pollution. This study proposes a new approach for modelling air pollutant data that combines the machine learning method Random Forest (RF) with the Auto-Regressive Integrated Moving Average (ARIMA) methodology. First, an RF model of the pollutant is built and analysed in relation to the meteorological variables. This model is then corrected by modelling its residuals with a univariate ARIMA model. The approach is demonstrated on hourly data for seven air pollutants (O3, NOx, NO, NO2, CO, SO2, PM10) in the town of Dimitrovgrad, Bulgaria, over 9 years and 3 months. Six meteorological and three time variables are used as predictors. The resulting high-performance models explain the data with R² = 90%-98%.
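The two-stage idea in the abstract (fit a Random Forest, then model its residuals with a univariate time-series model and add that correction back) can be illustrated with a small, self-contained sketch. Everything here is an assumption made for illustration and not the authors' implementation: the data are synthetic, the predictor names (`temp`, `wind`, `hour`) and all function names are hypothetical, the forest is a simplified bagged ensemble of shallow trees with one random feature per split, and the residual model is only an AR(1) term without intercept, i.e. the simplest ARIMA(1,0,0) case, fitted by least squares.

```python
import math
import random

def build_tree(X, y, idx, depth, rng):
    """Grow a small regression tree; each split uses one randomly chosen feature."""
    mean = sum(y[i] for i in idx) / len(idx)
    if depth == 0 or len(idx) < 4:
        return mean                      # leaf: predict the mean target
    f = rng.randrange(len(X[0]))
    idx = sorted(idx, key=lambda i: X[i][f])
    total, n = sum(y[i] for i in idx), len(idx)
    best_gain, best_k, left = -math.inf, None, 0.0
    for k in range(1, n):
        left += y[idx[k - 1]]
        if X[idx[k - 1]][f] == X[idx[k]][f]:
            continue                     # cannot split between equal values
        # Maximising this quantity minimises the squared error of the split.
        gain = left ** 2 / k + (total - left) ** 2 / (n - k)
        if gain > best_gain:
            best_gain, best_k = gain, k
    if best_k is None:
        return mean
    thr = (X[idx[best_k - 1]][f] + X[idx[best_k]][f]) / 2
    return (f, thr,
            build_tree(X, y, idx[:best_k], depth - 1, rng),
            build_tree(X, y, idx[best_k:], depth - 1, rng))

def predict_tree(node, x):
    """Walk from the root to a leaf and return the leaf mean."""
    while isinstance(node, tuple):
        f, thr, lo, hi = node
        node = lo if x[f] <= thr else hi
    return node

def fit_forest(X, y, n_trees, depth, rng):
    """Bagging: every tree is grown on its own bootstrap sample."""
    n = len(y)
    return [build_tree(X, y, [rng.randrange(n) for _ in range(n)], depth, rng)
            for _ in range(n_trees)]

def predict_forest(trees, x):
    return sum(predict_tree(t, x) for t in trees) / len(trees)

# Synthetic hourly series (made up): two "meteorological" predictors, one
# time predictor, and autocorrelated noise that the forest cannot explain.
rng = random.Random(0)
n = 300
X, y, noise = [], [], 0.0
for t in range(n):
    hour = t % 24
    temp = 10 + 8 * math.sin(2 * math.pi * hour / 24) + rng.gauss(0, 1)
    wind = rng.uniform(0, 5)
    noise = 0.8 * noise + rng.gauss(0, 1)
    X.append([temp, wind, float(hour)])
    y.append(40 + 1.5 * temp - 3 * wind + noise)

# Stage 1: forest on the predictors, then its in-sample residuals.
trees = fit_forest(X, y, n_trees=50, depth=4, rng=rng)
rf_pred = [predict_forest(trees, x) for x in X]
resid = [yi - pi for yi, pi in zip(y, rf_pred)]

# Stage 2: AR(1) on the residuals, coefficient estimated by least squares.
phi = (sum(resid[t] * resid[t - 1] for t in range(1, n))
       / sum(resid[t - 1] ** 2 for t in range(1, n)))

# Corrected prediction: forest output plus the predicted residual.
hybrid_pred = [rf_pred[0]] + [rf_pred[t] + phi * resid[t - 1] for t in range(1, n)]

sse_rf = sum(resid[t] ** 2 for t in range(1, n))
sse_hybrid = sum((y[t] - hybrid_pred[t]) ** 2 for t in range(1, n))
print(f"phi={phi:.3f}  SSE forest only={sse_rf:.1f}  SSE with correction={sse_hybrid:.1f}")
```

Because the AR(1) coefficient is chosen by least squares, the corrected in-sample squared error cannot exceed that of the forest alone; how large the gain is depends on how much autocorrelation remains in the residuals. A realistic analysis would more likely rely on established tools, for example scikit-learn's `RandomForestRegressor` and the `ARIMA` class in statsmodels, and would evaluate on held-out data.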
Details
- Language :
- English
- ISSN :
- 1314-4081
- Volume :
- 20
- Issue :
- 6
- Database :
- Directory of Open Access Journals
- Journal :
- Cybernetics and Information Technologies
- Publication Type :
- Academic Journal
- Accession number :
- edsdoj.68cda8a927e1472aaa1a47e2b3102b4d
- Document Type :
- article
- Full Text :
- https://doi.org/10.2478/cait-2020-0060