
K-Means and C4.5 Decision Tree Based Prediction of Long-Term Precipitation Variability in the Poyang Lake Basin, China

Authors :
Dan Lou
Mengxi Yang
Dawei Shi
Guojie Wang
Waheed Ullah
Yuanfang Chai
Yutian Chen
Source :
Atmosphere, Vol 12, Iss 7, p 834 (2021)
Publication Year :
2021
Publisher :
MDPI AG, 2021.

Abstract

Applying machine learning algorithms alongside Earth System Models in the atmospheric sciences has the potential to improve prediction, forecasting, and the reconstruction of missing data. In the current study, a combination of two machine learning techniques, K-Means and the C4.5 decision tree, is used to separate observed precipitation into clusters and to classify the associated large-scale circulation indices. Observed precipitation from the Chinese Meteorological Agency (CMA) during 1961–2016 for 83 stations in the Poyang Lake basin (PLB) is used. The K-Means results show two precipitation clusters that split the PLB into a northern and a southern cluster, with a silhouette coefficient of ~0.5. The leading cluster (C1) contains 48 stations, or 58% of the stations in the region, while Cluster 2 (C2) contains 35, or 42%. The interannual variability in precipitation exhibited significant differences between the two clusters. The C4.5 decision tree is employed to explore which large-scale atmospheric indices from the National Climate Center (NCC), taken from the preceding spring season as predictors, are associated with each cluster. The C1 precipitation was linked with the location and intensity of the subtropical ridgeline position over Northern Africa, whereas the C2 precipitation was suggested to be associated with the Atlantic-European Polar Vortex Area Index. The precipitation anomalies further validated the results of both algorithms. The findings are in accordance with previous studies conducted globally and hence support the application of machine learning techniques in atmospheric science on a sub-regional and sub-seasonal scale. Future studies should explore the dynamics of the K-Means- and C4.5-derived indicators for a better assessment on a regional scale. This research based on machine learning methods may offer a new approach to climate forecasting.
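
The clustering step described in the abstract can be illustrated with a minimal sketch. This is not the authors' code or data: the station features below are synthetic and hypothetical (two made-up numeric features per station), the function names kmeans and silhouette are introduced here for illustration, and only the group sizes (48 and 35 of 83 stations) are taken from the abstract. The sketch runs plain K-Means with two clusters and then computes the mean silhouette coefficient, the quality measure the abstract reports as ~0.5 for the real data.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Recompute each centroid as the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if not members:  # keep the old centroid if a cluster empties
                new_centroids.append(centroids[c])
                continue
            new_centroids.append(
                tuple(sum(axis) / len(members) for axis in zip(*members))
            )
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return labels, centroids


def silhouette(points, labels):
    """Mean silhouette coefficient over all points, in [-1, 1]."""
    clusters = sorted(set(labels))
    scores = []
    for i, p in enumerate(points):
        mean_dist = {}
        for c in clusters:
            others = [q for j, q in enumerate(points)
                      if labels[j] == c and j != i]
            if others:
                mean_dist[c] = sum(math.dist(p, q) for q in others) / len(others)
        own = labels[i]
        if own not in mean_dist:  # singleton cluster: score defined as 0
            scores.append(0.0)
            continue
        a = mean_dist[own]                                       # cohesion
        b = min(d for c, d in mean_dist.items() if c != own)     # separation
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Synthetic, hypothetical station features (NOT the CMA observations):
# 48 "northern" and 35 "southern" stations, matching only the counts
# reported in the abstract.
rng = random.Random(42)
north = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(48)]
south = [(rng.gauss(6.0, 1.0), rng.gauss(6.0, 1.0)) for _ in range(35)]
stations = north + south

labels, centroids = kmeans(stations, k=2)
score = silhouette(stations, labels)
sizes = sorted(labels.count(c) for c in set(labels))
print("cluster sizes:", sizes)
print("share of stations:", [round(100 * n / len(stations)) for n in sizes])
print("mean silhouette:", round(score, 2))
```

Because the synthetic groups are far better separated than real precipitation regimes, the silhouette value here will be higher than the ~0.5 reported in the abstract. The classification step is not sketched; C4.5 selects splits by gain ratio, which differs from the CART trees offered by common libraries.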

Details

Language :
English
ISSN :
2073-4433
Volume :
12
Issue :
7
Database :
Directory of Open Access Journals
Journal :
Atmosphere
Publication Type :
Academic Journal
Accession number :
edsdoj.bc51ab0f6cd54d089afb27fa1538a9c7
Document Type :
article
Full Text :
https://doi.org/10.3390/atmos12070834