Spatial bagging to integrate spatial correlation into ensemble machine learning.

Authors :
Özbayrak, Fehmi
Foster, John T.
Pyrcz, Michael J.
Source :
Computers & Geosciences. Apr2024, Vol. 186, pN.PAG-N.PAG. 1p.
Publication Year :
2024

Abstract

We propose a novel spatial bagging workflow for predictive ensemble machine learning that improves on standard bagging models. Our proposed method integrates the spatial bootstrap for bagging with the effective sample size, n_eff, to account for the spatial context of the dataset. We benchmark the improved performance over standard machine learning bagging models with a large number of two-dimensional synthetic datasets with varying degrees of Gaussian noise. For noise-free datasets, both methods demonstrate equivalent accuracy; however, spatial bagging achieves this with a significantly smaller sample size, showcasing its improved efficiency. As data noise increases, spatial bagging consistently outperforms standard bagging, displaying an improved Mean Squared Error (MSE) and robustness against overfitting. Our proposed spatial bagging method computes the optimal effective sample size for spatial data, reducing model overfitting. Furthermore, our proposed method requires only the additional step of variogram calculation and modeling, and can be implemented with any predictive machine learning bagging model with minimal code modification, i.e., specification of the number of bootstrap samples as the number of effective data. We recommend using spatial bagging for improved predictions for any spatial data setting across diverse scientific fields, e.g., atmospheric, agricultural, subsurface resources, etc.

• Ensemble learning methods employ standard bagging, which is prone to overfitting.
• We introduce a novel spatial bagging technique utilizing effective sample size.
• Spatial bagging mitigates overfitting, enhancing robustness against noise.
• Spatial bagging provides a computational advantage by using smaller data.

[ABSTRACT FROM AUTHOR]
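
The "minimal code modification" described in the abstract can be illustrated with a short Python sketch. It is not the authors' implementation. It assumes scikit-learn's BaggingRegressor, reads "number of bootstrap samples" as the max_samples argument (the number of data drawn for each bootstrap), and assumes an exponential correlation model with a hypothetical practical range of 30 distance units in place of a fitted variogram. The effective sample size is approximated here with the standard formula for the mean of correlated data, n_eff = n^2 / sum_ij(rho_ij), which may differ from the estimator used in the paper. The helper name effective_sample_size and the synthetic data are also illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor


def effective_sample_size(coords, corr_range):
    """Approximate n_eff = n^2 / sum_ij(rho_ij) for the sample mean.

    Assumption: exponential correlation rho(h) = exp(-3 h / corr_range),
    a stand-in for a variogram model fitted to the data.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))   # pairwise distances
    corr = np.exp(-3.0 * dist / corr_range)    # pairwise correlations
    n = coords.shape[0]
    return n ** 2 / corr.sum()


# Synthetic 2D spatial data (illustrative only, not from the paper).
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 100.0, size=(200, 2))
y = (np.sin(coords[:, 0] / 15.0) + np.cos(coords[:, 1] / 20.0)
     + rng.normal(0.0, 0.3, size=200))

# Hypothetical correlation range; in practice it comes from variogram modeling.
n_eff = effective_sample_size(coords, corr_range=30.0)
n_boot = max(2, int(round(n_eff)))

# Standard bagging draws n samples per bootstrap; here each draws n_boot.
model = BaggingRegressor(
    DecisionTreeRegressor(),
    n_estimators=50,
    max_samples=n_boot,
    bootstrap=True,
    random_state=0,
)
model.fit(coords, y)
preds = model.predict(coords)
```

Because every pairwise correlation lies between 0 and 1 and the diagonal equals 1, this approximation always gives a value between 1 and n, so n_boot is a valid integer for max_samples. A longer correlation range lowers n_eff and shrinks each bootstrap sample.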

Details

Language :
English
ISSN :
00983004
Volume :
186
Database :
Academic Search Index
Journal :
Computers & Geosciences
Publication Type :
Academic Journal
Accession number :
176297035
Full Text :
https://doi.org/10.1016/j.cageo.2024.105558