Overcoming set imbalance in data driven parameterization: A case study of gravity wave momentum transport

Authors :
Yang, L. Minah
Gerber, Edwin P.
Publication Year :
2024

Abstract

Machine learning for the parameterization of subgrid-scale processes in climate models has been widely researched and adopted in a few models. A key challenge in developing data-driven parameterization schemes is how to properly represent rare but important events that occur in geoscience datasets. We investigate and develop strategies to reduce errors caused by insufficient sampling in the rare data regime, under the constraints of no new data and no further expansion of model complexity. Resampling and importance weighting strategies are constructed with user-defined parameters that systematically vary the sampling/weighting rates in a linear fashion and limit excessive oversampling. Applying this new method to a case study of gravity wave momentum transport reveals that the resampling strategy can successfully reduce errors in the rare regime at little to no loss in overall accuracy on the dataset. The success of the strategy, however, depends on the complexity of the model. More complex models can overfit the tails of the distribution when using non-optimal parameters of the resampling strategy.

Comment: 26 pages, 10 figures, 2 tables
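The abstract does not give the formulas behind the resampling and importance weighting strategies, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the targets are binned by value, that a hypothetical parameter `t` blends linearly between no rebalancing (`t = 0`) and full inverse-frequency rebalancing (`t = 1`), and that a hypothetical `max_ratio` caps how strongly a rare sample can be favored over a common one. The function name, parameters, and binning scheme are all illustrative assumptions.

```python
import numpy as np


def rebalancing_weights(y, n_bins=10, t=0.5, max_ratio=10.0):
    """Per-sample weights that favor sparsely populated bins of y.

    Illustrative only: `t` and `max_ratio` are hypothetical parameters,
    not the ones defined in the paper.
    """
    y = np.asarray(y, dtype=float)
    edges = np.histogram_bin_edges(y, bins=n_bins)
    # Bin index of each sample; interior edges give indices 0..n_bins-1.
    idx = np.digitize(y, edges[1:-1])
    counts = np.bincount(idx, minlength=n_bins)
    # Inverse-frequency ratio relative to the most populated bin (>= 1).
    inv = counts.max() / counts[idx]
    # Linear blend: t=0 leaves the data as-is, t=1 fully equalizes bins.
    ratio = (1.0 - t) + t * inv
    # Cap the ratio so rare samples are not oversampled without limit.
    ratio = np.minimum(ratio, max_ratio)
    # Normalize so the weights form a probability distribution.
    return ratio / ratio.sum()


def resample_indices(y, rng, **kwargs):
    """Draw a bootstrap sample of indices using the rebalancing weights."""
    p = rebalancing_weights(y, **kwargs)
    return rng.choice(len(p), size=len(p), replace=True, p=p)
```

The same weights could instead be applied as per-sample loss weights (importance weighting) rather than used to draw samples. In this sketch, both `t` and `max_ratio` would need tuning for each model, in line with the abstract's caution that more complex models can overfit the tails when the parameters are not well chosen.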

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2402.18030
Document Type :
Working Paper