Data‐Driven Equation Discovery of a Cloud Cover Parameterization.

Authors :
Grundner, Arthur
Beucler, Tom
Gentine, Pierre
Eyring, Veronika
Source :
Journal of Advances in Modeling Earth Systems; Mar2024, Vol. 16 Issue 3, p1-26, 26p
Publication Year :
2024

Abstract

A promising method for improving the representation of clouds in climate models, and hence climate projections, is to develop machine learning‐based parameterizations using output from global storm‐resolving models. While neural networks (NNs) can achieve state‐of‐the‐art performance within their training distribution, they can make unreliable predictions outside of it. Additionally, they often require post‐hoc tools for interpretation. To avoid these limitations, we combine symbolic regression, sequential feature selection, and physical constraints in a hierarchical modeling framework. This framework allows us to discover new equations diagnosing cloud cover from coarse‐grained variables of global storm‐resolving model simulations. These analytical equations are interpretable by construction and easily transferable to other grids or climate models. Our best equation balances performance and complexity, achieving a performance comparable to that of NNs (R² = 0.94) while remaining simple (with only 11 trainable parameters). It reproduces cloud cover distributions more accurately than the Xu‐Randall scheme across all cloud regimes (Hellinger distances < 0.09), and matches NNs in condensate‐rich regimes. When applied and fine‐tuned to the ERA5 reanalysis, the equation exhibits superior transferability to new data compared to all other optimal cloud cover schemes. Our findings demonstrate the effectiveness of symbolic regression in discovering interpretable, physically‐consistent, and nonlinear equations to parameterize cloud cover.

Plain Language Summary: In climate models, cloud cover is usually expressed as a function of coarse, pixelated variables. Traditionally, this functional relationship is derived from physical assumptions. In contrast, machine learning (ML) approaches, such as neural networks, sacrifice interpretability for performance. In our approach, we use high‐resolution climate model output to learn a hierarchy of cloud cover schemes from data. To bridge the gap between simple statistical methods and ML algorithms, we employ a symbolic regression method. Unlike classical regression, which requires providing a set of basis functions from which the equation is composed, symbolic regression only requires mathematical operators (such as +, ×) that it learns to combine. By using a genetic algorithm, inspired by the process of natural selection, we discover an interpretable, nonlinear equation for cloud cover. This equation is simple, performs well, satisfies physical principles, and outperforms other algorithms when applied to new observationally‐informed data.

Key Points:
We systematically derive and evaluate cloud cover parameterizations of various complexity from global storm‐resolving simulation output.
Using symbolic regression combined with physical constraints, we find a new interpretable equation balancing performance and simplicity.
Our data‐driven cloud cover equation can be retuned with few samples, facilitating transfer learning to generalize to other realistic data.

[ABSTRACT FROM AUTHOR]
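
The abstract compares cloud cover distributions using Hellinger distances but does not spell out the metric. As a rough illustration only, the sketch below uses the standard discrete definition, H(P, Q) = sqrt(0.5 * sum_i (sqrt(p_i) - sqrt(q_i))^2), which ranges from 0 (identical distributions) to 1 (no overlap). The function name, the ten-bin histogram, and the synthetic samples are assumptions made for this example; the binning and data used in the paper are not stated in this record.

```python
import numpy as np


def hellinger_distance(p, q):
    """Hellinger distance between two discrete distributions.

    p and q may be raw histogram counts; both are normalized to sum to 1.
    Returns a value in [0, 1].
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if p.shape != q.shape:
        raise ValueError("p and q must have the same shape")
    if np.any(p < 0) or np.any(q < 0):
        raise ValueError("histogram entries must be non-negative")
    if p.sum() == 0 or q.sum() == 0:
        raise ValueError("each histogram must contain at least one count")
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))


if __name__ == "__main__":
    # Synthetic stand-ins for "reference" and "predicted" cloud cover
    # fractions in [0, 1]; these are NOT data from the paper.
    rng = np.random.default_rng(0)
    reference = rng.beta(0.5, 0.5, size=10_000)
    predicted = np.clip(reference + rng.normal(0.0, 0.05, size=10_000), 0.0, 1.0)

    # Both samples must be binned on the same edges before comparison.
    edges = np.linspace(0.0, 1.0, 11)  # hypothetical choice of 10 bins
    ref_hist, _ = np.histogram(reference, bins=edges)
    pred_hist, _ = np.histogram(predicted, bins=edges)

    print(hellinger_distance(ref_hist, pred_hist))
```

Because the distance depends on how the samples are binned, a value computed this way is only comparable to the reported threshold if the same bins are used.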

Details

Language :
English
ISSN :
19422466
Volume :
16
Issue :
3
Database :
Complementary Index
Journal :
Journal of Advances in Modeling Earth Systems
Publication Type :
Academic Journal
Accession number :
176274942
Full Text :
https://doi.org/10.1029/2023MS003763