Paper Reference:
Uncertainty quantification for data-driven weather models
Christopher Bülte, Nina Horat, Julian Quinting, Sebastian Lerch
Artificial Intelligence for the Earth Systems (2025).
This repository provides code for producing ensemble forecasts with the Pangu-Weather model using different initial condition methods, as described in the paper. Two post-processing approaches are also implemented, and code for evaluating the different approaches is included.
If you have questions regarding this work or want to collaborate, feel free to reach out.
Modern weather forecasting relies heavily on physics-based numerical weather prediction (NWP) models that simulate atmospheric processes using partial differential equations. These models are typically run as ensembles with varying initial conditions and perturbed physics to provide probabilistic forecasts. However, the high computational cost limits both resolution and ensemble size.
Recently, data-driven machine learning (ML) models have shown great promise as an alternative, offering faster, lower-cost forecasts without requiring explicit physical equations. Leading models such as FourCastNet, Pangu-Weather, and GraphCast now rival or surpass traditional NWP systems in deterministic forecast skill.
Despite these advances, most ML-based weather models remain deterministic, lacking the ability to quantify uncertainty—a critical component for risk-aware decision-making. This work explores practical approaches to generate probabilistic forecasts from such deterministic, data-driven models.
We investigate two main categories of uncertainty quantification (UQ) methods:
- Initial Condition-based (IC-UQ): Run multiple simulations with perturbed initial states.
- Post-hoc (PH-UQ): Add uncertainty using statistical or ML techniques applied after the forecast.
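To make the IC-UQ idea concrete, here is a minimal sketch (not the repository's actual implementation) of building an ensemble by adding Gaussian noise to a deterministic initial state; the function name, the `amplitude` scaling, and the grid shapes are illustrative assumptions:

```python
import numpy as np

def gaussian_ic_ensemble(field, n_members, amplitude=0.01, seed=0):
    """Create an ensemble of perturbed initial conditions.

    field      : 2-D array (lat, lon) holding the deterministic analysis.
    n_members  : number of ensemble members to generate.
    amplitude  : noise standard deviation as a fraction of the field's
                 own standard deviation (illustrative choice).
    """
    rng = np.random.default_rng(seed)
    sigma = amplitude * field.std()
    noise = rng.normal(0.0, sigma, size=(n_members,) + field.shape)
    return field[None, ...] + noise

# Example: a 10-member ensemble of a synthetic 2 m temperature field
t2m = 270.0 + 30.0 * np.random.default_rng(1).random((32, 64))
ens = gaussian_ic_ensemble(t2m, n_members=10)
print(ens.shape)  # (10, 32, 64)
```

Each perturbed member would then be passed through the deterministic model to produce one ensemble forecast trajectory.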
These approaches are evaluated using the Pangu-Weather model over a 5-year period in Europe, with ECMWF’s operational ensemble serving as a benchmark. The following figure shows a schematic illustration of the different uncertainty quantification approaches to generate probabilistic forecasts from deterministic data-driven weather models.
We utilize the following initial condition-based approaches:
- Gaussian noise perturbations
- IFS initial conditions
- Random field perturbations

We utilize the following post-hoc approaches:
- EasyUQ
- Distributional regression networks (DRNs)
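A DRN maps forecast features to the parameters of a predictive distribution (typically Gaussian) and is trained by minimizing the CRPS, which has a closed form for a Gaussian. The sketch below shows only that loss, not the repository's DRN code; the function name is an assumption:

```python
import math

def crps_gaussian(mu, sigma, y):
    """Closed-form CRPS of a Gaussian N(mu, sigma^2) at observation y.

    CRPS = sigma * ( z*(2*Phi(z) - 1) + 2*phi(z) - 1/sqrt(pi) ),
    where z = (y - mu) / sigma, phi/Phi are the standard normal
    pdf and cdf.
    """
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))

print(round(crps_gaussian(0.0, 1.0, 0.0), 4))  # 0.2337
```

In a DRN, `mu` and `sigma` would come from a small neural network and this expression would be averaged over training samples as the loss.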
We analyze the following meteorological variables: T2M, T850, U10, V10, Z500 over an unseen evaluation period and the European domain. The following figure shows the mean continuous ranked probability score (CRPS) as a function of forecast lead time for the different UQ methods, aggregated over all locations.
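For reference, the CRPS of an ensemble forecast is commonly estimated with the kernel (energy-score) form; a minimal sketch, independent of the repository's evaluation code:

```python
import numpy as np

def crps_ensemble(members, obs):
    """Empirical CRPS of an ensemble forecast at a single observation.

    Uses the kernel form  CRPS = E|X - y| - 0.5 * E|X - X'|,
    with expectations over ensemble members X, X'.
    """
    members = np.asarray(members, dtype=float)
    term1 = np.abs(members - obs).mean()
    term2 = np.abs(members[:, None] - members[None, :]).mean()
    return term1 - 0.5 * term2

print(crps_ensemble([0.0, 1.0], 0.0))  # 0.25
```

Lower CRPS is better; it reduces to the mean absolute error for a single-member (deterministic) forecast.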
For more details check out the paper.
The code is described in the corresponding folders:
| Folder | Description |
|---|---|
| DRN | Implementation of the DRN model. |
| EasyUQ | Implementation of the EasyUQ model. |
| evaluation | Evaluation of the different forecasts. |
| Pangu-Weather | Creating perturbed forecasts with the Pangu-Weather model. |
| plots | Generating the results plots. |
| utils | Utility functions. |
| wb2 | Code for using the WeatherBench2 dataset. |
We provide a ready-to-use conda environment for running the Pangu-Weather model. To use it, create the environment with `conda env create -f Pangu-Weather/Pangu_GPU.yml`. Further instructions can be found in the conda documentation.
For installing and accessing WeatherBench2, please check out the corresponding repository.
If you find this repository helpful in your work, please consider citing:
@article{buelte24,
title = {Uncertainty quantification for data-driven weather models},
author = {Bülte, Christopher and Horat, Nina and Quinting, Julian and Lerch, Sebastian},
journal = {Artificial Intelligence for the Earth Systems},
year = {2025},
doi = {10.1175/AIES-D-24-0049.1},
url = {https://journals.ametsoc.org/view/journals/aies/aop/AIES-D-24-0049.1/AIES-D-24-0049.1.xml},
}