GNN Post-Processing for Precipitation Forecasting

Paper Reference:
GRAPH NEURAL NETWORKS FOR ENHANCING ENSEMBLE FORECASTS OF EXTREME RAINFALL
Christopher Bülte, Sohir Maskey, Philipp Scholl, Jonas von Berg, Gitta Kutyniok
Published as a workshop paper at “Tackling Climate Change with Machine Learning”, ICLR 2025

This repository implements a graph-based post-processing framework for ensemble precipitation forecasts. The approach leverages graph neural networks to explicitly model spatial dependencies and to improve the prediction of extreme rainfall events.
This code started as a fork from gnn-post-processing.

Introduction

In our workshop paper “Graph Neural Networks for Enhancing Ensemble Forecasts of Extreme Rainfall,” we present a novel graph neural network (GNN) approach to post-process ensemble forecasts of precipitation. Our focus is on capturing both:

Zero-precipitation events (via a discrete point mass), and
Heavy-tail extremes (via a generalized Pareto distribution).

This addresses challenges arising from climate-change-driven extreme rain events, which standard ensemble forecasts (e.g., from ECMWF) often fail to capture. By building station-based graphs and learning spatial dependencies, we demonstrate improved calibration and skill for heavy rainfall across multiple lead times.

Key Features:

Mixture distribution combining a discrete mass at zero with a tail-modeled GPD.
DeepSets to handle ensemble permutations.
GINE (Graph Isomorphism Network with Edge features) to embed station-level data and distances.
CRPS training for a fully probabilistic forecast distribution.

If you're interested in robust, accurate precipitation forecasting with neural networks – especially in the context of extreme events – this code can serve as a reference or baseline for further research.

We show that this method better captures rare, high-rainfall events than standard ensemble post-processing methods – a crucial improvement under climate-change-driven extremes.

This figure highlights how our GNN approach assigns higher probabilities to heavy-rain scenarios compared to conventional ensemble forecasts.

Installation

Clone this repository:

git clone https://github.com/username/gnn-postprocessing.git
cd gnn-postprocessing

Install conda.

Create environment.

conda env create -f environment.yml
conda activate gnn-env

Environment Setup

The environment.yml file defines a conda environment named gnn-env with:

- Python 3.9

- PyTorch 2.0.1 (compatible with CUDA 11.8)

- PyTorch Geometric (2.3.1+)

- Other libraries: NumPy, pandas, scikit-learn, geopy, etc.

Install and activate:

conda env create -f environment.yml
conda activate gnn-env

Single-Run Training

You can do a single run (for example, 24-hour lead time, mixed config) by specifying:

python train.py \
--leadtime 24h \
--dir trained_models/24h_mixed_u \
--run_id 0

What happens:

train.py looks for params.json in trained_models/24h_mixed_u/ (make sure it exists).
Creates logs in trained_models/24h_mixed_u/train_0.log.
Saves the best checkpoint to trained_models/24h_mixed_u/models/run_0-best.ckpt.

Single-Run Evaluation

Similarly, evaluate with:

python eval.py \
--leadtime 24h \
--folder trained_models/24h_mixed_u \
--data f

What happens:

eval.py looks for params.json in trained_models/24h_mixed_u/.
Finds .ckpt files in trained_models/24h_mixed_u/models/.
Averages predictions across them (if multiple exist).
Logs CRPS & saves eval_f.log, plus a CSV in trained_models/24h_mixed_u/f_results.csv (or f_results.txt summary).

Running All Experiments via Bash

You can execute multiple runs incorporating

leadtimes: 24h, 72h, 120h

configs: normal, normal_mixed, mixed, mixed

For this run:

chmod +x scripts/run_train.sh
./scripts/run_train.sh

To evaluate:

chmod +x scripts/run_eval.sh
./scripts/run_eval.sh

Where to Find Results

Each training subdirectory, e.g. trained_models/24h_mixed_u/, will contain:

params.json: The hyperparameters.

train_<run_id>.log: The training log.

models/: The best checkpoint file(s).

eval_<data>.log: The evaluation log.

<data>_results.csv: CSV with final predictions.

<data>.txt: CRPS summary.

License

If you find this repository helpful in your work, please consider citing:

@inproceedings{buelte2025gnn,
title={Graph Neural Networks for Enhancing Ensemble Forecasts of Extreme Rainfall},
author={B{\"u}lte, Christopher and Maskey, Sohir and Scholl, Philipp and von Berg, Jonas and Kutyniok, Gitta},
booktitle={Tackling Climate Change with Machine Learning (ICLR Workshop)},
year={2025}
}

Happy Forecasting!

Name		Name	Last commit message	Last commit date
Latest commit History 20 Commits
imgs		imgs
models		models
scripts		scripts
trained_models		trained_models
utils		utils
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
environment.yml		environment.yml
eval.py		eval.py
train.py		train.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

GNN Post-Processing for Precipitation Forecasting

Table of Contents

Introduction

Installation

Environment Setup

Single-Run Training

Single-Run Evaluation

Running All Experiments via Bash

Where to Find Results

License

About

Uh oh!

Releases

Packages

Contributors 2

Uh oh!

Languages

License

SohirMaskey/raincast-gnn

Folders and files

Latest commit

History

Repository files navigation

GNN Post-Processing for Precipitation Forecasting

Table of Contents

Introduction

Installation

Environment Setup

Single-Run Training

Single-Run Evaluation

Running All Experiments via Bash

Where to Find Results

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Uh oh!

Languages

Packages