This repo contains code and data for the paper draft "Physically Interpretable Emulation of a Moist Convecting Atmosphere with a Recurrent Neural Network" by Qiyu Song and Zhiming Kuang. Following the steps below, one should be able to reproduce all results included in the paper.
We use the System for Atmospheric Modeling (SAM) to generate our data. The model source code with our modifications is in SAM_v6.11.7/SRC_noisywave/. We ran several groups of experiments with different configurations.
For the three experiments used to identify a linear model, we use an ensemble size of 1024. Users should use prm.spinup_1024 as the spinup configuration and the prm files 1024_4_1, 1024_4_2, and 1024_4_3 as experimental configurations, and modify SAM_v6.11.7/SRC_noisywave/domain.f90 as follows:

```fortran
integer, parameter :: nx_gl = 1024 ! Number of grid points in X
integer, parameter :: ny_gl = 1024 ! Number of grid points in Y
integer, parameter :: nsubdomains_x = 32 ! No of subdomains in x
integer, parameter :: nsubdomains_y = 32 ! No of subdomains in y
```

For the majority of experiments, we use an ensemble size of 256. Users can refer to the prm files prm.spinup_256_0 and prm.spinup_256_1 for spinup and SAM_v6.11.7/RCE_randmultsine/prm.run_256_msinefx* for experimental configurations. The domain setup should be:
```fortran
integer, parameter :: nx_gl = 512 ! Number of grid points in X
integer, parameter :: ny_gl = 512 ! Number of grid points in Y
integer, parameter :: nsubdomains_x = 16 ! No of subdomains in x
integer, parameter :: nsubdomains_y = 16 ! No of subdomains in y
```

For all these and the following SAM experiments, first compile the source code into an executable, then use the resub.ens file in the case directory to submit a job to a cluster (the current version reflects only the setup on the Harvard Cannon cluster).
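When adapting domain.f90 to other ensemble sizes, nx_gl must be divisible by nsubdomains_x (and likewise in y), since each MPI subdomain handles nx_gl/nsubdomains_x grid points. A quick sanity check of a decomposition (a sketch, not part of the repo):

```python
def check_domain(nx_gl, ny_gl, nsub_x, nsub_y):
    """Verify a SAM domain decomposition: the global grid must split
    evenly among the subdomains in each direction."""
    assert nx_gl % nsub_x == 0, "nx_gl must be divisible by nsubdomains_x"
    assert ny_gl % nsub_y == 0, "ny_gl must be divisible by nsubdomains_y"
    return nx_gl // nsub_x, ny_gl // nsub_y  # local grid size per subdomain

# both configurations above give 32x32 grid points per subdomain
print(check_domain(1024, 1024, 32, 32))  # (32, 32)
print(check_domain(512, 512, 16, 16))    # (32, 32)
```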
First, go to SAM_v6.11.7/RCE_noisywave/ and run the spinup experiment. Then use run_batch_noisywave.sh to generate case folders for the different wavenumbers. We used two different values (1 and 2) on line 5 of that file, so there are two experiments for each wavenumber, differing only in their initial random seeds. Submit all experiments using submit_exps.sh.
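The case-generation step follows a common pattern: stamp a wavenumber into a template prm file and copy it into a fresh case folder. A hypothetical sketch in that spirit (the template, placeholder token, and folder names are illustrative, not those used by run_batch_noisywave.sh):

```shell
#!/bin/sh
# Illustrative only: create a tiny template with a placeholder token,
# then generate one case folder per wavenumber.
printf 'caseid = noisywave_kWAVENUMBER\n' > prm.template
for k in 1 2 3 4; do
    mkdir -p "case_k${k}"
    sed "s/WAVENUMBER/${k}/" prm.template > "case_k${k}/prm"
done
```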
After running the experiments, convert .stat files to .nc files using stat2nc, which can be compiled in SAM_v6.11.7/UTIL/. Then use RNN_train_test/extract_data.ipynb to extract the data for training.
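Extraction ultimately reshapes the time series into input/target sequences for one-step-ahead RNN training. A minimal sketch of that windowing logic with numpy (array shapes and names are illustrative, not those used in extract_data.ipynb):

```python
import numpy as np

def make_windows(series, seq_len):
    """Split a (time, features) array into overlapping (input, target)
    pairs, where the target is the input shifted by one time step."""
    n = len(series) - seq_len
    inputs = np.stack([series[t:t + seq_len] for t in range(n)])
    targets = np.stack([series[t + 1:t + seq_len + 1] for t in range(n)])
    return inputs, targets

series = np.random.randn(100, 8)   # e.g. 100 time steps, 8 vertical levels
x, y = make_windows(series, seq_len=20)
print(x.shape, y.shape)            # (80, 20, 8) (80, 20, 8)
```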
```shell
cd linear_model_paper
sbatch identification.run_4x
```

The identified linear model should be in linear_model_paper/model/. Load the model and write the variables A, B, C, K, and NoisePattern from the sys variable to a .txt file, which will be used in the following steps.
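Once exported to .txt, the matrices can be handled with numpy on the Python side. A sketch of the round trip (the matrix below is a placeholder; the real A, B, C, K come from the identification step):

```python
import numpy as np

# placeholder for an identified state matrix; the real one is exported
# from the sys variable produced by the identification job
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])

np.savetxt("A.txt", A)          # plain-text export, one row per line
A_back = np.loadtxt("A.txt")    # later steps can reload it the same way
print(np.allclose(A, A_back))   # True
```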
(optional) For a benchmark comparison, a linear response model without memory can be calculated as in linear_model_paper/get_linear_response_matrix.m, which will write the linear response matrix to another .txt file.
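One standard way to collapse a state-space model to a memoryless response is its steady-state (DC) gain, C (I - A)^-1 B for a discrete-time system. The sketch below illustrates that idea; it is an assumption for illustration, not necessarily what get_linear_response_matrix.m computes:

```python
import numpy as np

def dc_gain(A, B, C):
    """Steady-state gain C (I - A)^-1 B of a discrete-time state-space
    model: the input-to-output response once transients have decayed."""
    n = A.shape[0]
    return C @ np.linalg.solve(np.eye(n) - A, B)

A = np.array([[0.5, 0.0], [0.0, 0.25]])
B = np.eye(2)
C = np.eye(2)
print(dc_gain(A, B, C))   # diag(2.0, 4/3): 1/(1-0.5) and 1/(1-0.25)
```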
We perform a two-stage training of the model. In each stage, first train only the initial hidden state h0, then train all parameters. Modify the model script (lines 184-187) and the training script (lines 23-36) accordingly, then submit the training:
```shell
cd RNN_train_test
sbatch train_ultimaternn_includewave_addp_lightning
```

The learned model will be the checkpoint file with the lowest validation error. The authors mostly used this model for the following analyses.
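The h0-only stage amounts to freezing all parameters except the learnable initial hidden state. A minimal PyTorch sketch of that switch (the toy architecture and names are illustrative, not the paper's model):

```python
import torch
import torch.nn as nn

class TinyRNN(nn.Module):
    """Stand-in for the RNN emulator, with a learnable initial state h0."""
    def __init__(self, n_in=4, n_hidden=8):
        super().__init__()
        self.rnn = nn.GRU(n_in, n_hidden, batch_first=True)
        self.head = nn.Linear(n_hidden, n_in)
        self.h0 = nn.Parameter(torch.zeros(1, 1, n_hidden))

def set_stage(model, train_h0_only):
    """First substage: optimize only h0. Second substage: optimize everything."""
    for p in model.parameters():
        p.requires_grad = not train_h0_only
    model.h0.requires_grad = True  # h0 stays trainable in both substages

model = TinyRNN()
set_stage(model, train_h0_only=True)
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(trainable)  # ['h0']
```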
We also trained several variant models:

- Train a model without memory: see the corresponding model script and training script.
- Train a model with only the random-forcing dataset: use the intermediate model saved during normal training.
- Train a model with only the coupled-wave-forcing dataset: see the corresponding model script, training script, and model.
This part of the analysis is contained in several notebooks:
- offline tests: RNN, model without memory
- online tests: RNN, model without memory, fitting initial hidden state for RNN
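The two test modes can be illustrated with a toy linear emulator: offline tests feed the true state at every step and score one-step-ahead predictions, while online tests feed the model its own previous output so errors can accumulate. A sketch of the distinction (the actual tests live in the notebooks above):

```python
import numpy as np

A = np.array([[0.9, 0.05], [0.0, 0.9]])  # toy one-step emulator x_{t+1} = A x_t

def offline_test(truth, A):
    """One-step-ahead: each prediction starts from the *true* previous state."""
    preds = truth[:-1] @ A.T
    return np.mean((preds - truth[1:]) ** 2)

def online_run(x0, steps, A):
    """Free-running: the model is fed its own output at every step."""
    traj = [x0]
    for _ in range(steps):
        traj.append(A @ traj[-1])
    return np.array(traj)

truth = online_run(np.array([1.0, 0.5]), 50, A)  # use the toy model as "truth"
print(offline_test(truth, A))                    # 0.0: a perfect one-step model
```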
This page is last updated on May 28, 2025. For any questions regarding the code or data, please contact Qiyu Song (qsong@g.harvard.edu).