Commit 375512c

Add reanalysis-driven single-column experiment using ERA5 data
This PR introduces a new simulation setup where PROSPECT is driven by time-varying ERA5 reanalysis in a single-column model. It includes automatic data preprocessing via ClimaArtifacts, updated documentation for running GCM and ERA5 cases, and a test suite validating forcing generation.
1 parent 46f9b72 commit 375512c

24 files changed, with 1227 additions and 24 deletions

.buildkite/pipeline.yml

Lines changed: 9 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -836,6 +836,15 @@ steps:
836836
slurm_mem: 20GB
837837
soft_fail: true
838838

839+
- label: ":genie: Prognostic EDMFX ERA5 Time Varying driven in a column"
840+
command: >
841+
julia --color=yes --project=.buildkite .buildkite/ci_driver.jl
842+
--config_file $CONFIG_PATH/prognostic_edmfx_tv_era5driven_column.yml
843+
--job_id prognostic_edmfx_tv_era5driven_column
844+
artifact_paths: "prognostic_edmfx_tv_era5driven_column/output_active/*"
845+
agents:
846+
slurm_mem: 12GB
847+
839848
- label: ":genie: Prognostic EDMFX Bomex in a box"
840849
command: >
841850
julia --color=yes --project=.buildkite .buildkite/ci_driver.jl

Artifacts.toml

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,12 @@ git-tree-sha1 = "235a16e02c24a28b20244c0f2e7f149cc783166f"
77
[era5_cloud]
88
git-tree-sha1 = "8bad412ada94be95a2e11c794b499c49d746be50"
99

10+
[era5_hourly_atmos_raw]
11+
git-tree-sha1 = "8234def2ead82e385a330a48ed2f0c030e434065"
12+
13+
[era5_hourly_atmos_processed]
14+
git-tree-sha1 = "a1a465e8d237d78bef1e6d346054da395787a9f9"
15+
1016
[cfsite_gcm_forcing]
1117
git-tree-sha1 = "2f196bcac8950f8cb79ead41821f423d092b8951"
1218
lazy = true

NEWS.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4,6 +4,9 @@ ClimaAtmos.jl Release Notes
44
main
55
-------
66

7+
### Add support for reanalysis-driven single column model with time-varying forcing
8+
PR [#3758](https://github.com/CliMA/ClimaAtmos.jl/pull/3758) adds support for driving single-column model (SCM) simulations with time-varying ERA5 reanalysis data. This extends the existing GCM-driven SCM interface to allow site-specific simulations that resolve the diurnal cycle and are suited for calibration against observations. Users can now run reanalysis-driven cases globally using only a date and lat/lon, thanks to integrated data handling via ClimaArtifacts.jl. See the updated “Single Column Model” docs page for details on setup, variable requirements, and how to prepare ERA5 input files.
9+
710
### Remove `dt_save_to_sol`
811

912
The option to save the solution to the integrator object (`dt_save_to_sol`) was

config/default_configs/default_config.yml

Lines changed: 10 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -179,13 +179,13 @@ disable_surface_flux_tendency:
179179
help: "(Bool) Whether to disable surface flux tendencies of momentum, energy, and tracers [`true`, `false` (default)]. When this flag is true, the surface flux tendency is not applied, no matter how surface conditions are computed."
180180
value: false
181181
surface_setup:
182-
help: "Surface flux scheme [`DefaultExchangeCoefficients` (default), `DefaultMoninObukhov`]"
182+
help: "Surface flux scheme [`DefaultExchangeCoefficients` (default), `DefaultMoninObukhov`, `GCM`, `Reanalysis`, `ReanalysisTimeVarying`]"
183183
value: "DefaultExchangeCoefficients"
184184
surface_temperature:
185-
help: "Prescribed surface temperature functional form ['ZonallySymmetric' (default), 'ZonallyAsymmetric', 'RCEMIPII']"
185+
help: "Prescribed surface temperature functional form ['ZonallySymmetric' (default), 'ZonallyAsymmetric', 'RCEMIPII', `Reanalysis`, `ReanalysisTimeVarying`]"
186186
value: "ZonallySymmetric"
187187
initial_condition:
188-
help: "Initial condition [`DryBaroclinicWave`, `MoistBaroclinicWave`, `ConstantBuoyancyFrequencyProfile`, `DecayingProfile`, `IsothermalProfile`, `Bomex`, `DryDensityCurrentProfile`, `RisingThermalBubbleProfile`, `ISDAC`], or a file path for a NetCDF file (read documentation about requirements)."
188+
help: "Initial condition [`DryBaroclinicWave`, `MoistBaroclinicWave`, `ConstantBuoyancyFrequencyProfile`, `DecayingProfile`, `IsothermalProfile`, `Bomex`, `DryDensityCurrentProfile`, `RisingThermalBubbleProfile`, `ISDAC`, `GCM`, `Reanalysis`, `ReanalysisTimeVarying`], or a file path for a NetCDF file (read documentation about requirements)."
189189
value: "DecayingProfile"
190190
perturb_initstate:
191191
help: "Add a perturbation to the initial condition [`false`, `true` (default)]"
@@ -236,14 +236,20 @@ ls_adv:
236236
help: "Large-scale advection [`nothing` (default), `Bomex`, `LifeCycleTan2018`, `Rico`, `ARM_SGP`, `GATE_III`]"
237237
value: ~
238238
external_forcing:
239-
help: "External forcing for single column experiments [`nothing` (default), `GCM`]"
239+
help: "External forcing for single column experiments [`nothing` (default), `GCM`, `Reanalysis`, `ReanalysisTimeVarying`]"
240240
value: ~
241241
external_forcing_file:
242242
help: "External forcing file containing large-scale forcings, initial conditions, and boundary conditions. Used for GCM-driven SCM and ISDAC setup [`nothing` (default), `path/to/file`]"
243243
value: ~
244244
cfsite_number:
245245
help: "cfsite identifier for single column forcing from `external_forcing_file`, specified as siteN. For site details see Shen et al. 2022 `https://doi.org/10.1029/2021MS002631`. [`site23` (default), `siteN`]"
246246
value: "site23"
247+
site_latitude:
248+
help: "Site latitude for single column model. Used for externally driven time varying forcing model to generate the forcing file. Artifact support is currently for eastern Pacific region in July 2007 only. [`17.0` (default)]"
249+
value: 17.0
250+
site_longitude:
251+
help: "Site longitude for single column model. Used for externally driven time varying forcing model to generate the forcing file. Artifact support is currently for eastern Pacific region in July 2007 only. [`-149.0` (default)]"
252+
value: -149.0
247253
subsidence:
248254
help: "Subsidence [`nothing` (default), `Bomex`, `LifeCycleTan2018`, `Rico`, `DYCOMS`, `ISDAC`]"
249255
value: ~
Lines changed: 56 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,56 @@
1+
initial_condition: "ReanalysisTimeVarying"
2+
external_forcing: "ReanalysisTimeVarying"
3+
surface_setup: "ReanalysisTimeVarying"
4+
surface_temperature: "ReanalysisTimeVarying"
5+
start_date: "20070701"
6+
site_latitude: 17.0
7+
site_longitude: -149.0
8+
turbconv: "prognostic_edmfx"
9+
implicit_diffusion: true
10+
implicit_sgs_advection: false
11+
approximate_linear_solve_iters: 2
12+
edmfx_upwinding: first_order
13+
edmfx_sgsflux_upwinding: first_order
14+
tracer_upwinding: vanleer_limiter
15+
energy_upwinding: vanleer_limiter
16+
rayleigh_sponge: true
17+
edmfx_entr_model: "PiGroups"
18+
edmfx_detr_model: "PiGroups"
19+
edmfx_sgs_mass_flux: true
20+
edmfx_sgs_diffusive_flux: true
21+
edmfx_nh_pressure: true
22+
edmfx_filter: true
23+
prognostic_tke: true
24+
precip_model: "0M"
25+
prescribe_ozone: false
26+
moist: "equil"
27+
config: "column"
28+
z_max: 45e3
29+
truncation: 40000
30+
z_elem: 200
31+
z_stretch: true
32+
dz_bottom: 30
33+
perturb_initstate: false
34+
dt: "10secs"
35+
dt_rad: "10mins"
36+
t_end: "8hours"
37+
cloud_model: "quadrature_sgs"
38+
call_cloud_diagnostics_per_stage: true
39+
toml: [toml/prognostic_edmfx_calibrated.toml]
40+
netcdf_output_at_levels: true
41+
netcdf_interpolation_num_points: [2, 2, 60]
42+
output_default_diagnostics: false
43+
rad: allskywithclear
44+
insolation: "externaldriventv"
45+
diagnostics:
46+
- short_name: [ts, ta, thetaa, ha, pfull, rhoa, ua, va, wa, hur, hus, cl, clw, cli, hussfc, evspsbl, pr]
47+
period: 10mins
48+
- short_name: [arup, waup, taup, thetaaup, haup, husup, hurup, clwup, cliup, waen, taen, thetaaen, haen, husen, huren, clwen, clien, tke]
49+
period: 10mins
50+
- short_name: [entr, detr, lmix, bgrad, strain, edt, evu]
51+
period: 10mins
52+
- short_name: [rlut, rlutcs, rsut, rsutcs, clwvi, lwp, clivi, dsevi, clvi, prw, hurvi, husv]
53+
period: 10mins
54+
- reduction_time: max
55+
short_name: tke
56+
period: 10mins

docs/make.jl

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -36,6 +36,7 @@ makedocs(;
3636
"Topography Representation" => "topography.md",
3737
"Tracers" => "tracers.md",
3838
"Radiative Equilibrium" => "radiative_equilibrium.md",
39+
"Single Column Model" => "single_column_prospect.md",
3940
"Restarts and checkpoints" => "restarts.md",
4041
"REPL scripts" => "repl_scripts.md",
4142
"Configuration" => "config.md",

docs/src/single_column_prospect.md

Lines changed: 61 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,61 @@
1+
# Single Column Models
2+
`ClimaAtmos.jl` supports several canonical test cases that are run in a single column model designed to verify how PROSPECT (EDMF) performs against other test cases. These cases include variants of `bomex`, `dycoms`, `rico`, `soares`, and `trmm` and can be found in the `config/model_configs` directory. To run, for example, the BOMEX test case execute the following:
3+
```bash
4+
julia --project=examples examples/hybrid/driver.jl --config_file config/model_configs/prognostic_edmfx_bomex_column.yml --job_id bomex
5+
```
6+
It may also be helpful to run in interactive mode to examine the simulation object, debug, and develop the code further. To do so, start `julia --project=examples` and then run the following in the REPL:
7+
```julia
8+
using Revise
9+
import ClimaAtmos as CA
10+
11+
# build the simulation from the configuration file
12+
simulation = CA.AtmosSimulation("config/model_configs/prognostic_edmfx_bomex_column.yml")
13+
sol_res = CA.solve_atmos!(simulation) # run the simulation
14+
```
15+
16+
## Externally-Driven Single Column Models
17+
Currently, two versions of the externally driven single column model are supported in `ClimaAtmos.jl`: `GCM`-driven and `ReanalysisTimeVarying`. They have been developed specifically for realistic simulation and model calibration. Externally driven means that the model is initialized and forced with data from an external source, such as a different simulation or a reanalysis product. This differs from setups such as BOMEX or SOARES, which have steady forcing or functional forcing, respectively.
18+
19+
### GCM-Driven Case
20+
For the `GCM` driven case we can run the experiment using the config file `config/model_configs/prognostic_edmfx_gcmdriven_column.yml` by running:
21+
```bash
22+
julia --project=examples examples/hybrid/driver.jl --config_file config/model_configs/prognostic_edmfx_gcmdriven_column.yml --job_id gcm_driven_scm
23+
```
24+
In the config the following settings are particularly important:
25+
```YAML
26+
initial_condition: "GCM"
27+
external_forcing: "GCM"
28+
external_forcing_file: artifact"cfsite_gcm_forcing"/HadGEM2-A_amip.2004-2008.07.nc
29+
cfsite_number : "site23"
30+
surface_setup: "GCM"
31+
```
32+
Here we must set all of `initial_condition`, `external_forcing` and `surface_setup` to be `GCM` as each component requires information from the external file. The `external_forcing_file` and `cfsite_number` together determine the temperature, specific humidity, and wind as well as horizontal and vertical advection profiles that drive the simulation, and can be set to a local file path as opposed to using the artifact. Radiation and surface temperature are also specified. Here the forcing file, an example of which is stored in the artifact, contains groups for each `cfsite` to drive the simulation. See [Shen et al. 2022](https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2021MS002631) for more information.
33+
34+
### Reanalysis-Driven Case
35+
The `ReanalysisTimeVarying` case extends the `GCM` driven case by providing support for single-column simulations which resolve the diurnal cycle, can be run at any site globally, and use reanalysis to drive the simulation, allowing for calibration of EDMF to earth-system observations in the single-column setting. This feature was found to be needed to address biases in calibration arising from correlation between time-of-day and cloud liquid water path over the tropical Pacific. For this simulation we again highlight similar arguments in the config file:
36+
```YAML
37+
initial_condition: "ReanalysisTimeVarying"
38+
external_forcing: "ReanalysisTimeVarying"
39+
surface_setup: "ReanalysisTimeVarying"
40+
surface_temperature: "ReanalysisTimeVarying"
41+
start_date: "20070701"
42+
site_latitude: 17.0
43+
site_longitude: -149.0
44+
```
45+
By this point, the first four entries are intuitive: we dispatch on each of these settings to set up the forcing for the corresponding component of the model. To obtain the observations, note that instead of directly specifying a file we now specify a `start_date`, `site_latitude`, and `site_longitude`. This is because we use `ClimaArtifacts.jl` to store the data, which ensures reproducibility of our simulations and results. The data is downloaded from ECMWF; further documentation on downloading ERA5 data can be found either directly on the ECMWF page or in `ClimaArtifacts.jl`. Note that the profiles, surface temperature, and surface fluxes cannot be obtained from a single request, so three files are needed in total. We include a script at `src/utils/era5_observations_to_forcing_file.jl` which extracts the profiles and computes the tendencies needed for the simulation from the raw ERA5 reanalysis files. We store the processed observations in the artifact `era5_hourly_atmos_processed` to eliminate the need to reprocess sites and dates that have already been generated. In principle, this setup lets users choose any site globally at any time for which ERA5 data is available. Unfortunately, global hourly reanalysis is too large to store in an artifact, so we currently only provide the first 5 days of July 2007 in the tropical Pacific, stored in `era5_hourly_atmos_raw`, which is only available on the `clima` and Caltech `HPC` servers.
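As a minimal sketch of how these settings fit together, the snippet below checks that a parsed config dictionary sets all four dispatch entries to `ReanalysisTimeVarying` and that `start_date` parses as `YYYYMMDD`. The helper `check_reanalysis_config` is hypothetical and not part of `ClimaAtmos.jl`, and the latitude bound is an assumption made for illustration only.

```julia
using Dates

# Hypothetical helper (not part of ClimaAtmos.jl): sanity-check the settings
# that the reanalysis-driven case relies on.
const REANALYSIS_KEYS =
    ("initial_condition", "external_forcing", "surface_setup", "surface_temperature")

function check_reanalysis_config(config::AbstractDict)
    for key in REANALYSIS_KEYS
        get(config, key, nothing) == "ReanalysisTimeVarying" ||
            error("`$key` must be \"ReanalysisTimeVarying\"")
    end
    # `start_date` is given as a YYYYMMDD string, e.g. "20070701"
    start_date = Date(config["start_date"], dateformat"yyyymmdd")
    # Assumed bound for illustration: latitude in degrees north
    -90 <= config["site_latitude"] <= 90 || error("`site_latitude` out of range")
    return (; start_date, lat = config["site_latitude"], lon = config["site_longitude"])
end

config = Dict(
    "initial_condition" => "ReanalysisTimeVarying",
    "external_forcing" => "ReanalysisTimeVarying",
    "surface_setup" => "ReanalysisTimeVarying",
    "surface_temperature" => "ReanalysisTimeVarying",
    "start_date" => "20070701",
    "site_latitude" => 17.0,
    "site_longitude" => -149.0,
)
site = check_reanalysis_config(config)
```

A check like this could run before launching a simulation, so that a config mixing `GCM` and `ReanalysisTimeVarying` entries fails early instead of partway through setup.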
46+
#### Running the Reanalysis-driven case at different times and locations
47+
You need 3 separate files with specific variables and naming conventions for the data processing script to work.
48+
1. Hourly profiles with variables, following ERA5 naming convention, including `t`, `q`, `u`, `v`, `w`, `z`, `clwc`, `ciwc`. This file should be stored in the appropriate artifacts directory with the following naming scheme `"forcing_and_cloud_hourly_profiles_$(start_date).nc"` where `start_date` should specify the date data starts on formatted YYYYMMDD. We require `clwc` and `ciwc` profiles because these are typical targets for calibration but are not needed to run the simulation directly.
49+
2. Instantaneous variables, including surface temperature `ts`, which should be stored in `"hourly_inst_$(start_date).nc"`.
50+
3. Accumulated variables, including surface sensible and latent heat fluxes, `hfss` and `hfls`, which should be stored in `"hourly_accum_$(start_date).nc"`. These need to be divided by the appropriate accumulation period in seconds, which for hourly data is 3600 and for daily and monthly data is 86400 (not a typo; see [here](https://confluence.ecmwf.int/display/CKB/ERA5%3A+data+documentation#ERA5:datadocumentation-Monthlymeans)).
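The naming scheme and the accumulation-to-flux conversion described in the list above can be sketched as follows. Both helper names are hypothetical and only illustrate the convention; they are not the functions used in `src/utils/era5_observations_to_forcing_file.jl`, and the sample values are made up.

```julia
using Dates

# Hypothetical helper: build the three expected file names for a start date.
function era5_input_filenames(start_date::Date)
    stamp = Dates.format(start_date, "yyyymmdd")
    return (
        profiles = "forcing_and_cloud_hourly_profiles_$(stamp).nc",
        inst = "hourly_inst_$(stamp).nc",
        accum = "hourly_accum_$(stamp).nc",
    )
end

# Hypothetical helper: convert an accumulated quantity (J m^-2) into a mean
# flux (W m^-2) by dividing by the accumulation period in seconds.
accumulated_to_flux(accum, period_seconds) = accum ./ period_seconds

files = era5_input_filenames(Date(2007, 7, 1))

# Illustrative values only: hourly accumulations, so the period is 3600 s.
hourly_flux = accumulated_to_flux([360_000.0, 720_000.0], 3600)
```

For daily or monthly means the same call would use `86400` as the period, following the ECMWF documentation linked above.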
51+
52+
##### On HPC/Clima
53+
To run locations already in the artifact, e.g., sites in the tropical Pacific in the first 5 days of July 2007, the config file will work out of the box. To run other locations or times, please follow the steps under "On local".
54+
55+
##### On local
56+
To run the simulation on a local machine, you will first need to download the reanalysis data from ECMWF, ensuring that you have all the required variables. The data will be stored in 3 separate files, which should all be placed in the same directory. Use the `Overrides.toml` file in your `.julia/artifacts` directory to point the `era5_hourly_atmos_raw` artifact at the folder where the raw data is stored and the `era5_hourly_atmos_processed` artifact at the folder where the processed files should be written, as follows:
57+
```toml
58+
8234def2ead82e385a330a48ed2f0c030e434065 = "/some/random/path/raw_data_dir" # for raw data
59+
a1a465e8d237d78bef1e6d346054da395787a9f9 = "/some/random/path/processed_files" # for storing
60+
```
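If you prefer to generate these entries programmatically, a minimal sketch using Julia's `TOML` standard library is shown below. The helper `write_era5_overrides` is hypothetical, and the example writes to a temporary file rather than your real `.julia/artifacts/Overrides.toml`; the two hashes are the `git-tree-sha1` values of `era5_hourly_atmos_raw` and `era5_hourly_atmos_processed` from `Artifacts.toml`.

```julia
using TOML

const RAW_HASH = "8234def2ead82e385a330a48ed2f0c030e434065"        # era5_hourly_atmos_raw
const PROCESSED_HASH = "a1a465e8d237d78bef1e6d346054da395787a9f9"  # era5_hourly_atmos_processed

# Hypothetical helper: add or update the two ERA5 overrides in an
# Overrides.toml file, keeping any entries that are already there.
function write_era5_overrides(overrides_file, raw_dir, processed_dir)
    overrides =
        isfile(overrides_file) ? TOML.parsefile(overrides_file) : Dict{String, Any}()
    # Hash-keyed overrides map directly to an absolute directory path
    overrides[RAW_HASH] = abspath(raw_dir)
    overrides[PROCESSED_HASH] = abspath(processed_dir)
    open(overrides_file, "w") do io
        TOML.print(io, overrides)
    end
    return overrides
end

# Demo against a temporary file so nothing in ~/.julia is touched.
demo_file = joinpath(mktempdir(), "Overrides.toml")
demo = write_era5_overrides(
    demo_file,
    "/some/random/path/raw_data_dir",
    "/some/random/path/processed_files",
)
```

Pointing `overrides_file` at the real `Overrides.toml` is left to the reader, since overwriting that file affects every artifact override on the machine.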
61+
Good luck! :wink:

post_processing/ci_plots.jl

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1367,6 +1367,7 @@ EDMFBoxPlots = Union{
13671367
Val{:prognostic_edmfx_trmm_column_0M},
13681368
Val{:prognostic_edmfx_simpleplume_column},
13691369
Val{:prognostic_edmfx_gcmdriven_column},
1370+
Val{:prognostic_edmfx_tv_era5driven_column},
13701371
Val{:prognostic_edmfx_bomex_box},
13711372
Val{:rcemipii_box_diagnostic_edmfx},
13721373
Val{:prognostic_edmfx_soares_column},

src/ClimaAtmos.jl

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -21,6 +21,7 @@ include(joinpath("utils", "utilities.jl"))
2121
include(joinpath("utils", "debug_utils.jl"))
2222
include(joinpath("utils", "variable_manipulations.jl"))
2323
include(joinpath("utils", "read_gcm_driven_scm_data.jl"))
24+
include(joinpath("utils", "era5_observations_to_forcing_file.jl"))
2425

2526
include(joinpath("utils", "AtmosArtifacts.jl"))
2627
import .AtmosArtifacts as AA

src/cache/cache.jl

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -140,7 +140,7 @@ function build_cache(
140140
unit_basis_vector_data.(CT3, sfc_local_geometry)
141141
),
142142
)
143-
143+
external_forcing = external_forcing_cache(Y, atmos, params, start_date)
144144
sfc_setup = surface_setup(params)
145145
scratch = temporary_quantities(Y, atmos)
146146

@@ -154,6 +154,7 @@ function build_cache(
154154
scratch,
155155
dt,
156156
conservation_check,
157+
external_forcing,
157158
)
158159

159160
# Coupler compatibility
@@ -174,7 +175,6 @@ function build_cache(
174175
) : ()
175176

176177
hyperdiff = hyperdiffusion_cache(Y, atmos)
177-
external_forcing = external_forcing_cache(Y, atmos, params)
178178
non_orographic_gravity_wave = non_orographic_gravity_wave_cache(Y, atmos)
179179
orographic_gravity_wave = orographic_gravity_wave_cache(Y, atmos)
180180
radiation = radiation_model_cache(Y, atmos, radiation_args...)
