Commit e691d5c

Clean up existing calibration documentation
1 parent 7a12e46 commit e691d5c

1 file changed: +130 -67 lines changed


docs/src/calibration.md

Lines changed: 130 additions & 67 deletions
@@ -2,47 +2,75 @@
22

33
## Introduction
44

5-
In general, models attempt to reproduce real world observations.
6-
Calibration is the process of finding the parameter set that will best reproduce real world observation.
5+
In general, models attempt to reproduce real world observations. Calibration is
6+
the process of finding the parameter set that will best reproduce real world
7+
observation.
78

89
The key ingredients in calibration are:
910
- the model we want to run;
1011
- the parameters we want to tune along with their prior distributions;
11-
- the observational data we want to reproduce and how that data is represented in the model;
12+
- the observational data we want to reproduce and how that data is represented
13+
in the model;
1214
- the noise associated to such observational data.
1315

14-
The process of calibrating consists of optimizing how different parameters match the given observations within the given noise. In ClimaLand, we use `EnsembleKalmanProcesses.jl` (EKP) to perform automatic calibration.
15-
Before discussing the details of EKP, we introduce the following terminology to mirror the key ingredients introduced above:
16+
The process of calibrating consists of optimizing how different parameters match
17+
the given observations within the given noise. In ClimaLand, we use
18+
`EnsembleKalmanProcesses.jl` (EKP) to perform automatic calibration. Before
19+
discussing the details of EKP, we introduce the following terminology to mirror
20+
the key ingredients introduced above:
1621

1722
- a `forward_model` that runs the model for a given parameter set,
18-
- an `observation` vector that contains the data or statistics of data we want the model to reproduce,
19-
- an `observation_map` which maps the equivalent of the observation vector, but from the model output,
20-
- `priors` which contains the parameters we want to calibrate (find the value that makes the model match the observation best), priors gives which parameters, but also their distribution
21-
- a covariance matrix, that defines the observational error and correlations
22-
23-
[EnsembleKalmanProcesses.jl](https://github.com/CliMA/EnsembleKalmanProcesses.jl) (EKP) is at the center of CliMA's calibration efforts. EKP implements a suite of Ensemble Kalman methods to find a (locally) optimal parameter set `U` for a model `G` to fit noisy `Γ` observational data `Y`. These methods are optimized for problems where the model `G` is computationally expensive and no analytic derivatives are available, as in the case of weather forecasting, where Ensemble Kalman techniques have a long history of success.
24-
25-
Large calibration campaigns often require supercomputers and while direct use of EKP.jl is possible, CliMA's preferred approach is using [ClimaCalibrate.jl](https://github.com/CliMA/ClimaCalibrate.jl), a package optimized for running on compute clusters. `ClimaCalibrate` handles efficient job orchestration and abstracts the details of the underlying system, providing a simpler user experience. Consult the [ClimaCalibrate documentation](https://clima.github.io/ClimaCalibrate.jl/dev/) for further information.
23+
- an `observation` vector that contains the data or statistics of data we want
24+
the model to reproduce,
25+
- an `observation_map` which maps the equivalent of the observation vector, but
26+
from the model output,
27+
- `priors` which contains the parameters we want to calibrate (find the value
28+
that makes the model match the observation best), priors gives which
29+
parameters, but also their distribution
30+
- a covariance matrix that defines the observational error and correlations
31+
32+
[EnsembleKalmanProcesses.jl](https://github.com/CliMA/EnsembleKalmanProcesses.jl)
33+
(EKP) is at the center of CliMA's calibration efforts. EKP implements a suite of
34+
Ensemble Kalman methods to find a (locally) optimal parameter set `U` for a
35+
model `G` to fit noisy `Γ` observational data `Y`. These methods are optimized
36+
for problems where the model `G` is computationally expensive and no analytic
37+
derivatives are available, as in the case of weather forecasting, where Ensemble
38+
Kalman techniques have a long history of success.
39+
40+
Large calibration campaigns often require supercomputers and while direct use of
41+
EKP.jl is possible, CliMA's preferred approach is using
42+
[ClimaCalibrate.jl](https://github.com/CliMA/ClimaCalibrate.jl), a package
43+
optimized for running on compute clusters. `ClimaCalibrate` handles efficient
44+
job orchestration and abstracts the details of the underlying system, providing
45+
a simpler user experience. Consult the
46+
[ClimaCalibrate documentation](https://clima.github.io/ClimaCalibrate.jl/dev/)
47+
for further information.
2648

2749
## Calibrate a land model
2850

29-
In this tutorial, we will perform a calibration using `ClimaCalibrate`. `ClimaCalibrate` provides an interface to `EnsembleKalmanProcesses.jl` that is more optimized for use on supercomputers. The [tutorial to calibrate a single site latent heat flux](https://clima.github.io/ClimaLand.jl/stable/generated/calibration/minimal_working_example_obs/) shows how to perform a calibration using `EKP` directly.
51+
In this tutorial, we will perform a calibration using `ClimaCalibrate`.
52+
`ClimaCalibrate` provides an interface to `EnsembleKalmanProcesses.jl` that is
53+
more optimized for use on supercomputers. The
54+
[tutorial to calibrate a single site latent heat flux](https://clima.github.io/ClimaLand.jl/stable/generated/calibration/minimal_working_example_obs/)
55+
shows how to perform a calibration using `EKP` directly.
3056

31-
The `calibrate` function is at the heart of performing a calibration with `ClimaCalibrate`:
57+
The `calibrate` function is at the heart of performing a calibration with
58+
`ClimaCalibrate`:
3259

3360
```julia
34-
import ClimaCalibrate as CAL
61+
import ClimaCalibrate
3562

36-
CAL.calibrate(
37-
CAL.WorkerBackend,
63+
ClimaCalibrate.calibrate(
64+
ClimaCalibrate.WorkerBackend,
3865
utki,
3966
n_iterations,
4067
prior,
41-
caldir,
68+
output_dir,
4269
)
4370
```
4471

45-
where the `utki` object defines your EKP configurations, for example, the default is:
72+
where the `utki` object defines your EKP configurations, for example, the
73+
default is:
4674

4775
```julia
4876
EKP.EnsembleKalmanProcess(
@@ -54,42 +82,62 @@ EKP.EnsembleKalmanProcess(
5482
)
5583
```
5684

57-
where `obs_series` is the "truth" you want to calibrate your model on. It can take many forms.
58-
For example, you may want to calibrate your land model latent heat flux (lhf), the
59-
observations could be monthly global average of lhf, or monthly average at 100 random locations on land, or the annual amplitude and phase...
60-
You will create `obs_series` from some data (for example ERA5), as a vector.
61-
62-
Note that `obs_series` object contains the covariance matrix of the noise, which informs the uncertainties in space and time of your targeted "truth". It can be set, for example, to the inter-annual variance of a variable, or to the average of the variable times a % (e.g., 5%), or to a flat noise (for example, 5 W m-2 for latent heat). This will inform the EKP algorithm that if the model is within the target +- noise at specific space and time, the goal is reached.
63-
64-
`TransformUnscented.Unscented` is a method in EKP, that requires `2 x number of parameters + 1` ensemble members (`ensemble_size`, the number of parameter set drawn for your prior distribution tested at each iteration). For more information, read the [EKP documentation for that method](https://clima.github.io/EnsembleKalmanProcesses.jl/dev/unscented_kalman_inversion/).
65-
66-
`verbose = true` is a setting that writes information about your calibration run to a log file.
67-
68-
`rng` is a set random seed.
69-
70-
`Scheduler` is an EKP setting for timestepping, please read [EKP schedulers documentations](https://clima.github.io/EnsembleKalmanProcesses.jl/dev/learning_rate_scheduler/).
71-
72-
`CAL.WorkerBackend` defines how to interact with the underlying compute system. For other possible backends (for example, `JuliaBackend`, `ClimaGPUBackend`, or `DerechoBackend`),
73-
see [this doc page](https://clima.github.io/ClimaCalibrate.jl/dev/backends/).
74-
75-
Each backend is optimized for specific use cases and computing resources. The backend system is implemented through Julia's multiple dispatch,
76-
so that code written for one computer can seamlessly be ported to a new/different environment.
77-
78-
`prior` is the distribution of the parameters you want to calibrate. For example, if you want to calibrate two parameters called `sc` and `pc`,
79-
you would define your priors like this, for example:
85+
where `obs_series` is the "truth" you want to calibrate your model on. It can
86+
take many forms. For example, you may want to calibrate monthly global average
87+
of latent heat flux, monthly average at 100 random locations on land, or the
88+
annual amplitude and phase. You will create `obs_series` from some data (for
89+
example ERA5), as a vector.
90+
91+
Note that `obs_series` object contains the covariance matrix of the noise, which
92+
informs the uncertainties in space and time of your targeted "truth". It can be
93+
set, for example, to the inter-annual variance of a variable, or a percentage
94+
(e.g., 5%) of the average of the variable, or to a flat noise (e.g., 5 W m-2 for
95+
latent heat). This will inform the EKP algorithm that if the model is within the
96+
target plus or minus the noise at specific space and time, the goal is reached.
97+
98+
- `TransformUnscented.Unscented` is a method in EKP, that requires `2p + 1`
99+
ensemble members for each iteration, where `p` is the number of parameters.
100+
For more information, read the
101+
[EKP documentation for that method](https://clima.github.io/EnsembleKalmanProcesses.jl/dev/unscented_kalman_inversion/).
102+
103+
- `verbose = true` is a setting that writes information about your calibration
104+
run to a log file.
105+
106+
- `rng` is a set random seed.
107+
108+
- `Scheduler` is an EKP setting for timestepping, please read
109+
[EKP schedulers documentation](https://clima.github.io/EnsembleKalmanProcesses.jl/dev/learning_rate_scheduler/).
110+
111+
- `ClimaCalibrate.WorkerBackend` defines how to interact with the underlying
112+
compute system. For other possible backends (for example, `JuliaBackend`,
113+
`ClimaGPUBackend`, or `DerechoBackend`), see the
114+
[backend documentation in ClimaCalibrate](https://clima.github.io/ClimaCalibrate.jl/dev/backends/).
115+
116+
Each backend is optimized for specific use cases and computing resources. The
117+
backend system is implemented through Julia's multiple dispatch, so that code
118+
written for one environment can seamlessly be ported to a new/different
119+
environment.
120+
121+
- `prior` is the distribution of the parameters you want to calibrate. For
122+
example, if you want to calibrate two parameters called `sc` and `pc`, you
123+
would define your priors like this, for example:
80124
```julia
81125
prior_sc = EKP.constrained_gaussian("sc", 5e-6, 5e-4, 0, Inf);
82126
prior_pc = EKP.constrained_gaussian("pc", -2e6, 1e6, -Inf, Inf);
83127
prior = EKP.combine_distributions([prior_sc, prior_pc]);
84128
```
85129
For more documentation about prior distribution, see [this EKP documentation page](https://clima.github.io/EnsembleKalmanProcesses.jl/dev/parameter_distributions/).
86130

87-
`n_iterations` is the number of times your priors distribution will be updated, at each iteration your model is run for the number of `ensemble_size`.
88-
So in total, your model will be run `ensemble_size` * `n_iterations`.
131+
- `n_iterations` is the number of times your priors distribution will be
132+
updated, at each iteration your model is run for the number of
133+
`ensemble_size`. So in total, your model will be run `ensemble_size` *
134+
`n_iterations`.
89135

90-
`caldir` is the path to your calibration output directory. For example `calibration_output`. Inside this folder, the parameter set of each iteration * member will
91-
be stored, as well as the output of your model simulations. For example, if you ran a calibration with 1 iteration and 2 members, caldir would be structured
92-
like this:
136+
- `output_dir` is the path to your calibration output directory. Inside this
137+
folder, the parameter set of each ensemble member for each iteration is
138+
stored, as well as the output of your model simulations. For example, if you
139+
ran a calibration with 1 iteration and 2 members, output_dir would be
140+
structured like this:
93141

94142
```
95143
.
@@ -136,21 +184,31 @@ like this:
136184
│ │ │ └── output_active -> output_0000
137185
│ │ └── parameters.toml
138186
```
139-
Each iteration contains folders for each member, inside which you can find the parameters value inside `parameters.toml`, and model outputs inside `global_diagnostics`.
140-
141-
Two additional functions need to be defined in order to run `CAL.calibrate`. `CAL.forward_model(iteration, member)` and `observation_map(iteration)`.
142-
The `CAL.forward_model(iteration, member)` needs to generate your model output for a specific iteration and member. The `observation_map(iteration)`
143-
needs to return your loss, a vector of the same format as `observations` but created with your model output (for example, monthly average of latent heat flux),
144-
for all members. To make this easier, it can be useful to implement a `process_member_data(root_path)` function that generates one member output from your
145-
model output path.
146-
147-
Once you have defined `CAL.forward_model`, `CAL.observation_map`, `caldir`, `noise`, `observations`, `n_iterations`, `ensemble_size`, and your backend, you can
148-
call `CAL.calibrate`!
149-
150-
## job script
151-
152-
A calibration job will likely take hours to complete, so you will probably have to submit a job with a job scheduler.
153-
Below is an example job .pbs script (for PBS, e.g., Derecho):
187+
Each iteration contains directories for each member, inside which you can find
188+
the parameters value inside `parameters.toml`, and model outputs inside
189+
`global_diagnostics`.
190+
191+
Two additional functions need to be defined in order to run
192+
`ClimaCalibrate.calibrate`. `ClimaCalibrate.forward_model(iteration, member)`
193+
and `ClimaCalibrate.observation_map(iteration)`. The
194+
`ClimaCalibrate.forward_model(iteration, member)` needs to generate your model
195+
output for a specific iteration and member. The
196+
`ClimaCalibrate.observation_map(iteration)` needs to return your loss, a vector
197+
of the same format as `observations` but created with your model output (for
198+
example, monthly average of latent heat flux), for all members. To make this
199+
easier, it can be useful to implement a `process_member_data(root_path)`
200+
function that generates one member output from your model output path.
201+
202+
Once you have defined `ClimaCalibrate.forward_model`,
203+
`ClimaCalibrate.observation_map`, `output_dir`, `noise`, `observations`,
204+
`n_iterations`, `ensemble_size`, and your backend, you can call
205+
`ClimaCalibrate.calibrate`!
206+
207+
## Job script
208+
209+
A calibration job will likely take hours to complete, so you will probably have
210+
to submit a job with a job scheduler. Below is an example job .pbs script (for
211+
PBS, e.g., Derecho):
154212

155213
```bash
156214
#!/bin/bash
@@ -195,11 +253,16 @@ julia --project=.buildkite -e 'using Pkg; Pkg.instantiate(;verbose=true)'
195253
julia --project=.buildkite/ experiments/calibration/run_calibration.jl
196254
```
197255

198-
where `calibrate_land.jl` is a script that generates all the arguments needed and eventually calls `CAL.calibrate`.
199-
On a Slurm cluster, comment out `add_workers` in `calibrate_land.jl`, as worker processes will inherit the allocated resources automatically.
200-
You would start the job with a command such as `qsub name_of_job_script` for PBS or `sbatch name_of_job_script` for Slurm, and a few hours later, you would get a calibrated parameter set. You can check the status of your job with `qstat -u username` on PBS or `squeue -u username` on Slurm.
256+
where `run_calibration.jl` is a script that sets up the calibration and calls
257+
`ClimaCalibrate.calibrate`. You would start the job with a command such as `qsub
258+
name_of_job_script` for PBS or `sbatch name_of_job_script` for Slurm, and a few
259+
hours later, you would get a calibrated parameter set. You can check the status
260+
of your job with `qstat -u username` on PBS or `squeue -u username` on Slurm.
201261

202-
Note that with the default EKP configuration, UTKI, the number of ensemble members is set by the number of parameters, as explained in the documentation above. The number of workers (if you use the worker backend) is automatically set to that number, so that all members are run in parallel for each iteration.
262+
Note that with the default EKP configuration, UTKI, the number of ensemble
263+
members is set by the number of parameters, as explained in the documentation
264+
above. The number of workers (if you use the worker backend) is automatically
265+
set to that number, so that all members are run in parallel for each iteration.
203266

204267
## Configure your land calibration
205268
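
The sizing rules stated in the documentation changed by this commit (`2p + 1` ensemble members for the unscented method, and `ensemble_size * n_iterations` model runs in total) can be sketched in plain Julia. This is only an illustration under stated assumptions: the helper names `unscented_ensemble_size` and `total_forward_runs` are hypothetical and are not part of `ClimaCalibrate` or `EnsembleKalmanProcesses.jl`, and the value of `n_iterations` is arbitrary.

```julia
# Hypothetical helpers for illustration; not part of ClimaCalibrate or EKP.

# The unscented method described in the docs needs 2p + 1 ensemble members,
# where p is the number of calibrated parameters.
unscented_ensemble_size(n_parameters::Integer) = 2 * n_parameters + 1

# Each iteration runs the forward model once per ensemble member.
total_forward_runs(n_parameters::Integer, n_iterations::Integer) =
    unscented_ensemble_size(n_parameters) * n_iterations

n_parameters = 2  # e.g. the `sc` and `pc` parameters from the prior example
n_iterations = 3  # arbitrary value chosen for this sketch

ensemble_size = unscented_ensemble_size(n_parameters)
println("ensemble_size = ", ensemble_size)
println("total forward model runs = ", total_forward_runs(n_parameters, n_iterations))
```

With the worker backend, the documentation says the number of workers matches the ensemble size, so a quick calculation like this may help when choosing how many resources to request in the job script.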
