Commit d190d47

Merge pull request #313 from ESMValGroup/update_episode_cmorize
Updating CMORization episode
2 parents 37235a4 + d9be077 commit d190d47

1 file changed, +59 -24 lines changed

_episodes/09-cmorization.md

Lines changed: 59 additions & 24 deletions
@@ -2,7 +2,7 @@
 title: "CMORization: adding new datasets to ESMValTool"
 teaching: 15
 exercises: 45
-compatibility: ESMValTool v2.6.0
+compatibility: ESMValTool v2.11.0
 
 questions:
 - "CMORization: what is it and why do we need it?"
@@ -28,6 +28,9 @@ that follow the CMOR standards. Unfortunately, not all datasets follow these
 standards. In order to use such datasets in ESMValTool we first need to reformat
 the data. This process is called "CMORization".
 
+More detailed information can be found in the
+[Documentation](https://docs.esmvaltool.org/en/latest/develop/dataset.html).
+
 > ## What are the CMOR standards?
 >
 > The name "CMOR" originates from a tool: [the Climate Model Output
@@ -123,6 +126,12 @@ run the CMORizer scripts:
 esmvaltool data format --config_file <path to config-user.yml> <dataset-name>
 ```
 
+The options `--start` and `--end` can be added to the command above to restrict the
+formatting of raw data to a time range. They will be ignored if a specific
+dataset does not support them (e.g. because all the data is provided as a single file).
+Valid formats are `YYYY`, `YYYYMM`, `YYYYMMDD`. The same options also apply to
+the command `esmvaltool data download`.
+
 The ``config-user.yml`` is the file in which we define the different data
 paths, see the episode on [Configuration]({{ page.root }}{% link _episodes/03-configuration.md %}).
 In the ``rootpath`` of your ``config-user.yml``, make sure to add the right
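
Note on this hunk: the three accepted formats for `--start` and `--end` (`YYYY`, `YYYYMM`, `YYYYMMDD`) can be illustrated with a small parser. This is only a sketch of how such strings could be interpreted. It is not ESMValTool's implementation, and `parse_bound` is a hypothetical helper name introduced here for illustration.

```python
from datetime import datetime

# Hypothetical helper, not part of ESMValTool: it only illustrates the
# three formats named in the hunk above (YYYY, YYYYMM, YYYYMMDD).
_FORMATS = {4: "%Y", 6: "%Y%m", 8: "%Y%m%d"}


def parse_bound(value: str) -> datetime:
    """Parse a --start/--end style string into a datetime.

    Missing month or day fields default to 1, as datetime.strptime does.
    """
    fmt = _FORMATS.get(len(value))
    if fmt is None or not value.isdigit():
        raise ValueError(f"expected YYYY, YYYYMM or YYYYMMDD, got {value!r}")
    return datetime.strptime(value, fmt)
```

With this sketch, `parse_bound("2000")` and `parse_bound("200001")` both resolve to 1 January 2000, while a string such as `"20-00"` is rejected with a `ValueError`.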
@@ -141,38 +150,52 @@ name that was created to store the raw observation data files, i.e.
 If everything is okay, the output should look something like this:
 
 ~~~
-...
-... Starting the CMORization Tool at time: 2022-07-26 14:02:16 UTC
+... Writing program log files to:
+/scratch/b/username/esmvaltool_output/data_formatting_20240527_132448/run/main_log.txt
+/scratch/b/username/esmvaltool_output/data_formatting_20240527_132448/run/main_log_debug.txt
+... Starting the CMORization Tool at time: 2024-05-27 13:24:48 UTC
 ... ----------------------------------------------------------------------
-... input_dir = /home/peter/data/RAWOBS
-... output_dir = /home/peter/esmvaltool_output/data_formatting_20220726_140216
+... input_dir = /work/bd0854/DATA/ESMValTool2/RAWOBS
+... output_dir = /scratch/b/username/esmvaltool_output/data_formatting_20240527_132448
 ... ----------------------------------------------------------------------
 ... Running the CMORization scripts.
 ... Processing datasets ['FLUXCOM']
-... Input data from: /home/peter/data/RAWOBS/Tier3/FLUXCOM
-... Output will be written to: /home/peter/esmvaltool_output/
-data_formatting_20220726_140216/Tier3/FLUXCOM
-... Reformat script: /home/peter/mambaforge/envs/esmvaltool/lib/python3.9/
-site-packages/esmvaltool/cmorizers/data/formatters/datasets/fluxcom
-... CMORizing dataset FLUXCOM using Python script /home/peter/mambaforge/envs/
-esmvaltool/lib/python3.9/site-packages/esmvaltool/cmorizers/data/formatters/
-datasets/fluxcom.py
-... Found input file '/home/peter/data/RAWOBS/Tier3/FLUXCOM/GPP.ANN.CRUNCEPv6.monthly.*.nc'
+... Input data from: /work/bd0854/DATA/ESMValTool2/RAWOBS/Tier3/FLUXCOM
+... Output will be written to: /scratch/b/username/esmvaltool_output/data_formatting_20240527_132448
+/Tier3/FLUXCOM
+... Reformat script: /home/b/username/ESMValTool/ESMValTool/esmvaltool/cmorizers/data/formatters/
+datasets/fluxcom
+... CMORizing dataset FLUXCOM using Python script /home/b/username/ESMValTool/ESMValTool/esmvaltool/
+cmorizers/data/formatters/datasets/fluxcom.py
+... Found input file '/work/bd0854/DATA/ESMValTool2/RAWOBS/Tier3/FLUXCOM/GPP.ANN.CRUNCEPv6.monthly.
+*.nc'
 ... CMORizing variable 'gpp'
 ... Lmon
 ... Var is gpp
-... ... UserWarning: Ignoring netCDF variable 'GPP' invalid units 'gC m-2 day-1'
+... WARNING /work/bd0854/username/utils/mambaforge/envs/esmvaltool/lib/python3.11/site-packages/
+iris/fileformats/_nc_load_rules/helpers.py:913: _WarnComboIgnoringCfLoad: Ignoring invalid u
+nits 'gC m-2 day-1' on netCDF variable 'GPP'.
+warnings.warn(
 
 ... Fixing time...
 ... Fixing latitude...
 ... Fixing longitude...
 ... Flipping dimensional coordinate latitude...
 ... Saving file
-... Saving: /home/peter/esmvaltool_output/data_formatting_20220726_140216/Tier3/
-FLUXCOM/OBS_FLUXCOM_reanaly_ANN-v1_Lmon_gpp_200001-200012.nc
+... Saving: /scratch/b/username/esmvaltool_output/data_formatting_20240527_132448/Tier3/FLUXCOM/
+OBS_FLUXCOM_reanaly_ANN-v1_Lmon_gpp_198001-198012.nc
 ... Cube has lazy data [lazy is preferred]
+... WARNING /work/bd0854/username/utils/mambaforge/envs/esmvaltool/lib/python3.11/site-packages/
+iris/fileformats/netcdf/saver.py:2670: IrisDeprecation: Saving to netcdf with legacy-style a
+ttribute handling for backwards compatibility.
+This mode is deprecated since Iris 3.8, and will eventually be removed.
+Please consider enabling the new split-attributes handling mode, by setting 'iris.FUTURE.
+save_split_attrs = True'.
+warn_deprecated(message)
+
 ... CMORization of dataset FLUXCOM finished!
 ... Formatting successful for dataset FLUXCOM
+
 ~~~
 {: .output}
 
@@ -193,6 +216,12 @@ You can also see the path where ESMValTool stores the reformatting script:
 have a look at this file if you want. The script also uses a configuration file:
 `~/ESMValTool/esmvaltool/cmorizers/data/cmor_config/FLUXCOM.yml`.
 
+To get help on CMORizer commands, run the tool with:
+
+```bash
+esmvaltool data --help
+```
+
 ## Make a test recipe
 
 To verify that the data is correctly CMORized, we will make a simple test
@@ -617,17 +646,23 @@ If we now run the test recipe on our newly 'CMORized' data,
 esmvaltool run recipe_check_fluxcom.yml --config_file <path to config-user.yml> --log_level debug
 ```
 
-it should be able to find the correct file, but it does not succeed yet. The first
-thing that the ESMValTool CMOR checker brings up is:
+it should be able to find the correct file, but it does not succeed yet. The ESMValTool CMOR checker
+reports:
 
 ~~~
-iris.exceptions.UnitConversionError: Cannot convert from unknown units. The
-"units" attribute may be set directly.
+esmvalcore.cmor.check.CMORCheckError: There were errors in variable GPP:
+GPP: units should be kg m-2 s-1, not unknown
+lon: standard_name should be longitude, not None
+lat: standard_name should be latitude, not None
+lon: units should be degrees_east, not unknown
+lon: has values < valid_min = 0.0
+lat: units should be degrees_north, not unknown
+GPP: does not match coordinate rank
 ~~~
 {: .error}
 
-If you look closely at the error messages, you can see that this error concerns
-the units of the coordinates. ESMValTool tries to fix them automatically,
+If you look closely at the error messages, you can see the reasons for these errors,
+e.g. the units of the coordinates. ESMValTool tries to fix them automatically,
 but since no units are defined on the coordinates, this fails.
 
 The cmorizer utilities also include a function called `fix_coords`, but before
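
Note on this hunk: the message `lon: has values < valid_min = 0.0` means that some longitudes in the file are negative, which typically happens when the raw data is stored on a -180 to 180 grid while the checker expects values starting at 0. The sketch below shows the idea of wrapping such a coordinate and reordering the data along it. It uses plain Python lists for illustration only. `wrap_longitudes` is a hypothetical helper, and nothing here is meant to describe what `fix_coords` does internally.

```python
def wrap_longitudes(lons, values):
    """Map longitudes into [0, 360) and reorder values to match.

    Hypothetical illustration: `lons` and `values` are equal-length lists,
    where values[i] is the data point located at longitude lons[i].
    """
    # Python's modulo returns a non-negative result for a positive divisor,
    # so -90.0 % 360.0 gives 270.0.
    wrapped = [lon % 360.0 for lon in lons]
    # Sort so that the wrapped coordinate is increasing again.
    order = sorted(range(len(wrapped)), key=wrapped.__getitem__)
    return [wrapped[i] for i in order], [values[i] for i in order]
```

For a real dataset the same reordering has to be applied along the longitude axis of every data array, which is why this kind of fix is usually left to a library routine rather than written by hand.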
@@ -684,7 +719,7 @@ The next error is:
 
 ~~~
 esmvalcore.cmor.check.CMORCheckError: There were errors in variable GPP:
-Variable GPP units unknown can not be converted to kg m-2 s-1 in cube:
+GPP: units should be kg m-2 s-1, not unknown
 ~~~
 {: .error}
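
Note on this hunk: the formatting log in this commit shows the raw units as `'gC m-2 day-1'`, while the checker expects `kg m-2 s-1`. The arithmetic behind that conversion is sketched below, assuming that `gC` denotes grams of carbon and that the target is kilograms of carbon. The function name is hypothetical, and the episode's own fix is not part of this diff.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400
GRAMS_PER_KILOGRAM = 1000.0


def gc_per_day_to_kg_per_second(flux):
    """Convert a flux from g m-2 day-1 to kg m-2 s-1.

    Hypothetical helper for illustration: divide by 1000 to go from grams
    to kilograms, then by 86400 to go from per-day to per-second.
    """
    return flux / GRAMS_PER_KILOGRAM / SECONDS_PER_DAY
```

A flux of 1 gC m-2 day-1 therefore corresponds to roughly 1.157e-8 kg m-2 s-1.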