@bouweandela bouweandela commented Mar 16, 2025

Description

Improve support for CMORizing data for obs4MIPs

  • Automatically fill various global attributes
  • Check that required global attributes are present and valid
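
As an illustration of the two points above, here is a minimal sketch of filling and checking global attributes. It is not the code from this pull request: the function names and the `REQUIRED_ATTRIBUTES` tuple are hypothetical, the list of names is only a subset taken from the example output further down, and the `hdl:21.14102/` prefix is copied from the `tracking_id` shown there. The authoritative list of required attributes is the obs4MIPs specification.

```python
import re
import uuid
from datetime import datetime, timezone

# Hypothetical subset of required attributes, taken from the example output
# in this PR; the authoritative list is in the obs4MIPs specification.
REQUIRED_ATTRIBUTES = (
    "activity_id", "contact", "creation_date", "frequency", "grid",
    "grid_label", "institution_id", "license", "nominal_resolution",
    "product", "realm", "source_id", "tracking_id", "variable_id",
    "variant_label",
)

# Assumed format: the handle prefix seen in the example output plus a UUID.
TRACKING_ID_PATTERN = re.compile(
    r"^hdl:21\.14102/[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$"
)


def fill_global_attributes(attributes):
    """Return a copy of ``attributes`` with generated values filled in."""
    filled = dict(attributes)
    now = datetime.now(timezone.utc)
    # Only fill values that the user did not provide.
    filled.setdefault("creation_date", now.strftime("%Y-%m-%dT%H:%M:%SZ"))
    filled.setdefault("tracking_id", f"hdl:21.14102/{uuid.uuid4()}")
    return filled


def check_global_attributes(attributes):
    """Return a list of problems; an empty list means nothing was found."""
    problems = [
        f"missing required attribute '{name}'"
        for name in REQUIRED_ATTRIBUTES
        if not attributes.get(name)
    ]
    tracking_id = attributes.get("tracking_id")
    if tracking_id and not TRACKING_ID_PATTERN.match(tracking_id):
        problems.append(f"invalid tracking_id '{tracking_id}'")
    return problems
```

Returning a list of problems rather than raising on the first one lets a CMORizer report everything that is wrong with a configuration file in a single run.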

We recommend using a more recent version of the obs4MIPs CMOR tables than the one shipped with ESMValCore. If information is missing from the tables, they will need to be updated before running the CMORizer.

To make this work I had to fix some formatting issues in the file Tables/obs4MIPs_coordinate-ERA5levs.json.

Note: I tried to use the required global attributes from https://github.com/PCMDI/obs4MIPs-cmor-tables/blob/94d38431fbf3f9ad9e722cdaf8aab6050a440aa0/obs4MIPs_required_global_attributes.json, but they are not up to date with the specification.

Example configuration file modelled after the obs4MIPs ESA CCI SST dataset (using our script for reformatting a newer version of the same dataset):

---
# Common global attributes for Cmorizer output

attributes:
  activity_id: obs4MIPs
  contact: "Christopher Merchant, University of Reading, U.K. ([email protected])"
  dataset_contributor: "Add some text here"
  doi: "10.1038/s41597-024-03147-w"
  grid: "1x1 degree latitude x longitude"
  grid_label: gn
  license: "Data in this file produced by the University of Reading is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (https://creativecommons.org/licenses/). Use of the data must be acknowledged following guidelines found at https://doi.org/10.1038/s41597-019-0236-x . Further information about this data, including some limitations, can be found via https://doi.org/10.1038/s41597-019-0236-x"
  nominal_resolution: "100 km"
  product: observations
  realm: ocean
  variant_label: UReading-BE

# Variables to cmorize (here use only filename ending)
variables:
  tos:
    mip: [Oday, Omon]
    raw: analysed_sst
    frequency: day
    filename: ESACCI-L4_GHRSST-SSTdepth-OSTIA-GLOB_CDR3.0-v02.0-fv01.0.nc
    start_year: 1980
    end_year: 1980

This configuration works with our ESA CCI SST CMORizer after the following minor modifications:

--- a/esmvaltool/cmorizers/data/formatters/datasets/esacci_sst.py
+++ b/esmvaltool/cmorizers/data/formatters/datasets/esacci_sst.py
@@ -31,7 +31,6 @@ from ...utilities import (
     fix_coords,
     fix_var_metadata,
     save_variable,
-    set_global_atts,
 )
 
 logger = logging.getLogger(__name__)
@@ -110,8 +109,6 @@ def get_monthly_cube(
     time = cube.coord("time")
     time.bounds = get_time_bounds(time, vals["frequency"])
 
-    # set global attributes
-    set_global_atts(cube, attrs)
     # add comment to tosStderr
     if var == "tosStderr":
         cube.attributes["comment"] = (
@@ -140,6 +137,7 @@ def cmorization(in_dir, out_dir, cfg, cfg_user, start_date, end_date):
             logger.info("Processing year %s", year)
             glob_attrs["mip"] = vals["mip"][0]
             for month in range(start_date.month, end_date.month + 1):
+                glob_attrs["frequency"] = vals["frequency"]
                 monthly_cube = get_monthly_cube(
                     cfg,
                     var,
@@ -175,6 +173,7 @@ def cmorization(in_dir, out_dir, cfg, cfg_user, start_date, end_date):
             if "Stderr" not in var:
                 yearly_cube = concatenate(mon_cubes)
                 glob_attrs["mip"] = vals["mip"][1]
+                glob_attrs["frequency"] = "mon"
                 save_variable(
                     yearly_cube,
                     var,
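
The diff above sets `frequency` on the shared `glob_attrs` dictionary just before each cube is produced or saved: the configured value for the first MIP table and `"mon"` for the second. A minimal sketch of the same idea, using copies instead of in-place updates, could look like this. The helper name `attributes_per_table` is hypothetical and not part of ESMValTool.

```python
def attributes_per_table(glob_attrs, vals):
    """Build separate global attributes for the daily and monthly output.

    ``vals`` is the per-variable section of the configuration file, with
    ``mip`` listing the daily table first and the monthly table second.
    """
    daily = {
        **glob_attrs,
        "mip": vals["mip"][0],
        "frequency": vals["frequency"],
    }
    monthly = {
        **glob_attrs,
        "mip": vals["mip"][1],
        "frequency": "mon",
    }
    return daily, monthly


# Values taken from the example configuration file in this PR.
glob_attrs = {"activity_id": "obs4MIPs", "realm": "ocean"}
vals = {"mip": ["Oday", "Omon"], "frequency": "day"}
daily, monthly = attributes_per_table(glob_attrs, vals)
```

Working on copies avoids the situation where a `frequency` set for one output silently carries over to the next one.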

Example output

ncdump -h /home/bandela/esmvaltool_output/data_formatting_20250403_121207/obs4MIPs/UReading/ESA-CCI-SST-v2-1/mon/tos/100km/v20250403/tos_mon_ESA-CCI-SST-v2-1_UReading-BE_gn_198001-198002.nc
netcdf tos_mon_ESA-CCI-SST-v2-1_UReading-BE_gn_198001-198002 {
dimensions:
	time = UNLIMITED ; // (2 currently)
	lat = 360 ;
	lon = 720 ;
	bnds = 2 ;
variables:
	float tos(time, lat, lon) ;
		tos:_FillValue = 1.e+20f ;
		tos:standard_name = "sea_surface_temperature" ;
		tos:long_name = "Sea Surface Temperature" ;
		tos:units = "degC" ;
		tos:cell_methods = "month_number: year: mean" ;
	double time(time) ;
		time:axis = "T" ;
		time:bounds = "time_bnds" ;
		time:units = "days since 1950-1-1 00:00:00" ;
		time:standard_name = "time" ;
		time:long_name = "time" ;
		time:calendar = "standard" ;
	double time_bnds(time, bnds) ;
	double lat(lat) ;
		lat:axis = "Y" ;
		lat:bounds = "lat_bnds" ;
		lat:units = "degrees_north" ;
		lat:standard_name = "latitude" ;
		lat:long_name = "latitude" ;
	double lat_bnds(lat, bnds) ;
	double lon(lon) ;
		lon:axis = "X" ;
		lon:bounds = "lon_bnds" ;
		lon:units = "degrees_east" ;
		lon:standard_name = "longitude" ;
		lon:long_name = "longitude" ;
	double lon_bnds(lon, bnds) ;

// global attributes:
		:Conventions = "CF-1.7" ;
		:activity_id = "obs4MIPs" ;
		:contact = "Christopher Merchant, University of Reading, U.K. ([email protected])" ;
		:creation_date = "2025-04-03T12:12:07Z" ;
		:data_specs_version = "2.5" ;
		:dataset_contributor = "Add some text here" ;
		:doi = "10.1038/s41597-024-03147-w" ;
		:frequency = "mon" ;
		:grid = "1x1 degree latitude x longitude" ;
		:grid_label = "gn" ;
		:institution = "University of Reading, Reading, UK" ;
		:institution_id = "UReading" ;
		:license = "Data in this file produced by the University of Reading is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (https://creativecommons.org/licenses/). Use of the data must be acknowledged following guidelines found at https://doi.org/10.1038/s41597-019-0236-x . Further information about this data, including some limitations, can be found via https://doi.org/10.1038/s41597-019-0236-x" ;
		:nominal_resolution = "100 km" ;
		:processing_code_location = "https://github.com/ESMValGroup/ESMValTool/tree/2.13.0" ;
		:product = "observations" ;
		:realm = "ocean" ;
		:references = "10.1038/s41597-024-03147-w" ;
		:region = "global_ocean" ;
		:source = "ESA CCI SST v2.1 (2019): Sea Surface Temperature (SST) from the European Space Agency Climate Change Initiative (ESA CCI)" ;
		:source_id = "ESA-CCI-SST-v2-1" ;
		:source_label = "ESA-CCI-SST" ;
		:source_type = "satellite_retrieval" ;
		:source_version_number = "v2.1" ;
		:tracking_id = "hdl:21.14102/0e4530dd-fafa-4eaf-ac2c-09bc7e0290cc" ;
		:variable_id = "tos" ;
		:variant_label = "UReading-BE" ;
}
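
The output path in the `ncdump` command above appears to be assembled from the global attributes. The sketch below reproduces that layout; it is inferred from this single example only, and the function name `obs4mips_path` and its `version` and `timerange` parameters are hypothetical rather than part of ESMValTool or the obs4MIPs specification.

```python
from pathlib import Path


def obs4mips_path(attrs, version, timerange):
    """Build the relative output path from global attributes.

    The directory and file name templates are inferred from the example
    output in this PR, not taken from the obs4MIPs specification.
    """
    directory = Path(
        attrs["activity_id"],
        attrs["institution_id"],
        attrs["source_id"],
        attrs["frequency"],
        attrs["variable_id"],
        # "100 km" appears as "100km" in the example directory name.
        attrs["nominal_resolution"].replace(" ", ""),
        version,
    )
    filename = "_".join(
        [
            attrs["variable_id"],
            attrs["frequency"],
            attrs["source_id"],
            attrs["variant_label"],
            attrs["grid_label"],
            timerange,
        ]
    ) + ".nc"
    return directory / filename


# Attribute values copied from the example output above.
example_attrs = {
    "activity_id": "obs4MIPs",
    "institution_id": "UReading",
    "source_id": "ESA-CCI-SST-v2-1",
    "frequency": "mon",
    "variable_id": "tos",
    "nominal_resolution": "100 km",
    "variant_label": "UReading-BE",
    "grid_label": "gn",
}
path = obs4mips_path(example_attrs, "v20250403", "198001-198002")
```

With these values, `path.as_posix()` gives the part of the path in the `ncdump` command that follows the `data_formatting_20250403_121207` directory.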

TODO:

  • Add tests

  • Update documentation/add instructions on how to CMORize obs4MIPs dataset

  • Closes #issue_number

  • Link to documentation:


Before you get started

Checklist

It is the responsibility of the author to make sure the pull request is ready to review. The icons indicate whether the item will be subject to the 🛠 Technical or 🧪 Scientific review.


To help with the number of pull requests:

@bouweandela bouweandela force-pushed the improve-obs4mips-support branch from 7ff41c2 to 1a24aa0 Compare April 3, 2025 14:10