Issue with the usage of wildcards to select range of datasets #4239

@jlenh

I am trying to analyze decadal prediction output from DCPP, but I am having difficulty selecting the datasets I want. I am building on the existing decadal example recipe.

With a single ensemble member, as in the example, the recipe runs without issues. However, putting a wildcard in the ensemble facet to select a range of members (ensemble: r*i1p1f1) makes the recipe fail. The timerange facet is set to timerange: '*' to select the whole time range available for each sub-experiment (the range differs between sub-experiments and between models).

Is this because different sub-experiments (i.e., forecast initialization years) might have a different set of members for each model? Is it related to the timerange facet as well? Or is it the combination of the ensemble and sub_experiment facets that ESMValTool needs to expand?
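For reference, the range notation used in sub_experiment: 's(1980:2018)' can be pictured as expanding into one value per start year, which matches the 'sub_experiment': 's1980' entry shown in the error further down. The following is only a minimal sketch of that idea, not ESMValCore's actual implementation; `expand_range` is a hypothetical helper written for illustration.

```python
import re


def expand_range(value: str) -> list[str]:
    """Expand 'prefix(start:end)suffix' into one string per integer.

    Hypothetical helper for illustration only; values without a
    '(start:end)' part are returned unchanged.
    """
    match = re.fullmatch(r"(.*)\((\d+):(\d+)\)(.*)", value)
    if match is None:
        return [value]
    prefix, start, end, suffix = match.groups()
    return [f"{prefix}{i}{suffix}" for i in range(int(start), int(end) + 1)]


sub_experiments = expand_range("s(1980:2018)")
print(len(sub_experiments), sub_experiments[0], sub_experiments[-1])
```

In this sketch a wildcard such as r*i1p1f1 is left untouched, because it can only be resolved against the files that actually exist for each sub-experiment.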

The recipe is recipe_decadal.yml from the latest release of ESMValTool; the only change is the wildcard added to the ensemble member:

# ESMValTool
---
documentation:
  title: Example recipe that loads DCPP data.
  description: |
    This is an example recipe to deal with DCPP data. Computes the global
    mean of tas and compares it against ERA-Interim for a set of timeranges.
    Reproduces the examples given in deliverable D9.4 of ISENES-3.
  authors:
    - loosveldt-tomas_saskia
  maintainer:
    - loosveldt-tomas_saskia
  projects:
    - isenes3


preprocessors:
  pptas:
    area_statistics:
      operator: 'mean'
diagnostics:
  first_example:
    additional_datasets:
      - &dcpp {dataset: EC-Earth3, project: CMIP6, exp: dcppA-hindcast, ensemble: r*i1p1f1,
               sub_experiment: 's(1980:2018)', timerange: '*'}
      - &obs {dataset: ERA-Interim, project: OBS6, type: reanaly, version: 1, tier: 3,
              timerange: '198011/201812'}
    variables:
      tas:
        grid: gr
        mip: Amon
        preprocessor: pptas
    scripts:
      first_example:
        script: examples/decadal_example.py

which produces the following error:

2025-11-10 12:54:58,356 UTC [2875695] INFO    ----------------------------------------------------------------------
2025-11-10 12:54:58,388 UTC [2875695] INFO    Running tasks using at most 32 processes
2025-11-10 12:54:58,388 UTC [2875695] INFO    If your system hangs during execution, it may not have enough memory for keeping this number of tasks in memory.
2025-11-10 12:54:58,388 UTC [2875695] INFO    If you experience memory problems, try reducing 'max_parallel_tasks' in your configuration.
2025-11-10 12:54:58,882 UTC [2875695] INFO    Maximum memory used (estimate): 0.0 GB
2025-11-10 12:54:58,882 UTC [2875695] INFO    Sampled every second. It may be inaccurate if short but high spikes in memory consumption occur.
2025-11-10 12:54:58,883 UTC [2875695] ERROR   Unable to replace ensemble=r*i1p1f1, timerange=* by a value for
Dataset:
{'dataset': 'EC-Earth3',
 'project': 'CMIP6',
 'mip': 'Amon',
 'short_name': 'tas',
 'ensemble': 'r*i1p1f1',
 'exp': 'dcppA-hindcast',
 'grid': 'gr',
 'preprocessor': 'pptas',
 'sub_experiment': 's1980',
 'timerange': '*'}
supplementaries:
  {'dataset': 'EC-Earth3',
   'project': 'CMIP6',
   'mip': '*',
   'short_name': 'areacella',
   'activity': '*',
   'ensemble': '*',
   'exp': '*',
   'grid': 'gr',
   'institute': '*',
   'sub_experiment': 's1980',
   'timerange': '*'}
session: 'recipe_decadal_20251110_125457'
Do the paths to the files:
/home/esgf/CMIP6/DCPP/EC-Earth-Consortium/EC-Earth3/dcppA-hindcast/s1980-r10i1p1f1/Amon/tas/gr/v20201216/tas_Amon_EC-Earth3_dcppA-hindcast_s1980-r10i1p1f1_gr_198011-198110.nc with facets: {'project': 'CMIP6', 'activity': 'DCPP', 'institute': 'EC-Earth-Consortium', 'dataset': 'EC-Earth3', 'exp': 'dcppA-hindcast', 'ensemble': 's1980-r10i1p1f1', 'mip': 'Amon', 'short_name': 'tas', 'grid': 'gr', 'version': 'v20201216'}
/home/esgf/CMIP6/DCPP/EC-Earth-Consortium/EC-Earth3/dcppA-hindcast/s1980-r10i1p1f1/Amon/tas/gr/v20201216/tas_Amon_EC-Earth3_dcppA-hindcast_s1980-r10i1p1f1_gr_198111-198210.nc with facets: {'project': 'CMIP6', 'activity': 'DCPP', 'institute': 'EC-Earth-Consortium', 'dataset': 'EC-Earth3', 'exp': 'dcppA-hindcast', 'ensemble': 's1980-r10i1p1f1', 'mip': 'Amon', 'short_name': 'tas', 'grid': 'gr', 'version': 'v20201216'}
/home/esgf/CMIP6/DCPP/EC-Earth-Consortium/EC-Earth3/dcppA-hindcast/s1980-r10i1p1f1/Amon/tas/gr/v20201216/tas_Amon_EC-Earth3_dcppA-hindcast_s1980-r10i1p1f1_gr_198211-198310.nc with facets: {'project': 'CMIP6', 'activity': 'DCPP', 'institute': 'EC-Earth-Consortium', 'dataset': 'EC-Earth3', 'exp': 'dcppA-hindcast', 'ensemble': 's1980-r10i1p1f1', 'mip': 'Amon', 'short_name': 'tas', 'grid': 'gr', 'version': 'v20201216'}
...
provide the missing facet values?
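One detail visible in the log above: the facets derived from the file paths report 'ensemble': 's1980-r10i1p1f1', i.e. the sub-experiment prefix is part of the ensemble value, while the recipe asks for ensemble r*i1p1f1 and sub_experiment s1980 separately. The sketch below only illustrates why a glob-style comparison of those two strings would not match. It assumes glob-style matching and is not taken from ESMValCore's code; the combined pattern is a hypothetical illustration, not a confirmed fix.

```python
from fnmatch import fnmatchcase

# Values copied from the error message above.
requested = {"ensemble": "r*i1p1f1", "sub_experiment": "s1980"}
found_on_disk = "s1980-r10i1p1f1"  # 'ensemble' facet parsed from the file path

# Assumption: wildcard facets are compared with glob-style matching.
# The pattern must match the whole string, which starts with 's', not 'r'.
direct_match = fnmatchcase(found_on_disk, requested["ensemble"])

# Hypothetical alternative: prefix the pattern with the sub-experiment.
combined_pattern = f"{requested['sub_experiment']}-{requested['ensemble']}"
combined_match = fnmatchcase(found_on_disk, combined_pattern)

print(direct_match, combined_match)
```

If this is indeed what happens internally, it would point to the combination of the ensemble and sub_experiment facets rather than to the timerange wildcard alone.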
