Merged
69 changes: 43 additions & 26 deletions doc/quickstart/configure.rst
@@ -685,22 +685,26 @@ Example:

The following project-specific options are available:

+-------------------------------+----------------------------------------+-----------------------------+----------------------------------------+
| Option | Description | Type | Default value |
+===============================+========================================+=============================+========================================+
| ``data`` | Data sources are used to find input | :obj:`dict` | {} |
| | data and have to be configured before | | |
| | running the tool. See | | |
| | :ref:`config-data-sources` for | | |
| | details. | | |
+-------------------------------+----------------------------------------+-----------------------------+----------------------------------------+
| ``extra_facets`` | Extra key-value pairs ("*facets*") | :obj:`dict` | See |
| | added to datasets in addition to the | | :ref:`config-extra-facets-defaults` |
| | facets defined in the recipe. See | | |
| | :ref:`config-extra-facets` for | | |
| | details. | | |
+-------------------------------+----------------------------------------+-----------------------------+----------------------------------------+
.. list-table::
:widths: 15 50 15 20
:header-rows: 1

* - Option
- Description
- Type
- Default value
* - ``data``
- Data sources are used to find input data and have to be configured before running the tool. Refer to :ref:`config-data-sources` for details.
- :obj:`dict`
- ``{}``
* - ``extra_facets``
- Extra key-value pairs ("*facets*") added to datasets in addition to the facets defined in the recipe. Refer to :ref:`config-extra-facets` for details.
- :obj:`dict`
- Refer to :ref:`config-extra-facets-defaults`.
* - ``preprocessor_filename_template``
- A template defining the filenames to use for :ref:`preprocessed data <preprocessed_datasets>` when running a :ref:`recipe <recipe>`. Refer to :ref:`config-preprocessor-filename-template` for details.
- :obj:`str`
- Refer to :ref:`config-preprocessor-filename-template`.

.. _config-data-sources:

@@ -954,6 +958,30 @@ Default extra facets are specified in ``extra_facets_*.yml`` files located in
<https://github.com/ESMValGroup/ESMValCore/tree/main/esmvalcore/config/configurations/defaults>`__
directory.

.. _config-preprocessor-filename-template:

Preprocessor output filenames
-----------------------------

The filename to use for saving :ref:`preprocessed data <preprocessed_datasets>`
when running a :ref:`recipe <recipe>` is configured using ``preprocessor_filename_template``,
similar to the filename template in :class:`esmvalcore.io.local.LocalDataSource`.

Default values are provided in ``defaults/preprocessor_filename_template.yml``,
for example:

.. literalinclude:: ../configurations/defaults/preprocessor_filename_template.yml
:language: yaml
:caption: First few lines of ``defaults/preprocessor_filename_template.yml``
:end-before: # Observational

The facet names from the template are replaced with the facet values from the
recipe to create a filename. The extension ``.nc`` (and if applicable, a start
and end time) will automatically be appended to the filename.

If no ``preprocessor_filename_template`` is configured for a project, the facets
describing the dataset in the recipe, as stored in
:attr:`esmvalcore.dataset.Dataset.minimal_facets`, are used.
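
As a rough illustration of the replacement described above, the following Python sketch expands a template into a file path. It mirrors the steps of the ``_get_output_file`` helper that this change removes from ``esmvalcore/io/local.py``, but uses ``str.format`` as a stand-in for ESMValCore's own tag replacement. The function name ``sketch_preprocessor_filename`` and all facet values are hypothetical, and the new ``_get_preprocessor_filename`` may differ in detail, since its implementation is not shown in this diff.

```python
from pathlib import Path


def sketch_preprocessor_filename(template, facets, preproc_dir):
    """Expand a filename template with facet values (illustrative only)."""
    facets = dict(facets)  # work on a copy, leave the caller's dict untouched
    # Several experiment names are joined with a dash, as in the old helper.
    if isinstance(facets.get("exp"), (list, tuple)):
        facets["exp"] = "-".join(facets["exp"])
    # Stand-in for the real tag replacement: plain str.format.
    name = template.format(**facets)
    # A start and end time are appended if a timerange is present.
    if "timerange" in facets:
        name = f"{name}_{facets['timerange'].replace('/', '-')}"
    # The extension .nc is always appended.
    return Path(
        preproc_dir,
        facets.get("diagnostic", ""),
        facets.get("variable_group", ""),
        f"{name}.nc",
    )


# Hypothetical facet values for a CMIP6 dataset.
example_facets = {
    "project": "CMIP6",
    "dataset": "CanESM5",
    "mip": "Amon",
    "exp": ["historical", "ssp585"],
    "ensemble": "r1i1p1f1",
    "short_name": "tas",
    "grid": "gn",
    "timerange": "1850/2100",
    "diagnostic": "diag",
    "variable_group": "tas",
}
cmip6_template = "{project}_{dataset}_{mip}_{exp}_{ensemble}_{short_name}_{grid}"
example_path = sketch_preprocessor_filename(cmip6_template, example_facets, "preproc")
print(example_path.as_posix())
```

Facets that do not occur in the template (here ``timerange``, ``diagnostic`` and ``variable_group``) are simply ignored by the replacement step and only influence the suffix and the subdirectories.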

.. _config-esgf:

@@ -1113,18 +1141,9 @@ Example of the CMIP6 project configuration:
.. code-block:: yaml

CMIP6:
output_file: '{project}_{dataset}_{mip}_{exp}_{ensemble}_{short_name}'
cmor_type: 'CMIP6'
cmor_strict: true

Preprocessor output files
-------------------------

The filename to use for preprocessed data is configured using ``output_file``,
similar to the filename template in :class:`esmvalcore.io.local.LocalDataSource`.
Note that the extension ``.nc`` (and if applicable, a start and end time) will
automatically be appended to the filename.

.. _cmor_table_configuration:

Project CMOR table configuration
@@ -1233,13 +1252,11 @@ Example:

native6:
cmor_strict: false
output_file: '{project}_{dataset}_{type}_{version}_{mip}_{short_name}'
cmor_type: 'CMIP6'
cmor_default_table_prefix: 'CMIP6_'

ICON:
cmor_strict: false
output_file: '{project}_{dataset}_{exp}_{var_type}_{mip}_{short_name}'
cmor_type: 'CMIP6'
cmor_default_table_prefix: 'CMIP6_'

4 changes: 2 additions & 2 deletions doc/quickstart/find_data.rst
@@ -834,5 +834,5 @@ a corresponding entry in the configuration file could look like:

The same replacement mechanism can be employed everywhere where tags can be
used, particularly in ``dirname_template`` and ``filename_template`` in
:class:`esmvalcore.io.local.LocalDataSource`, and in ``output_file`` in
:ref:`config-developer.yml <config-developer>`.
:class:`esmvalcore.io.local.LocalDataSource`, and in ``preprocessor_filename_template``
under :ref:`config-projects`.
2 changes: 1 addition & 1 deletion doc/quickstart/output.rst
@@ -8,7 +8,7 @@ The location is determined by the ``output_dir`` :ref:`configuration option
<config_options>`, the recipe name, and the date and time, using the
format: ``YYYYMMDD_HHMMSS``.

For instance, a typical output location would be:
For instance, a typical output location (:attr:`~esmvalcore.config.Session.session_dir`) would be:
``output_directory/recipe_ocean_amoc_20190118_1027/``

This is effectively produced by the combination:
8 changes: 2 additions & 6 deletions esmvalcore/_recipe/recipe.py
@@ -27,7 +27,6 @@
GRIB_FORMATS,
_dates_to_timerange,
_get_multiproduct_filename,
_get_output_file,
_parse_period,
_truncate_dates,
)
@@ -38,6 +37,7 @@
MULTI_MODEL_FUNCTIONS,
PreprocessingTask,
PreprocessorFile,
_get_preprocessor_filename,
)
from esmvalcore.preprocessor._area import _update_shapefile_path
from esmvalcore.preprocessor._multimodel import _get_stat_identifier
@@ -678,11 +678,7 @@ def _get_preprocessor_products(
_schedule_for_download(input_datasets)
_log_input_files(input_datasets)
logger.info("Found input files for %s", dataset.summary(shorten=True))

filename = _get_output_file(
dataset.facets,
dataset.session.preproc_dir,
)
filename = _get_preprocessor_filename(dataset)
product = PreprocessorFile(
filename=filename,
attributes=dataset.facets,
38 changes: 25 additions & 13 deletions esmvalcore/config/_config_object.py
@@ -11,7 +11,6 @@
import dask.config

import esmvalcore
from esmvalcore.config._config import load_config_developer
from esmvalcore.config._config_validators import (
_deprecated_options_defaults,
_deprecators,
@@ -129,10 +128,6 @@ def load_from_dirs(self, dirs: Iterable[str | Path]) -> None:
new_config_dict = self._get_config_dict_from_dirs(dirs)
self.clear()
self.update(new_config_dict)
# Add known projects from config-developer file while we still have it.
for project in load_config_developer(self["config_developer_file"]):
if project not in self["projects"]:
self["projects"][project] = {}
self.check_missing()

def reload(self) -> None:
@@ -268,12 +263,26 @@ class Session(ValidatedConfig):
_deprecated_defaults = _deprecated_options_defaults

relative_preproc_dir = Path("preproc")
"""Relative path to the preprocessor output directory, with respect to :attr:`session_dir`."""

relative_work_dir = Path("work")
"""Relative path to diagnostic script output directory, with respect to :attr:`session_dir`."""

relative_plot_dir = Path("plots")
"""Relative path to diagnostic script plot directory, with respect to :attr:`session_dir`."""

relative_run_dir = Path("run")
"""Relative path to the directory with information about the run, with respect to :attr:`session_dir`."""

relative_main_log = Path("run", "main_log.txt")
"""Relative path to the log file, with respect to :attr:`session_dir`."""

relative_main_log_debug = Path("run", "main_log_debug.txt")
"""Relative path to the debug log file, with respect to :attr:`session_dir`."""

relative_cmor_log = Path("run", "cmor_log.txt")
"""Relative path to the log file with CMOR check messages, with respect to :attr:`session_dir`."""

_relative_fixed_file_dir = Path("preproc", "fixed_files")

def __init__(self, config: dict, name: str = "session") -> None:
@@ -304,42 +313,45 @@ def set_session_name(self, name: str = "session") -> None:

@property
def session_dir(self):
"""Return session directory."""
"""Session directory.

This is a uniquely named directory inside the :ref:`output directory <outputdata>`.
"""
return self["output_dir"] / self.session_name

@property
def preproc_dir(self):
"""Return preproc directory."""
"""Directory with preprocessor output files."""
return self.session_dir / self.relative_preproc_dir

@property
def work_dir(self):
"""Return work directory."""
"""Directory with diagnostic script output files."""
return self.session_dir / self.relative_work_dir

@property
def plot_dir(self):
"""Return plot directory."""
"""Directory with diagnostic script plot files."""
return self.session_dir / self.relative_plot_dir

@property
def run_dir(self):
"""Return run directory."""
"""Directory containing information about the run."""
return self.session_dir / self.relative_run_dir

@property
def main_log(self):
"""Return main log file."""
"""Path to the log file."""
return self.session_dir / self.relative_main_log

@property
def main_log_debug(self):
"""Return main log debug file."""
"""Path to the debug log file."""
return self.session_dir / self.relative_main_log_debug

@property
def cmor_log(self):
"""Return CMOR log file."""
"""Path to the log file with CMOR check messages."""
return self.session_dir / self.relative_cmor_log

@property
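
A minimal sketch of how the session paths documented in this hunk fit together, using only ``pathlib``. The relative paths are the class attributes shown above; the output directory and session name are made-up example values, and the dictionary is a hypothetical stand-in for the ``Session`` properties rather than the class itself.

```python
from pathlib import Path

# Example values only: in ESMValCore these come from the ``output_dir``
# option and the session name (recipe name plus date and time).
output_dir = Path("output_directory")
session_name = "recipe_ocean_amoc_20190118_1027"

# Equivalent of Session.session_dir.
session_dir = output_dir / session_name

# Relative paths as defined on the Session class in the diff.
relative_paths = {
    "preproc_dir": Path("preproc"),
    "work_dir": Path("work"),
    "plot_dir": Path("plots"),
    "run_dir": Path("run"),
    "main_log": Path("run", "main_log.txt"),
    "main_log_debug": Path("run", "main_log_debug.txt"),
    "cmor_log": Path("run", "cmor_log.txt"),
}

# Each property is the session directory joined with its relative path.
session_paths = {name: session_dir / rel for name, rel in relative_paths.items()}

for name, path in session_paths.items():
    print(f"{name}: {path.as_posix()}")
```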
1 change: 1 addition & 0 deletions esmvalcore/config/_config_validators.py
@@ -375,6 +375,7 @@ def validate_projects(
options_for_project: dict[str, Callable[[Any], Any]] = {
"data": validate_dict, # TODO: try to create data sources here
"extra_facets": validate_dict,
"preprocessor_filename_template": validate_string,
}
for project, project_config in mapping.items():
for option, val in project_config.items():
@@ -0,0 +1,33 @@
# Templates for the filenames used to write preprocessor output.
projects:
# ESGF projects.
CMIP3:
preprocessor_filename_template: "{project}_{institute}_{dataset}_{mip}_{exp}_{ensemble}_{short_name}"
CMIP5:
preprocessor_filename_template: "{project}_{dataset}_{mip}_{exp}_{ensemble}_{short_name}"
CMIP6:
preprocessor_filename_template: "{project}_{dataset}_{mip}_{exp}_{ensemble}_{short_name}_{grid}"
CORDEX:
preprocessor_filename_template: "{project}_{institute}_{dataset}_{rcm_version}_{driver}_{domain}_{mip}_{exp}_{ensemble}_{short_name}"
obs4MIPs:
preprocessor_filename_template: "{project}_{dataset}_{short_name}"
# Observational and reanalysis data that has been CMORized by ESMValTool according to the CMIP5 standard.
OBS:
preprocessor_filename_template: "{project}_{dataset}_{type}_{version}_{mip}_{short_name}"
# Observational and reanalysis data that has been CMORized by ESMValTool according to the CMIP6 standard.
OBS6:
preprocessor_filename_template: "{project}_{dataset}_{type}_{version}_{mip}_{short_name}"
# Observational and reanalysis data that can be read in its native format by ESMValCore.
native6:
preprocessor_filename_template: "{project}_{dataset}_{type}_{version}_{mip}_{short_name}"
# Data from various climate models in their native output format.
ACCESS:
preprocessor_filename_template: "{project}_{dataset}_{mip}_{exp}_{institute}_{sub_dataset}_{freq_attribute}_{short_name}"
CESM:
preprocessor_filename_template: "{project}_{dataset}_{case}_{gcomp}_{scomp}_{type}_{mip}_{short_name}"
EMAC:
preprocessor_filename_template: "{project}_{dataset}_{exp}_{channel}_{mip}_{short_name}"
ICON:
preprocessor_filename_template: "{project}_{dataset}_{exp}_{var_type}_{mip}_{short_name}"
IPSLCM:
preprocessor_filename_template: "{dataset}_{account}_{model}_{status}_{exp}_{simulation}_{freq}_{short_name}"
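
When writing a template for a new project, it can help to list the facets a template refers to and compare them with the facets a dataset provides. The sketch below does this with the standard library's ``string.Formatter``; the helper names ``template_facets`` and ``missing_facets`` are hypothetical and not part of ESMValCore, and the sketch assumes templates that contain only plain ``{facet}`` placeholders like the defaults above.

```python
from string import Formatter


def template_facets(template):
    """Return the placeholder names in a template, in order of appearance."""
    return [field for _, field, _, _ in Formatter().parse(template) if field]


def missing_facets(template, facets):
    """Return the placeholders in the template that have no value in facets."""
    return [name for name in template_facets(template) if name not in facets]


obs_template = "{project}_{dataset}_{type}_{version}_{mip}_{short_name}"
# Hypothetical facets that lack ``type`` and ``version``.
obs_facets = {
    "project": "OBS6",
    "dataset": "ERA5",
    "mip": "Amon",
    "short_name": "tas",
}

print(template_facets(obs_template))
print(missing_facets(obs_template, obs_facets))
```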
8 changes: 4 additions & 4 deletions esmvalcore/dataset.py
@@ -27,8 +27,8 @@
)
from esmvalcore.config._data_sources import _get_data_sources
from esmvalcore.exceptions import InputFilesNotFound, RecipeError
from esmvalcore.io.local import _dates_to_timerange, _get_output_file
from esmvalcore.preprocessor import preprocess
from esmvalcore.io.local import _dates_to_timerange
from esmvalcore.preprocessor import _get_preprocessor_filename, preprocess

if TYPE_CHECKING:
from collections.abc import Iterable, Iterator, Sequence
@@ -815,7 +815,7 @@ def load(self) -> Cube:
supplementary_cube = supplementary_dataset._load() # noqa: SLF001
supplementary_cubes.append(supplementary_cube)

output_file = _get_output_file(self.facets, self.session.preproc_dir)
output_file = _get_preprocessor_filename(self)
cubes = preprocess(
[cube],
"add_supplementary_variables",
@@ -833,7 +833,7 @@ def _load(self) -> Cube:
msg = check.get_no_data_message(self)
raise InputFilesNotFound(msg)

output_file = _get_output_file(self.facets, self.session.preproc_dir)
output_file = _get_preprocessor_filename(self)
fix_dir_prefix = Path(
self.session._fixed_file_dir, # noqa: SLF001
self._get_joined_summary_facets("_", join_lists=True) + "_",
22 changes: 0 additions & 22 deletions esmvalcore/io/local.py
@@ -58,7 +58,6 @@
from netCDF4 import Dataset

import esmvalcore.io.protocol
from esmvalcore.config._config import get_project_config
from esmvalcore.exceptions import RecipeError
from esmvalcore.iris_helpers import ignore_warnings_context

@@ -697,27 +696,6 @@ def _templates_to_regex(self) -> str:
return pattern


def _get_output_file(variable: dict[str, Any], preproc_dir: Path) -> Path:
"""Return the full path to the output (preprocessed) file."""
cfg = get_project_config(variable["project"])

# Join different experiment names
if isinstance(variable.get("exp"), (list, tuple)):
variable = dict(variable)
variable["exp"] = "-".join(variable["exp"])
outfile = _replace_tags(cfg["output_file"], variable)[0]
if "timerange" in variable:
timerange = variable["timerange"].replace("/", "-")
outfile = Path(f"{outfile}_{timerange}")
outfile = Path(f"{outfile}.nc")
return Path(
preproc_dir,
variable.get("diagnostic", ""),
variable.get("variable_group", ""),
outfile,
)


def _get_multiproduct_filename(attributes: dict, preproc_dir: Path) -> Path:
"""Get ensemble/multi-model filename depending on settings."""
relevant_keys = [