Performance improvement: recipe_easy_ipcc.yml #2300

@bouweandela

This issue keeps track of the performance improvements implemented for recipe_easy_ipcc.yml (documentation) as part of the ESiWACE3 service project. Example output is available here.

Settings

To run the recipe, the following settings are used:

~/.esmvaltool/config-user.yml

max_parallel_tasks: 1

~/.esmvaltool/dask.yml

cluster:
  type: dask_jobqueue.SLURMCluster
  queue: compute
  account: bd0854
  cores: 128
  memory: 256GiB
  processes: 32
  interface: ib0
  local_directory: /scratch/b/b381141/dask-tmp
  n_workers: 32
  walltime: '8:00:00'
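
As a rough sanity check on the cluster settings above, the per-worker resources can be worked out from the configuration values. The sketch below assumes the usual dask_jobqueue interpretation, in which `cores` and `memory` describe one whole job and `processes` splits that job into worker processes, and that `n_workers` counts worker processes rather than jobs. The helper name `per_worker_resources` is hypothetical and exists only for illustration.

```python
import math

# Values copied from the dask.yml shown above.
CORES_PER_JOB = 128
MEMORY_PER_JOB_GIB = 256
PROCESSES_PER_JOB = 32
N_WORKERS = 32


def per_worker_resources(cores, memory_gib, processes):
    """Return (threads, memory in GiB) available to a single worker process.

    Assumption: a job's cores and memory are divided evenly between its
    worker processes.
    """
    threads = cores // processes
    memory = memory_gib / processes
    return threads, memory


threads, memory = per_worker_resources(
    CORES_PER_JOB, MEMORY_PER_JOB_GIB, PROCESSES_PER_JOB
)

# Assumption: n_workers counts worker processes, so the number of SLURM
# jobs is the worker count divided by the processes per job, rounded up.
jobs = math.ceil(N_WORKERS / PROCESSES_PER_JOB)

print(f"{threads} threads and {memory:g} GiB per worker, {jobs} SLURM job(s)")
```

Under these assumptions each worker gets 4 threads and 8 GiB of memory, and all 32 workers fit into a single job.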

Profiling

The baseline runtime is about 4 hours.

Conda environment used for profiling: environment.yml.

A smaller version of the recipe was used for profiling runs. It uses only the historical experiment with data between 1950 and 2000 (recipe file).

The profiles attached below were created with py-spy using the command

py-spy record \
--idle \
--rate 10 \
--subprocesses \
--format speedscope \
esmvaltool run examples/recipe_easy_ipcc.yml

and can be viewed with https://www.speedscope.app:

  1. Initial run: profile
  2. Faster coordinate comparisons for concatenate ("Faster and simpler iris.util.array_equal", SciTools/iris#5610) and faster CMOR fixes ("Faster coordinate checks and longitude fix", #2264): profile
  3. Faster cube printing and lazy cells ("Faster trivial equality checks for coordinates and arrays", SciTools/iris#5691, and "Make the Coord.cell method lazy", SciTools/iris#5693): profile
  4. "Do not realize cell measures and ancillary variables in concatenate" (SciTools/iris#6010), "Parallel concatenate" (SciTools/iris#5926), and "Faster time coordinate categorization" (SciTools/iris#5999): first run profile, second run profile
  5. Runs with the main branches of iris and ESMValCore on 2024-09-12, including "Load esmvalcore.dataset.Dataset objects in parallel using Dask" (#2517) and "Save all files in a task at the same time to avoid recomputing intermediate results" (#2522): profile
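
The profiles listed above can also be summarized outside the browser. The following is a minimal sketch, assuming the files follow the speedscope JSON layout with a `shared.frames` list and `sampled` profiles that carry parallel `samples` and `weights` arrays. The function names `load_profile` and `inclusive_weights` and the tiny example document are hypothetical, not part of py-spy or speedscope.

```python
import json
from collections import Counter


def load_profile(path):
    """Read a speedscope JSON file into a dictionary."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def inclusive_weights(profile_doc):
    """Sum the sample weights of every stack in which a function appears.

    Assumption: each sample is a list of indices into shared.frames and
    weights[i] is the weight (for example, seconds) of samples[i].
    """
    names = [frame["name"] for frame in profile_doc["shared"]["frames"]]
    totals = Counter()
    for profile in profile_doc["profiles"]:
        if profile.get("type") != "sampled":
            continue  # evented profiles are not handled in this sketch
        for stack, weight in zip(profile["samples"], profile["weights"]):
            # Use a set so that recursive calls are counted once per sample.
            for name in {names[index] for index in stack}:
                totals[name] += weight
    return totals


# Synthetic stand-in for a real profile, with made-up frame names.
example = {
    "shared": {
        "frames": [
            {"name": "run"},
            {"name": "concatenate"},
            {"name": "array_equal"},
        ]
    },
    "profiles": [
        {
            "type": "sampled",
            "samples": [[0, 1, 2], [0, 1], [0]],
            "weights": [2, 3, 5],
        }
    ],
}

totals = inclusive_weights(example)
for name, weight in totals.most_common():
    print(f"{weight:>4} {name}")
```

For a real profile, `inclusive_weights(load_profile(path))` gives the same kind of ranking. Because the command above passes `--idle`, time spent waiting is included in the totals.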
