feat(models,training)!: multi dataset integration #594

Merged
JPXKQX merged 300 commits into main from feature/multi-dataset-integration
Jan 21, 2026

@mchantry (Member) commented Oct 8, 2025

BEGIN_COMMIT_OVERRIDE
feat(models,training)!: multi dataset integration
END_COMMIT_OVERRIDE

Description

Supports multiple time-aligned datasets as inputs and outputs for training, e.g.

era_t     |         era_{t+1}
          | - >
cerra_t   |         cerra_{t+1}

era_t     |         era_{t+1}
          | - >
          |         cerra_{t+1}

era_t     |
          | - >
cerra_t   |         cerra_{t+1}

where each input and output dataset uses its own encoder/decoder.
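As a rough illustration of the routing described above, here is a minimal pure-Python sketch. It is not the Anemoi implementation: `MultiDatasetModel`, `make_scaler`, and the element-wise sum used to merge the encoded inputs are all hypothetical stand-ins, chosen only to show one encoder per input dataset and one decoder per output dataset.

```python
from typing import Callable, Dict, List

Vector = List[float]
Layer = Callable[[Vector], Vector]


def make_scaler(factor: float) -> Layer:
    """Hypothetical stand-in for a learned encoder or decoder."""
    return lambda x: [factor * v for v in x]


class MultiDatasetModel:
    """Toy model: one encoder per input dataset, one decoder per output dataset."""

    def __init__(self, encoders: Dict[str, Layer], decoders: Dict[str, Layer]):
        self.encoders = encoders
        self.decoders = decoders

    def forward(self, batch: Dict[str, Vector]) -> Dict[str, Vector]:
        missing = set(self.encoders) - set(batch)
        if missing:
            raise KeyError(f"missing input datasets: {sorted(missing)}")
        # Encode every input dataset with its own encoder.
        encoded = [enc(batch[name]) for name, enc in self.encoders.items()]
        # Merge into one shared state (element-wise sum is an assumption here).
        latent = [sum(values) for values in zip(*encoded)]
        # Decode every output dataset with its own decoder.
        return {name: dec(latent) for name, dec in self.decoders.items()}


# Third configuration in the diagram: era and cerra in, cerra out.
model = MultiDatasetModel(
    encoders={"era": make_scaler(1.0), "cerra": make_scaler(2.0)},
    decoders={"cerra": make_scaler(0.5)},
)
prediction = model.forward({"era": [1.0, 2.0], "cerra": [3.0, 4.0]})
```

The other two configurations differ only in which dataset names appear in the `encoders` and `decoders` dictionaries.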

As a contributor to the Anemoi framework, please ensure that your changes include unit tests, updates to any affected dependencies and documentation, and have been tested in a parallel setting (i.e., with multiple GPUs). As a reviewer, you are also responsible for verifying these aspects and requesting changes if they are not adequately addressed. For guidelines about those please refer to https://anemoi.readthedocs.io/en/latest/

By opening this pull request, I affirm that all authors agree to the Contributor License Agreement.


📚 Documentation preview 📚: https://anemoi-training--594.org.readthedocs.build/en/594/


📚 Documentation preview 📚: https://anemoi-graphs--594.org.readthedocs.build/en/594/


📚 Documentation preview 📚: https://anemoi-models--594.org.readthedocs.build/en/594/

@github-project-automation github-project-automation bot moved this to To be triaged in Anemoi-dev Oct 8, 2025
@mchantry mchantry changed the title Feature/multi dataset integration feat(models,training): multi dataset integration Oct 8, 2025
@mchantry mchantry added the ATS Approved Approved by ATS label Oct 9, 2025
@dnerini dnerini moved this from To be triaged to Now In Progress in Anemoi-dev Oct 21, 2025
@radiradev (Contributor) commented

Hi @mchantry, I'm planning on doing some training with this branch. Could you let me know what I should expect to work and not work?

@anaprietonem anaprietonem self-requested a review January 21, 2026 12:36
anaprietonem previously approved these changes Jan 21, 2026

@anaprietonem (Contributor) left a comment

Thanks for addressing my previous comments, LGTM!

@github-project-automation github-project-automation bot moved this from Now In Progress to For merging in Anemoi-dev Jan 21, 2026
gmertes added a commit to ecmwf/anemoi-inference that referenced this pull request Jan 21, 2026
… `data` (#403)

## Description
Implements step 1 (checkpoint support) and part of step 1.5 (new
metadata) of #402

Add support for the new multi-dataset checkpoints, but **only** the
single-dataset case where the dataset name defaults to "data". Real
multi dataset support will follow later in step 2.
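One way to picture the single-dataset fallback is the sketch below. It assumes, purely for illustration, that a new-style checkpoint keeps per-dataset metadata under a `"datasets"` key while an old-style checkpoint stores it flat; neither the key name nor the helper functions are taken from anemoi-inference.

```python
DEFAULT_DATASET_NAME = "data"


def normalise_datasets(metadata: dict) -> dict:
    """Return a {dataset_name: dataset_metadata} mapping for either layout."""
    if "datasets" in metadata:  # hypothetical new-style layout
        return dict(metadata["datasets"])
    # Old-style layout: wrap the flat metadata under the default name.
    return {DEFAULT_DATASET_NAME: metadata}


def single_dataset(metadata: dict) -> dict:
    """Accept only the single-dataset case, as this step of the work does."""
    datasets = normalise_datasets(metadata)
    if list(datasets) != [DEFAULT_DATASET_NAME]:
        raise NotImplementedError("only the single 'data' dataset is supported")
    return datasets[DEFAULT_DATASET_NAME]


old_style = {"variables": ["2t"]}
new_style = {"datasets": {"data": {"variables": ["2t"]}}}
```

Under these assumptions both layouts resolve to the same per-dataset metadata, which is the sense in which such a loader would be forwards and backwards compatible.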

This PR should be merged before ecmwf/anemoi-core#594 , but does not
need to be merged exactly at the same time. This PR is forwards and
backwards compatible.

I started using some of the new metadata but not all, to keep things
simple. Improvements to the metadata will also be deferred to the next
round.

@anaprietonem anaprietonem self-requested a review January 21, 2026 17:17
@JPXKQX JPXKQX merged commit f537d7f into main Jan 21, 2026
15 of 16 checks passed
@github-project-automation github-project-automation bot moved this from For merging to Done in Anemoi-dev Jan 21, 2026
@JPXKQX JPXKQX deleted the feature/multi-dataset-integration branch January 21, 2026 17:18
@DeployDuck DeployDuck mentioned this pull request Jan 21, 2026
@HCookie HCookie changed the title feat(models,training): multi dataset integration feat!(models,training): multi dataset integration Jan 21, 2026
@HCookie HCookie changed the title feat!(models,training): multi dataset integration feat(models,training)!: multi dataset integration Jan 21, 2026
OpheliaMiralles pushed a commit to OpheliaMiralles/anemoi-core that referenced this pull request Feb 16, 2026
Update the metadata we store in checkpoints in order to provide more
information to inference

The checkpoint mentioned
[here](ecmwf#594 (comment))
was created based on this branch.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>