Feature/to xarray options by colonesej · Pull Request #66 · ecmwf/hat

colonesej · 2025-08-20T13:20:42Z

Description

Allow CLI commands to run as Python files, which makes testing and debugging a bit easier. One can simply do

python -m pdb hat/tools/cli.py hat-extract-timeseries config.yml

or with VS Code

launch.json

{
    "configurations": [
        {
            "name": "hat-timeseries-extractor",
            "type": "debugpy",
            "request": "launch",
            "program": "${workspaceFolder}/hat/cli.py",
            "console": "integratedTerminal",
            "args": [
                "hat-extract-timeseries",
                "${workspaceFolder}/config.yml",
            ],
            "stopOnEntry": false,
            "justMyCode": false,
        }
    ]
}
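A minimal sketch of the dispatch pattern that lets one file serve every command, assuming the first argument names the command and the rest are passed through. The names COMMANDS, extract_timeseries and main are hypothetical and are not taken from the hat code base.

```python
import sys


def extract_timeseries(config_path):
    """Hypothetical stand-in for the real command; it only echoes its input."""
    return f"extract-timeseries called with {config_path}"


# Hypothetical registry mapping CLI command names to callables.
COMMANDS = {"hat-extract-timeseries": extract_timeseries}


def main(argv):
    """Dispatch argv[0] to the matching command and forward the remaining args."""
    if not argv or argv[0] not in COMMANDS:
        names = " | ".join(sorted(COMMANDS))
        return f"usage: cli.py <{names}> [args...]"
    return COMMANDS[argv[0]](*argv[1:])


if __name__ == "__main__":
    # Running the file directly (e.g. under pdb or debugpy) goes through main().
    print(main(sys.argv[1:]))
```

With this layout, a debugger only needs the file path plus the command name and its arguments, which is what the launch.json above provides.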

Support to_xarray_options in the grid options to control xarray dataset creation. Previously, larger datasets were not loaded lazily, so ensemble forecasts/reforecasts could run out of memory quite easily.

The configuration looks like this:

config.yml

station:
  file: "outlets.csv"
  # filters: None
  name: "ObsID"
  coords: 
    x: "LisfloodX"
    y: "LisfloodY"

grid:
  coord_x: "longitude"
  coord_y: "latitude"
  to_xarray_options:
    profile: "mars"
    chunks:
      "longitude": -1
      "latitude": -1
      "time": 1
      "number": 1
  source: 
    file:
      path: "fc.*.grb"
  # source:
  #   mars:
  #     class: ce
  #     date: "20230101"
  #     expver: 1
  #     hdate: ""
  #     levtype: sfc
  #     model: lisflood
  #     number: 1/to/51/by/1
  #     origin: ecmf
  #     param: "240023"
  #     step: "6/12/18/"
  #     stream: efrf
  #     time: "00:00:00"
  #     type: pf
  #     target: "pf_efas5_${date}.grb"

output:
  file: "lisflood_${YMD}.nc"
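To make the effect of the chunks setting concrete, here is a rough sketch that estimates the memory held by a single chunk. It relies on the dask/xarray convention that -1 means "the whole dimension in one chunk". The dimension sizes, the 8-byte item size, and the helper names are illustrative assumptions, not values from a real dataset or from hat.

```python
from math import prod


def chunk_shape(dim_sizes, chunks):
    """Resolve a chunks mapping against dimension sizes (-1 = full dimension).

    Simplifying assumption: a dimension missing from `chunks` is kept whole.
    """
    shape = {}
    for dim, size in dim_sizes.items():
        requested = chunks.get(dim, -1)
        shape[dim] = size if requested == -1 else min(requested, size)
    return shape


def chunk_bytes(dim_sizes, chunks, itemsize=8):
    """Bytes needed to hold one chunk, assuming `itemsize` bytes per value."""
    return prod(chunk_shape(dim_sizes, chunks).values()) * itemsize


# Illustrative sizes only: 51 members, 40 steps, a 3000 x 4000 grid.
dims = {"number": 51, "time": 40, "latitude": 3000, "longitude": 4000}
chunks = {"longitude": -1, "latitude": -1, "time": 1, "number": 1}

per_chunk = chunk_bytes(dims, chunks)   # one full field per member and step
everything = chunk_bytes(dims, {})      # the whole array as a single chunk
print(per_chunk / 1e6, "MB per chunk vs", everything / 1e9, "GB in one piece")
```

Under these assumed sizes, one chunk is a single 2D field, so only a small slice of the ensemble has to be in memory at a time.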

Update the expected config structure for extract-timeseries to be entirely key-value based, since the previous structure did not support the mars source type. The downside is having to know the argument names/call signature used in earthkit-data for each source type, which can be annoying because they are not consistent, e.g. .from_source('json', filename) versus .from_source('file', path).
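A hedged sketch of how a key-value source block could be forwarded, assuming the source mapping holds exactly one source type whose value is a mapping of keyword arguments. The helpers split_source and load are hypothetical, and the loader is passed in as a parameter, so the example does not depend on earthkit-data itself.

```python
def split_source(source_cfg):
    """Return (source_name, kwargs) from a one-entry mapping such as
    {"file": {"path": "fc.*.grb"}}."""
    if not isinstance(source_cfg, dict) or len(source_cfg) != 1:
        raise ValueError("'source' must contain exactly one source type")
    ((name, kwargs),) = source_cfg.items()
    if not isinstance(kwargs, dict):
        raise ValueError(f"options for source '{name}' must be key-value pairs")
    return name, dict(kwargs)


def load(source_cfg, from_source):
    """Call the injected loader as from_source(name, **kwargs)."""
    name, kwargs = split_source(source_cfg)
    return from_source(name, **kwargs)


# Usage with a stand-in loader that only records how it was called.
def fake_from_source(name, **kwargs):
    return name, kwargs


print(load({"file": {"path": "fc.*.grb"}}, fake_from_source))
```

In real use the loader would presumably be earthkit-data's from_source, which is why the keys under each source type have to match that source's argument names.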

Contributor Declaration

By opening this pull request, I affirm the following:

All authors agree to the Contributor License Agreement.
The code follows the project's coding standards.
I have performed self-review and added comments where needed.
I have added or updated tests to verify that my changes are effective and functional.
I have run all existing tests and confirmed they pass.

colonesej · 2025-08-20T14:34:03Z

Just FYI, I manually created a pre-release module on the HPC named 0.8pre for Maliko, who needed the chunking option for processing ensemble forecast data.

colonesej added 3 commits August 20, 2025 13:42

update toml config name for pytest

1761be0

allow running any cli command as file. Simpler tests and debugging

796c31f

support options for xarray creation, like chunking/lazy loading

82b5da3

colonesej requested review from Oisin-M and corentincarton and removed request for corentincarton August 20, 2025 13:20

colonesej and others added 3 commits August 20, 2025 17:50

make config key-value based only to support many different ekd sources

ae352d2

add logging. config at package level

7e0017c

input checks before trying to load data which may take time

869c725

colonesej merged commit 209beb0 into develop Oct 3, 2025
4 checks passed

colonesej deleted the feature/to_xarray_options branch October 3, 2025 14:46

colonesej merged 6 commits into develop from feature/to_xarray_options
