Commit 6f47493

jinmannwong and HCookie authored
[fluent] Change container for nodes to xr.DataTree (#161)
* Initial switch to DataTree in fluent
* Tidy
* Add nodetree test
* Add merge functionality
* Use ruff
* fix: str representation of payloads - issue when using lambdas in split
* Fix selection,mean and std, add functionality to set path
* Remove duplicated merge function
* Don't raise if selection criteria fails
* [fluent] feat: Add expand_as_qube (#162)

---------

Co-authored-by: Harrison <harrison.cook@ecmwf.int>
1 parent d61c4b9 commit 6f47493

10 files changed: +1619 -285 lines changed

.github/workflows/macos-test.yml

Lines changed: 4 additions & 4 deletions
@@ -17,17 +17,17 @@ jobs:
       fail-fast: true
       matrix:
         arch_type: ["macos-ARM64", "linux-x86"]
-        python_version: ["3.10", "3.11", "3.12", "3.13"]
+        python_version: ["3.11", "3.12", "3.13"]
     runs-on: "${{ fromJSON('{\"linux-x86\": [\"self-hosted\", \"Linux\", \"platform-builder-Rocky-8.6\"], \"macos-ARM64\": [\"self-hosted\", \"macOS\", \"ARM64\"]}')[matrix.arch_type] }}"
     timeout-minutes: 20
     steps:
       - uses: actions/checkout@v4
       - uses: actions/setup-python@v5
         with:
           python-version: ${{ inputs.python-version }}
-      - uses: astral-sh/setup-uv@v6
-        with:
-          version: 0.7.19
+      # - uses: astral-sh/setup-uv@v6
+      # with:
+      # version: 0.7.19
       - uses: extractions/setup-just@v3
       - run: |
           uv sync --python "${{ matrix.python_version }}"

docs/usage/qubed.rst

Lines changed: 221 additions & 0 deletions
@@ -0,0 +1,221 @@
+#####################
+Qube Expansion
+#####################
+
+The qube expansion system in ``earthkit-workflows`` provides tools
+for expanding actions across multi-dimensional data structures
+based on Qube definitions.
+
+****************
+What is Qubed?
+****************
+
+The expansion system is built on `Qubed
+<https://github.com/ecmwf/qubed>`_, a library for representing
+multi-dimensional data structures. A Qube represents the dimensions
+of your underlying data, defining the axes (dimensions) and their
+coordinate values, along with optional hierarchical relationships
+between different sets of dimensions.
+
+Think of a Qube as representing how your data is organised across
+multiple dimensions (time steps, parameters, vertical levels, etc.).
+
+- **Simple Qube**: Single set of dimensions (e.g., time steps: [6, 12, 18, 24])
+- **Hierarchical Qube**: Multiple branches with different dimension
+  sets (e.g., surface variables vs. pressure level variables)
+
+**************************
+Basic Expansion Workflow
+**************************
+
+The ``expand_as_qube`` method on Action takes a qube structure and
+recursively expands the action across all dimensions defined in the qube.
+
+Basic Example
+=============
+
+.. code:: python
+
+   from qubed import Qube
+
+   # Create a simple qube with time steps
+   qube = Qube.from_datacube({"step": [6, 12, 18, 24, 30]})
+
+   # View the dimensions
+   print(qube.axes())
+   # Output: {'step': array([ 6, 12, 18, 24, 30])}
+
+   # Expand an action across all dimensions
+   expanded_action = action.expand_as_qube(qube)
+
+The action will be expanded across the specified dimensions, creating separate
+execution paths for each coordinate value.
+
+**************************
+Creating Qube Structures
+**************************
+
+You can manually construct Qube structures to define how your actions
+should be expanded:
+
+Simple Single-Dimension Expansion
+=================================
+
+.. code:: python
+
+   from qubed import Qube
+
+   # Create a simple qube with time steps
+   qube = Qube.from_datacube({"step": [6, 12, 18, 24]})
+
+   # Expand an action
+   expanded_action = action.expand_as_qube(qube)
+
+Multi-Dimensional Expansion
+===========================
+
+.. code:: python
+
+   # Create a qube with multiple dimensions
+   qube = Qube.from_datacube({
+       "step": [6, 12, 18],
+       "param": ["t", "q", "u", "v"],
+       "level": [500, 850, 1000]
+   })
+   expanded_action = action.expand_as_qube(qube)
+
+Hierarchical Expansion
+======================
+
+Create hierarchical qube structures for different variable types:
+
+.. code:: python
+
+   from qubed import Qube
+
+   # Surface variables (2D fields)
+   surface = Qube.from_datacube({
+       "param": ["2t", "2d", "10u", "10v", "msl"]
+   })
+   surface.add_metadata({"name": "surface"})
+
+   # Pressure level variables (3D fields)
+   pressure = Qube.from_datacube({
+       "param": ["t", "q", "u", "v"],
+       "level": [500, 700, 850, 925, 1000]
+   })
+   pressure.add_metadata({"name": "pressure"})
+
+   # Combine with time steps
+   steps = Qube.from_datacube({"step": [6, 12, 18, 24]})
+   combined = steps | (surface | pressure)
+
+   # Expand the action
+   expanded_action = action.expand_as_qube(combined)
+
+The expanded action will have separate branches for ``/surface`` and
+``/pressure``, each containing the appropriate parameters and levels.
+
+**Understanding Name Metadata**
+
+When a qube has multiple children (branches), the expansion creates separate
+execution paths using the ``split()`` method. The path names for these branches
+are determined by:
+
+1. **Named branches**: If a child qube has ``{"name": "..."}`` metadata, that
+   name is used as the path (e.g., ``/surface``, ``/pressure``)
+
+2. **Automatic naming**: If no name metadata is provided, branches are
+   automatically named using alphabetical labels (``/a``, ``/b``, ``/c``, etc.)
+
+.. code:: python
+
+   # Example: Automatic alphabetical naming
+   child1 = Qube.from_datacube({"param": ["t", "q"]})
+   child2 = Qube.from_datacube({"param": ["u", "v"]})
+   parent = Qube.from_datacube({"step": [6, 12]})
+   qube = parent | (child1 | child2)
+
+   expanded = action.expand_as_qube(qube)
+   # Creates branches: /a (for child1) and /b (for child2)
+
+   # Example: Using meaningful names
+   child1.add_metadata({"name": "temperature"})
+   child2.add_metadata({"name": "wind"})
+   qube = parent | (child1 | child2)
+
+   expanded = action.expand_as_qube(qube)
+   # Creates branches: /temperature and /wind
+
+The name metadata is particularly useful for organising complex workflows with
+multiple variable types, making the execution tree more readable and easier to
+debug.
+
+*******************
+Modifying Qubes
+*******************
+
+Dropping Axes
+=============
+
+You can remove dimensions from a qube before expanding:
+
+.. code:: python
+
+   # Create qube with multiple dimensions
+   qube = Qube.from_datacube({
+       "step": [6, 12, 18],
+       "param": ["t", "q"],
+       "level": [500, 850, 1000]
+   })
+
+   # Drop the time step dimension
+   qube_no_steps = qube.remove_by_key("step")
+   expanded = action.expand_as_qube(qube_no_steps)
+
+   # Drop multiple dimensions
+   qube_params_only = qube.remove_by_key(["step", "level"])
+   expanded = action.expand_as_qube(qube_params_only)
+
+Inspecting Axes
+===============
+
+View available dimensions in a qube:
+
+.. code:: python
+
+   qube = Qube.from_datacube({
+       "step": [6, 12, 18],
+       "param": ["t", "q"],
+       "level": [500, 850, 1000]
+   })
+
+   axes = qube.axes()
+
+   for axis_name, values in axes.items():
+       print(f"{axis_name}: {sorted(values)}")
+
+   # Check if an axis exists
+   if "level" in axes:
+       print(f"Pressure levels: {sorted(axes['level'])}")
+
+*************
+API Summary
+*************
+
+**Action Methods**
+
+- ``action.expand_as_qube(qube)``: Expand an action according to a qube structure
+
+**Qube Methods**
+
+- ``Qube.from_datacube(dims)``: Create a qube from dimension dictionary
+- ``qube.axes()``: View available dimensions
+- ``qube.remove_by_key(key)``: Remove dimension(s)
+- ``qube.add_metadata(metadata)``: Add metadata (e.g., names) to qube nodes
+
+**See Also**
+
+- :doc:`/api/fluent` - Fluent API documentation
+- `Qubed Documentation <https://qubed.readthedocs.io/>`_ - Underlying
+  data structure library
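
Note on docs/usage/qubed.rst: the new page says that expansion creates "separate execution paths for each coordinate value". For a simple (non-hierarchical) qube, one way to picture this is as a Cartesian product over the axes. The sketch below is plain Python and is not the earthkit-workflows or qubed implementation; expand_paths is a hypothetical helper introduced only for illustration, and it assumes every combination of coordinate values yields exactly one path.

```python
from itertools import product


def expand_paths(dims):
    """Return one {axis: value} mapping per combination of coordinate values.

    Hypothetical helper: it only illustrates the idea of expanding across
    dimensions and does not call qubed or earthkit-workflows.
    """
    names = list(dims)
    value_lists = [dims[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


# Same dimensions as the "Multi-Dimensional Expansion" example in the docs.
paths = expand_paths(
    {
        "step": [6, 12, 18],
        "param": ["t", "q", "u", "v"],
        "level": [500, 850, 1000],
    }
)

print(len(paths))  # 36 (3 steps x 4 params x 3 levels)
print(paths[0])    # {'step': 6, 'param': 't', 'level': 500}
```

Under that assumption, the multi-dimensional example corresponds to 36 coordinate combinations; how the real expansion groups them into nodes is not shown in this diff.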

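Note on the "Understanding Name Metadata" section of docs/usage/qubed.rst: the two naming rules (use the "name" metadata if present, otherwise fall back to alphabetical labels) can be sketched as below. This is an illustration in plain Python, not the code added by this commit. branch_paths is a hypothetical helper, each child is modelled as a bare metadata dict, and two details are assumptions the docs do not state: that the fallback letter is chosen by the child's position, and that there are at most 26 children.

```python
from string import ascii_lowercase


def branch_paths(children_metadata):
    """Return a path name for each child branch.

    Hypothetical helper. Assumptions: the fallback label is picked by the
    child's position (first child -> 'a'), and there are at most 26 children.
    """
    paths = []
    for index, metadata in enumerate(children_metadata):
        name = metadata.get("name") or ascii_lowercase[index]
        paths.append(f"/{name}")
    return paths


# No name metadata: alphabetical labels.
print(branch_paths([{}, {}]))  # ['/a', '/b']

# Name metadata present: names are used as paths.
print(branch_paths([{"name": "temperature"}, {"name": "wind"}]))
# ['/temperature', '/wind']
```
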
pyproject.toml

Lines changed: 3 additions & 1 deletion
@@ -18,7 +18,7 @@ authors = [
 ]
 license = "Apache-2.0"
 license-files = ["LICENSE"]
-requires-python = ">=3.10"
+requires-python = ">=3.11"
 dependencies = [
     "earthkit-data",
     "cloudpickle",
@@ -34,6 +34,7 @@ dependencies = [
     "pyzmq",
     "fire",
     "orjson",
+    "qubed>=0.3.0",
 ]
 # version provided via setuptools_scm module, which derives it from git tag
 dynamic = ["version"]
@@ -42,6 +43,7 @@ readme = "README.md"
 [dependency-groups]
 dev = ["pytest", "pytest-xdist>=3.8", "prek", "ty==0.0.2", "build", "bokeh"]

+
 [tool.setuptools]
 include-package-data = true
 zip-safe = false
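
Note on the version bump: requires-python moves to ">=3.11" in pyproject.toml and the CI matrix in macos-test.yml drops "3.10" in the same commit. The sketch below shows how the old matrix filters down under that bound. It is an illustration only: parse_minor and supported are hypothetical helpers, not part of this repository, and the comparison is done on integer tuples because comparing the strings directly would order "3.10" below "3.9".

```python
def parse_minor(version: str) -> tuple[int, int]:
    """Turn a 'major.minor' string such as '3.11' into (3, 11)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def supported(versions, minimum=(3, 11)):
    """Keep only the versions that satisfy the '>=' lower bound."""
    return [v for v in versions if parse_minor(v) >= minimum]


old_matrix = ["3.10", "3.11", "3.12", "3.13"]
print(supported(old_matrix))  # ['3.11', '3.12', '3.13']
```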
