Pipeline serialization to config improvements #288
* Also Beamform and Map serialization fixes
Walkthrough

Pipeline config now expects a wrapper {"pipeline": {"operations": [...]}}, serialization gained a verbose mode that preserves user JIT kwargs, Pipeline.from_path was added (pipeline_from_yaml deprecated), Merge/Stack/BranchedPipeline were removed from ops exports, CommonMidpointPhaseError was added, and tests/docs were updated to match HF/config path plumbing.
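A minimal sketch of the wrapped config format described in the walkthrough; the operation names are hypothetical and only illustrate the `pipeline`/`operations` nesting:

```yaml
# Hypothetical example of the new wrapped pipeline config format.
pipeline:
  operations:
    - name: demodulate
    - name: beamform
```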
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@zea/ops/base.py`:
- Around line 30-31: The current guard uses hasattr(value, "ndim") but does not
ensure value.tolist() exists, risking an AttributeError; change the check around
the tolist call so you only call value.tolist() when it exists and is callable
(e.g., if hasattr(value, "tolist") and callable(getattr(value, "tolist", None)))
or wrap the call in a try/except AttributeError fallback; update the code path
that returns value.tolist() (the line containing value.tolist()) accordingly so
custom objects without tolist won't raise.
- Around line 306-310: When building params in the serialization path (the block
that calls _to_native and assigns params[name] = value), detect callable values
that are not safe for config export (e.g., functools.partial or other callable
subclasses like a bound method) and fail fast: before adding to params, if
callable(value) and value is not a plain importable reference (or does not
round-trip via YAML/JSON), raise a clear TypeError indicating the parameter
(name) is not serializable and must be replaced with a serializable reference;
implement the check right after calling _to_native and before assigning into
params (use the existing names value and name) so required ctor-introspected
params such as Lambda.func are rejected instead of being dumped verbatim.
- Around line 313-336: The serializer omits the Operation attribute
additional_output_keys, so reconstructed Operation instances can lose declared
extra outputs; update the serialization block in Operation (in methods around
where params is built — references: self.additional_output_keys, self.key,
self.output_key, output_keys) to emit params["additional_output_keys"] =
self.additional_output_keys in verbose mode and in compact mode add
params["additional_output_keys"] only when self.additional_output_keys is
non-empty (similar to the existing conditionals for cache/ jit fields) so
round-trip deserialization preserves extra outputs.
* Focus on Config.from_path
* Fixes rabbit
Actionable comments posted: 3
🧹 Nitpick comments (1)
tests/test_ops_infra.py (1)
106-172: Please retain coverage for the legacy flat `operations` config.

Current fixtures now only exercise the wrapped format. Since `pipeline_from_config` still supports top-level `{"operations": [...]}`, add one regression test for that branch to prevent accidental breakage.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_ops_infra.py` around lines 106 - 172, Add a regression test that exercises the legacy flat {"operations": [...]} branch of pipeline_from_config: create a small legacy_config = {"operations": [{"name":"multiply"},{"name":"add"}]}, call pipeline_from_config(legacy_config) and assert the returned pipeline contains the expected operations (e.g., operation names "multiply" and "add" or equivalent behavior); place this test in tests/test_ops_infra.py (e.g., test_pipeline_from_legacy_operations) so the legacy path is covered alongside the existing wrapped-format fixtures.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@zea/ops/pipeline.py`:
- Around line 123-127: The code currently replaces user-provided jit_kwargs when
keras.backend.backend() == "jax" and self.static_params != [], losing options
stored in jit_kwargs; instead merge the static_argnames into the existing
jit_kwargs (referencing self._user_jit_kwargs, jit_kwargs, self.static_params
and the keras.backend.backend() check) so user keys are preserved—e.g., create
an updated dict that keeps all existing jit_kwargs and adds/overwrites only the
"static_argnames" key with self.static_params (or call
jit_kwargs.update({"static_argnames": self.static_params})), rather than
assigning jit_kwargs = {"static_argnames": ...}.
- Around line 1319-1325: The serializer _pipeline_to_serializable_dict currently
only writes operations so top-level Pipeline attributes (e.g., with_batch_dim,
jit_options, jit_kwargs, name) are dropped; update
_pipeline_to_serializable_dict to include these pipeline-level fields in the
returned dict (e.g., add keys for with_batch_dim, jit_options, jit_kwargs, name
populated from the pipeline instance) while still using
Pipeline._pipeline_to_list(...) for operations, and ensure the corresponding
deserializer (pipeline_from_config) will read those keys back to reconstruct the
original Pipeline settings.
- Around line 1351-1355: Replace the YAML emitter to produce portable YAML:
change the yaml.dump call that serializes
_pipeline_to_serializable_dict(pipeline, verbose=verbose) to yaml.safe_dump and
remove the Dumper=yaml.Dumper argument (keep other params like indent and file
handle). This ensures the serialized output (from the pipeline variable via
_pipeline_to_serializable_dict) can be read back by the existing yaml.safe_load
used earlier in the file (around the yaml.safe_load usage) and prevents
Python-specific tags like !!python/tuple from being emitted.
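The `jit_kwargs` merge suggested in the first comment above can be sketched independently of zea (names are hypothetical): keep every user-provided option and only add or overwrite `static_argnames` on the JAX backend.

```python
def merge_jit_kwargs(user_jit_kwargs, static_params, backend):
    """Merge static_argnames into user JIT kwargs instead of replacing them."""
    jit_kwargs = dict(user_jit_kwargs)  # preserve all user-provided options
    if backend == "jax" and static_params:
        jit_kwargs["static_argnames"] = static_params
    return jit_kwargs
```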
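The second comment's richer serializer might look like this sketch; the function and field names follow the review text, but zea's real helpers may differ:

```python
def pipeline_to_serializable_dict(operations, with_batch_dim, jit_options, name):
    """Emit pipeline-level settings alongside the operations list.

    Including these top-level fields lets the deserializer reconstruct the
    original Pipeline settings instead of dropping them.
    """
    return {
        "pipeline": {
            "operations": operations,
            "with_batch_dim": with_batch_dim,
            "jit_options": jit_options,
            "name": name,
        }
    }
```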
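The portability issue in the third comment is easy to demonstrate in isolation, assuming PyYAML is installed: `yaml.dump` with the default `Dumper` tags tuples with `!!python/tuple`, which `yaml.safe_load` refuses, while `yaml.safe_dump` emits them as plain sequences.

```python
import yaml

data = {"grid_shape": (128, 256)}  # a tuple-valued pipeline parameter

# Default Dumper emits a Python-specific !!python/tuple tag.
unsafe = yaml.dump(data)
# safe_dump produces portable YAML (tuples become plain sequences),
# which the existing yaml.safe_load can read back.
portable = yaml.safe_dump(data)
```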
---
Nitpick comments:
In `@tests/test_ops_infra.py`:
- Around line 106-172: Add a regression test that exercises the legacy flat
{"operations": [...]} branch of pipeline_from_config: create a small
legacy_config = {"operations": [{"name":"multiply"},{"name":"add"}]}, call
pipeline_from_config(legacy_config) and assert the returned pipeline contains
the expected operations (e.g., operation names "multiply" and "add" or
equivalent behavior); place this test in tests/test_ops_infra.py (e.g.,
test_pipeline_from_legacy_operations) so the legacy path is covered alongside
the existing wrapped-format fixtures.
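The dual-format handling that the suggested regression test would cover can be sketched without zea (function name hypothetical):

```python
def normalize_pipeline_config(config):
    """Accept both the new wrapped and the legacy flat pipeline config.

    New format:    {"pipeline": {"operations": [...]}}
    Legacy format: {"operations": [...]}
    """
    if "pipeline" in config:
        return config["pipeline"]["operations"]
    return config["operations"]
```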
📒 Files selected for processing (11)
docs/source/getting-started.rst, docs/source/parameters.rst, docs/source/parameters_doc.py, tests/test_configs.py, tests/test_ops_infra.py, zea/config.py, zea/data/preset_utils.py, zea/internal/setup_zea.py, zea/ops/base.py, zea/ops/pipeline.py, zea/scan.py
🧹 Nitpick comments (1)
zea/data/preset_utils.py (1)
77-84: Consider adding a type hint for the `files` parameter.

The return type is `list[str]`, but the `files` parameter is typed as bare `list`. For consistency and clarity, consider using `list[str]`.

Suggested fix:

```diff
 def _download_files_in_path(
     repo_id: str,
-    files: list,
+    files: list[str],
     path_filter: str = None,
     cache_dir=HF_DATASETS_DIR,
     repo_type="dataset",
     **kwargs,
 ) -> list[str]:
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@zea/data/preset_utils.py` around lines 77 - 84, The files parameter in _download_files_in_path is currently untyped; update its annotation from bare list to list[str] (i.e., change the function signature of _download_files_in_path to use files: list[str]) so the parameter and return types are consistent and clearer; keep other parameters and return type list[str] unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@zea/data/preset_utils.py`:
- Around line 77-84: The files parameter in _download_files_in_path is currently
untyped; update its annotation from bare list to list[str] (i.e., change the
function signature of _download_files_in_path to use files: list[str]) so the
parameter and return types are consistent and clearer; keep other parameters and
return type list[str] unchanged.
📒 Files selected for processing (2)
tests/tools/test_hf.py, zea/data/preset_utils.py
Replaces 'verbose' with 'compact' in serialization and round-trip methods, clarifying the behavior for minimal or full config output. Moves additional output key definitions to class-level attributes, ensuring they are not serialized as params. Strengthens pipeline config validation and preserves pipeline-level kwargs in round-trips. Improves YAML portability and JAX static_argnames merging logic.
… feature/pipeline-config
* Catch lambda serialization error
* Fix notebook
* Even compacter config (strips)
wesselvannierop
left a comment
Awesome PR, this will greatly improve the sharing of RF data & the particular processing associated with it!
Fixes issue in #280. General improvements to pipeline saving.
When saved to a config, pipelines are now always written with `pipeline` as the top-level key. Saving and loading are now symmetric, and `Pipeline.from_path` was added for consistency with `Model.from_path` and `Config.from_path`. Lastly, default arguments in `Pipeline` are not automatically saved (for a cleaner config) unless specified with `verbose`. See an example below:

Also addressed #289: added the class variable `ADD_OUTPUT_KEYS`.

Finally, removed the `Merge`, `Stack` and `BranchedPipeline` operations, as they were not being maintained anymore. See issues #207, #206 and #209.

Consistency of the API for saving and loading from configs
Now we have the same syntax for loading from a path: `zea.Model.from_path`, `zea.Pipeline.from_path`, `zea.Config.from_path`.

And saving: `zea.Model.to_json` (Keras API), `zea.Pipeline.to_yaml` (or `to_json`), `zea.Config.to_yaml` (or `to_json`).

Summary by CodeRabbit
Breaking Changes
New Features
Improvements
Documentation