JOSS review @tomelse by tristan-deep · Pull Request #283 · tue-bmd/zea

tristan-deep · 2026-03-09T16:38:00Z

This PR addresses the JOSS review comments and feedback by @tomelse. See a summary of the review here: openjournals/joss-reviews#9881 (comment)

And the related issues:

Please find an updated version of the docs right here: https://zea--283.org.readthedocs.build/en/283/

Summary by CodeRabbit

Documentation
- Updated example notebook: clarified TensorFlow installation requirement for batch loading, added prominent migration notes, separated explanatory text, added keras import and explicit image-to-NumPy conversion, and included a new dataloader usage and visualization snippet.
Improvements
- Added and refined type annotations on public-facing methods to improve clarity and IDE support while preserving runtime behavior.

coderabbitai · 2026-03-09T16:38:18Z

Walkthrough

Added return type annotations to Pipeline and data-file public methods; updated the zea_data_example notebook to clarify TensorFlow dependency, add keras imports, convert images to NumPy in examples, and reorganize dataloader usage and explanatory text.

Changes

Cohort / File(s)	Summary
Notebook updates `docs/source/notebooks/data/zea_data_example.ipynb`	Clarified that `zea.backend.tensorflow.make_dataloader` requires TensorFlow; added `keras` imports and new dataloader usage blocks; inserted `keras.ops.convert_to_numpy(images)` conversion and adjusted explanatory/output text.
Type hints — Data module `zea/data/file.py`	Added return type annotations: `File.load_data()` -> `np.ndarray`; `load_file()` -> `(np.ndarray, Scan, Probe)`. No logic changes.
Type hints — Pipeline module `zea/ops/pipeline.py`	Imported typing (`Any`, `Dict`) and annotated `Pipeline.call(self, inputs) -> Dict[str, Any]` and `Pipeline.__call__(self, return_numpy=False, inputs) -> Dict[str, Any]`. Signatures only; no behavior changes.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

wesselvannierop

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name	Status	Explanation	Resolution
Title check	❓ Inconclusive	The title 'JOSS review `@tomelse`' is vague and does not clearly summarize the main changes in the changeset, which include notebook updates, type annotation additions, and pipeline signature modifications.	Consider using a more descriptive title that summarizes the primary changes, such as 'Add type hints and update documentation for JOSS review' or 'Address JOSS review feedback with type annotations and notebook updates'.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch feature/review-tomelse

📝 Coding Plan

Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

codecov · 2026-03-09T16:52:37Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

zea/ops/pipeline.py (1)
297-317: ⚠️ Potential issue | 🟠 Major

Handle the empty-pipeline case before returning outputs.

outputs is only assigned inside the loop, so Pipeline([]) or a config with operations: [] still crashes here with UnboundLocalError. Either reject empty pipelines in __init__, or return the inputs unchanged when there are no callable layers.
🩹 Local fix
     def call(self, **inputs) -> Dict[str, Any]:
         """Process input data through the pipeline."""
+        if not self._callable_layers:
+            return inputs
+
         for operation in self._callable_layers:
             try:
                 outputs = operation(**inputs)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@zea/ops/pipeline.py` around lines 297 - 317, The call method can raise
UnboundLocalError when self._callable_layers is empty because outputs is only
set inside the loop; fix by handling the empty-pipeline case up front or
ensuring outputs is initialized: either check if not self._callable_layers and
return inputs immediately, or initialize outputs = inputs before the for-loop in
Pipeline.call so that when no operations exist the original inputs are returned;
update references to self._callable_layers and the call method accordingly.
zea/data/file.py (1)
171-226: ⚠️ Potential issue | 🟡 Minor

Broaden the return type or disallow scalar indexing.

Line 175 declares -> np.ndarray, but data[indices] can return a NumPy scalar when every axis is indexed by an integer. Since the signature accepts arbitrary integer tuples, the type contract is narrower than the actual runtime behavior. Either restrict indices to array-producing cases, or widen the return type to include numpy.generic:
✏️ Typing fix
-    ) -> np.ndarray:
+    ) -> np.ndarray | np.generic:
Mirror the same widened type in load_file(...) for its first tuple element.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@zea/data/file.py` around lines 171 - 226, The return annotation of load_data
currently declares -> np.ndarray but indexing with indices (the indices
parameter of load_data) can yield a NumPy scalar (numpy.generic) when all axes
are integer-indexed; update the type signature to reflect this by widening the
return type to Union[np.ndarray, numpy.generic] (or np.ndarray | numpy.generic)
and adjust any imports/typing aliases as needed; additionally mirror this
widened first-tuple-element return type in the related load_file function so its
corresponding returned element also allows numpy.generic.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/source/notebooks/data/zea_data_example.ipynb`:
- Around line 383-387: The notebook documents that
zea.backend.tensorflow.make_dataloader requires TensorFlow but doesn't install
it; add a new notebook cell immediately before the dataloader import/usage that
installs TensorFlow (run pip install tensorflow) and capture its output so the
subsequent import of zea.backend.tensorflow and the call to make_dataloader
succeed in a fresh Colab environment.

---

Outside diff comments:
In `@zea/data/file.py`:
- Around line 171-226: The return annotation of load_data currently declares ->
np.ndarray but indexing with indices (the indices parameter of load_data) can
yield a NumPy scalar (numpy.generic) when all axes are integer-indexed; update
the type signature to reflect this by widening the return type to
Union[np.ndarray, numpy.generic] (or np.ndarray | numpy.generic) and adjust any
imports/typing aliases as needed; additionally mirror this widened
first-tuple-element return type in the related load_file function so its
corresponding returned element also allows numpy.generic.

In `@zea/ops/pipeline.py`:
- Around line 297-317: The call method can raise UnboundLocalError when
self._callable_layers is empty because outputs is only set inside the loop; fix
by handling the empty-pipeline case up front or ensuring outputs is initialized:
either check if not self._callable_layers and return inputs immediately, or
initialize outputs = inputs before the for-loop in Pipeline.call so that when no
operations exist the original inputs are returned; update references to
self._callable_layers and the call method accordingly.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 0d321eed-f81b-415d-afa8-10acc2699062

📥 Commits

Reviewing files that changed from the base of the PR and between 9aaa0c3 and 6a4b9c0.

📒 Files selected for processing (3)

docs/source/notebooks/data/zea_data_example.ipynb
zea/data/file.py
zea/ops/pipeline.py

docs/source/notebooks/data/zea_data_example.ipynb

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

docs/source/notebooks/data/zea_data_example.ipynb (1)

383-387: Consider tightening the TensorFlow note wording for Colab vs local runs.

The note is directionally correct; a small wording tweak would make environment expectations clearer and avoid ambiguity.

Suggested wording refinement

- "In machine and deep learning workflows, we often want more features like batching, shuffling, and parallel data loading. The `zea.backend.tensorflow.make_dataloader` function provides a convenient way to create a TensorFlow data loader from a zea  dataset. "
+ "In machine and deep learning workflows, we often want features like batching, shuffling, and parallel data loading. The `zea.backend.tensorflow.make_dataloader` function provides a convenient way to create a TensorFlow data loader from a zea dataset."

- "🚨 Note! This does require a working TensorFlow installation, but does work in combination with any other backend as well. We are [working on migrating](https://github.com/tue-bmd/zea/pull/256) to [Grain](https://github.com/google/grain), which will provide a backend-agnostic dataloader in the near future."
+ "🚨 Note: `make_dataloader` requires TensorFlow. In Colab, TensorFlow is available by default; for local environments, install TensorFlow first. It can still be used alongside other backends. We are [working on migrating](https://github.com/tue-bmd/zea/pull/256) to [Grain](https://github.com/google/grain), which will provide a backend-agnostic dataloader in the near future."

Based on learnings: In zea documentation notebooks intended for Colab, do not include a line like %pip install tensorflow since Colab already ships TensorFlow; prefer a brief note (optionally guarded import) instead.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@docs/source/notebooks/data/zea_data_example.ipynb` around lines 383 - 387,
The TensorFlow environment note around the
zea.backend.tensorflow.make_dataloader description is ambiguous for Colab vs
local runs; update the notebook text to a concise guarded-note: remove any
suggestion to run "%pip install tensorflow", instead state that Colab already
includes TensorFlow but local setups may need installation, and optionally show
a short guarded-import snippet (try/except ImportError with pip install only in
the except) as an alternative; edit the paragraph referencing make_dataloader to
use this tightened wording so readers know when installation is required without
prescribing unnecessary Colab installs.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/source/notebooks/data/zea_data_example.ipynb`:
- Line 12: The bullet text "3. Loading data in batches with dataloading
utilities with `zea.backend.tensorflow.make_dataloader` This does require a
working TensorFlow installation!" is a run-on with a duplicated "with" and
missing punctuation; update the string so it reads clearly (for example: "3.
Loading data in batches using the dataloader utility
`zea.backend.tensorflow.make_dataloader`. This requires a working TensorFlow
installation."). Locate and replace the sentence in the loading-options bullet
in the notebook cell that contains `zea.backend.tensorflow.make_dataloader`.

---

Nitpick comments:
In `@docs/source/notebooks/data/zea_data_example.ipynb`:
- Around line 383-387: The TensorFlow environment note around the
zea.backend.tensorflow.make_dataloader description is ambiguous for Colab vs
local runs; update the notebook text to a concise guarded-note: remove any
suggestion to run "%pip install tensorflow", instead state that Colab already
includes TensorFlow but local setups may need installation, and optionally show
a short guarded-import snippet (try/except ImportError with pip install only in
the except) as an alternative; edit the paragraph referencing make_dataloader to
use this tightened wording so readers know when installation is required without
prescribing unnecessary Colab installs.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 437080a4-76ea-42ab-b429-6e35ba3b1031

📥 Commits

Reviewing files that changed from the base of the PR and between 6a4b9c0 and 476d651.

📒 Files selected for processing (2)

docs/source/notebooks/data/zea_data_example.ipynb
zea/ops/pipeline.py

🚧 Files skipped from review as they are similar to previous changes (1)

zea/ops/pipeline.py

docs/source/notebooks/data/zea_data_example.ipynb

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/source/notebooks/data/zea_data_example.ipynb`:
- Around line 383-387: Edit the dataloader documentation paragraph (around the
zea.backend.tensorflow.make_dataloader mention) to fix the double-space and
rephrase the warning for clarity: change "🚨 Note! This does require a working
TensorFlow installation, but does work in combination with any other backend as
well. We are [working on migrating]..." to a concise sentence like "Note: a
working TensorFlow installation is required, though the dataloader can be used
alongside other backends; we are migrating to Grain for a backend-agnostic
dataloader (see PR link)." Ensure the double space is removed and the text
references zea.backend.tensorflow.make_dataloader and the Grain migration link.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: abd8edaf-8ad8-42ae-8ea8-cd91c7eac6f5

📥 Commits

Reviewing files that changed from the base of the PR and between 476d651 and 3fae77c.

📒 Files selected for processing (1)

docs/source/notebooks/data/zea_data_example.ipynb

coderabbitai · 2026-03-13T14:45:34Z

docs/source/notebooks/data/zea_data_example.ipynb

+    "In machine and deep learning workflows, we often want more features like batching, shuffling, and parallel data loading. The `zea.backend.tensorflow.make_dataloader` function provides a convenient way to create a TensorFlow data loader from a zea  dataset. \n",
+    "\n",
+    "🚨 Note! This does require a working TensorFlow installation, but does work in combination with any other backend as well. We are [working on migrating](https://github.com/tue-bmd/zea/pull/256) to [Grain](https://github.com/google/grain), which will provide a backend-agnostic dataloader in the near future.\n",
+    "\n",
+    "This dataloader is particularly useful for training models. It is important that there is some consistency in the dataset, which is not the case for [PICMUS](https://www.creatis.insa-lyon.fr/Challenge/IEEE_IUS_2016/home). Therefore in this example we will use a small part of the [CAMUS](https://www.creatis.insa-lyon.fr/Challenge/camus/) dataset."


⚠️ Potential issue | 🟡 Minor

Polish wording in the dataloader section for readability.

There’s a small double-space typo and slightly awkward phrasing in the warning line.

✏️ Suggested doc text cleanup

- In machine and deep learning workflows, we often want more features like batching, shuffling, and parallel data loading. The `zea.backend.tensorflow.make_dataloader` function provides a convenient way to create a TensorFlow data loader from a zea dataset. + In machine and deep learning workflows, we often want more features like batching, shuffling, and parallel data loading. The `zea.backend.tensorflow.make_dataloader` function provides a convenient way to create a TensorFlow data loader from a zea dataset. - 🚨 Note! This does require a working TensorFlow installation, but does work in combination with any other backend as well. We are [working on migrating](https://github.com/tue-bmd/zea/pull/256) to [Grain](https://github.com/google/grain), which will provide a backend-agnostic dataloader in the near future. + 🚨 Note: This requires a working TensorFlow installation, but it can be used in combination with other backends. We are [working on migrating](https://github.com/tue-bmd/zea/pull/256) to [Grain](https://github.com/google/grain), which will provide a backend-agnostic dataloader in the near future.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@docs/source/notebooks/data/zea_data_example.ipynb` around lines 383 - 387, Edit the dataloader documentation paragraph (around the zea.backend.tensorflow.make_dataloader mention) to fix the double-space and rephrase the warning for clarity: change "🚨 Note! This does require a working TensorFlow installation, but does work in combination with any other backend as well. We are [working on migrating]..." to a concise sentence like "Note: a working TensorFlow installation is required, though the dataloader can be used alongside other backends; we are migrating to Grain for a backend-agnostic dataloader (see PR link)." Ensure the double space is removed and the text references zea.backend.tensorflow.make_dataloader and the Grain migration link.

tristan-deep added 2 commits March 9, 2026 16:22

materialize jax computation notebook before last cell

19d9e4e

TF install note in data example notebook

ac20d22

tristan-deep added the documentation Improvements or additions to documentation label Mar 9, 2026

tristan-deep marked this pull request as draft March 9, 2026 16:38

fix type annotations (partly)

6a4b9c0

tristan-deep marked this pull request as ready for review March 10, 2026 11:48

coderabbitai bot reviewed Mar 10, 2026

View reviewed changes

docs/source/notebooks/data/zea_data_example.ipynb Show resolved Hide resolved

tristan-deep and others added 2 commits March 10, 2026 16:51

Merge branch 'main' into feature/review-tomelse

d5afdd3

Merge remote-tracking branch 'origin/main' into feature/review-tomelse

476d651

coderabbitai bot reviewed Mar 13, 2026

View reviewed changes

docs/source/notebooks/data/zea_data_example.ipynb Outdated Show resolved Hide resolved

tristan-deep and others added 2 commits March 13, 2026 14:32

typos

b63d363

Merge branch 'main' into feature/review-tomelse

3fae77c

coderabbitai bot reviewed Mar 13, 2026

View reviewed changes

Merge branch 'main' into feature/review-tomelse

2e5f99c

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

JOSS review @tomelse#283

JOSS review @tomelse#283
tristan-deep wants to merge 8 commits intomainfrom
feature/review-tomelse

tristan-deep commented Mar 9, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Mar 9, 2026 •

edited

Loading

❌ Failed checks (1 inconclusive)

Uh oh!

codecov bot commented Mar 9, 2026

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Mar 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

tristan-deep commented Mar 9, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Mar 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Suggested reviewers

❌ Failed checks (1 inconclusive)

Uh oh!

codecov bot commented Mar 9, 2026

Codecov Report

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 13, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

tristan-deep commented Mar 9, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Mar 9, 2026 •

edited

Loading