Conversation
Walkthrough
Added return type annotations to Pipeline and data-file public methods; updated the zea_data_example notebook to clarify TensorFlow dependency, add
Changes
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
Codecov Report
✅ All modified and coverable lines are covered by tests.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
zea/ops/pipeline.py (1)
297-317: ⚠️ Potential issue | 🟠 Major
Handle the empty-pipeline case before returning `outputs`.
`outputs` is only assigned inside the loop, so `Pipeline([])` or a config with `operations: []` still crashes here with `UnboundLocalError`. Either reject empty pipelines in `__init__`, or return the inputs unchanged when there are no callable layers.
🩹 Local fix
     def call(self, **inputs) -> Dict[str, Any]:
         """Process input data through the pipeline."""
    +    if not self._callable_layers:
    +        return inputs
    +
         for operation in self._callable_layers:
             try:
                 outputs = operation(**inputs)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@zea/ops/pipeline.py` around lines 297 - 317, The call method can raise UnboundLocalError when self._callable_layers is empty because outputs is only set inside the loop; fix by handling the empty-pipeline case up front or ensuring outputs is initialized: either check if not self._callable_layers and return inputs immediately, or initialize outputs = inputs before the for-loop in Pipeline.call so that when no operations exist the original inputs are returned; update references to self._callable_layers and the call method accordingly.
zea/data/file.py (1)
171-226: ⚠️ Potential issue | 🟡 Minor
Broaden the return type or disallow scalar indexing.
Line 175 declares `-> np.ndarray`, but `data[indices]` can return a NumPy scalar when every axis is indexed by an integer. Since the signature accepts arbitrary integer tuples, the type contract is narrower than the actual runtime behavior. Either restrict `indices` to array-producing cases, or widen the return type to include `numpy.generic`:
✏️ Typing fix
    - ) -> np.ndarray:
    + ) -> np.ndarray | np.generic:
Mirror the same widened type in `load_file(...)` for its first tuple element.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@zea/data/file.py` around lines 171 - 226, The return annotation of load_data currently declares -> np.ndarray but indexing with indices (the indices parameter of load_data) can yield a NumPy scalar (numpy.generic) when all axes are integer-indexed; update the type signature to reflect this by widening the return type to Union[np.ndarray, numpy.generic] (or np.ndarray | numpy.generic) and adjust any imports/typing aliases as needed; additionally mirror this widened first-tuple-element return type in the related load_file function so its corresponding returned element also allows numpy.generic.
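For reference, the scalar-versus-array behavior described in the `zea/data/file.py` finding can be reproduced with plain NumPy. This is a minimal sketch, not the actual `load_data` implementation: the `take` helper and the `IndexResult` alias are hypothetical names introduced only for illustration, and an in-memory array stands in for the data file.

```python
from typing import Union

import numpy as np

# Hypothetical alias for the widened return type suggested in the review.
IndexResult = Union[np.ndarray, np.generic]


def take(data: np.ndarray, indices) -> IndexResult:
    """Hypothetical helper that mirrors the `data[indices]` pattern."""
    return data[indices]


data = np.arange(6, dtype=np.float32).reshape(2, 3)

# Slicing at least one axis keeps an array.
row = take(data, (0, slice(None)))

# Indexing every axis with an integer yields a NumPy scalar instead.
scalar = take(data, (0, 1))

print(type(row).__name__, type(scalar).__name__)
```

The second result is an instance of `np.generic` and not of `np.ndarray`, which is why an annotation of `-> np.ndarray` alone is narrower than the runtime behavior.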
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/source/notebooks/data/zea_data_example.ipynb`:
- Around line 383-387: The notebook documents that
zea.backend.tensorflow.make_dataloader requires TensorFlow but doesn't install
it; add a new notebook cell immediately before the dataloader import/usage that
installs TensorFlow (run pip install tensorflow) and capture its output so the
subsequent import of zea.backend.tensorflow and the call to make_dataloader
succeed in a fresh Colab environment.
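The empty-pipeline finding for `zea/ops/pipeline.py` can be illustrated with a small stand-in class. This is a sketch under stated assumptions: `MiniPipeline`, `call_unguarded`, `call_guarded`, and `double` are hypothetical names, and chaining each operation's outputs into the next call is assumed here for illustration rather than taken from the real `Pipeline.call`.

```python
from typing import Any, Callable, Dict, List


class MiniPipeline:
    """Hypothetical stand-in for a pipeline; not the real zea class."""

    def __init__(self, operations: List[Callable[..., Dict[str, Any]]]):
        self._callable_layers = list(operations)

    def call_unguarded(self, **inputs) -> Dict[str, Any]:
        # `outputs` is only bound inside the loop body.
        for operation in self._callable_layers:
            outputs = operation(**inputs)
            inputs = outputs  # assumed chaining, for illustration only
        return outputs  # raises UnboundLocalError when there are no layers

    def call_guarded(self, **inputs) -> Dict[str, Any]:
        # Return the inputs unchanged when there is nothing to run.
        if not self._callable_layers:
            return inputs
        for operation in self._callable_layers:
            outputs = operation(**inputs)
            inputs = outputs
        return outputs


def double(x, **kwargs) -> Dict[str, Any]:
    """Toy operation that doubles `x` and passes other keys through."""
    return {"x": 2 * x, **kwargs}


empty = MiniPipeline([])
try:
    empty.call_unguarded(x=1)
except UnboundLocalError as err:
    print("unguarded call failed:", type(err).__name__)

print(empty.call_guarded(x=1))
```

The alternative named in the review, rejecting empty pipelines in `__init__`, trades this pass-through behavior for an early, explicit error.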
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 0d321eed-f81b-415d-afa8-10acc2699062
📒 Files selected for processing (3)
- docs/source/notebooks/data/zea_data_example.ipynb
- zea/data/file.py
- zea/ops/pipeline.py
Actionable comments posted: 1
🧹 Nitpick comments (1)
docs/source/notebooks/data/zea_data_example.ipynb (1)
383-387: Consider tightening the TensorFlow note wording for Colab vs local runs.
The note is directionally correct; a small wording tweak would make environment expectations clearer and avoid ambiguity.
Suggested wording refinement
    - "In machine and deep learning workflows, we often want more features like batching, shuffling, and parallel data loading. The `zea.backend.tensorflow.make_dataloader` function provides a convenient way to create a TensorFlow data loader from a zea dataset. "
    + "In machine and deep learning workflows, we often want features like batching, shuffling, and parallel data loading. The `zea.backend.tensorflow.make_dataloader` function provides a convenient way to create a TensorFlow data loader from a zea dataset."
    - "🚨 Note! This does require a working TensorFlow installation, but does work in combination with any other backend as well. We are [working on migrating](https://github.com/tue-bmd/zea/pull/256) to [Grain](https://github.com/google/grain), which will provide a backend-agnostic dataloader in the near future."
    + "🚨 Note: `make_dataloader` requires TensorFlow. In Colab, TensorFlow is available by default; for local environments, install TensorFlow first. It can still be used alongside other backends. We are [working on migrating](https://github.com/tue-bmd/zea/pull/256) to [Grain](https://github.com/google/grain), which will provide a backend-agnostic dataloader in the near future."
Based on learnings: In zea documentation notebooks intended for Colab, do not include a line like %pip install tensorflow since Colab already ships TensorFlow; prefer a brief note (optionally guarded import) instead.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/source/notebooks/data/zea_data_example.ipynb` around lines 383 - 387, The TensorFlow environment note around the zea.backend.tensorflow.make_dataloader description is ambiguous for Colab vs local runs; update the notebook text to a concise guarded-note: remove any suggestion to run "%pip install tensorflow", instead state that Colab already includes TensorFlow but local setups may need installation, and optionally show a short guarded-import snippet (try/except ImportError with pip install only in the except) as an alternative; edit the paragraph referencing make_dataloader to use this tightened wording so readers know when installation is required without prescribing unnecessary Colab installs.
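One possible shape for the guarded check mentioned in this comment is sketched below. It is a variant of the suggested try/except: it uses the standard library's `importlib.util.find_spec` to test for TensorFlow without importing it, and it only prints a hint rather than installing anything. The helper names `tensorflow_available` and `dataloader_hint` are hypothetical and do not exist in zea.

```python
import importlib.util


def tensorflow_available() -> bool:
    """Return True if a `tensorflow` package can be located, without importing it."""
    return importlib.util.find_spec("tensorflow") is not None


def dataloader_hint() -> str:
    """Build a short message for the notebook reader (hypothetical helper)."""
    if tensorflow_available():
        return "TensorFlow found: the TensorFlow-based dataloader section can be run."
    return (
        "TensorFlow not found: install it first "
        "(for example with `pip install tensorflow`) to run the dataloader section."
    )


print(dataloader_hint())
```

In a Colab session this would be expected to take the first branch, so no install cell is needed there; a local environment without TensorFlow gets an instruction instead of an import traceback.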
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/source/notebooks/data/zea_data_example.ipynb`:
- Line 12: The bullet text "3. Loading data in batches with dataloading
utilities with `zea.backend.tensorflow.make_dataloader` This does require a
working TensorFlow installation!" is a run-on with a duplicated "with" and
missing punctuation; update the string so it reads clearly (for example: "3.
Loading data in batches using the dataloader utility
`zea.backend.tensorflow.make_dataloader`. This requires a working TensorFlow
installation."). Locate and replace the sentence in the loading-options bullet
in the notebook cell that contains `zea.backend.tensorflow.make_dataloader`.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 437080a4-76ea-42ab-b429-6e35ba3b1031
📒 Files selected for processing (2)
- docs/source/notebooks/data/zea_data_example.ipynb
- zea/ops/pipeline.py
🚧 Files skipped from review as they are similar to previous changes (1)
- zea/ops/pipeline.py
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/source/notebooks/data/zea_data_example.ipynb`:
- Around line 383-387: Edit the dataloader documentation paragraph (around the
zea.backend.tensorflow.make_dataloader mention) to fix the double-space and
rephrase the warning for clarity: change "🚨 Note! This does require a working
TensorFlow installation, but does work in combination with any other backend as
well. We are [working on migrating]..." to a concise sentence like "Note: a
working TensorFlow installation is required, though the dataloader can be used
alongside other backends; we are migrating to Grain for a backend-agnostic
dataloader (see PR link)." Ensure the double space is removed and the text
references zea.backend.tensorflow.make_dataloader and the Grain migration link.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: abd8edaf-8ad8-42ae-8ea8-cd91c7eac6f5
📒 Files selected for processing (1)
docs/source/notebooks/data/zea_data_example.ipynb
"In machine and deep learning workflows, we often want more features like batching, shuffling, and parallel data loading. The `zea.backend.tensorflow.make_dataloader` function provides a convenient way to create a TensorFlow data loader from a zea dataset. \n",
"\n",
"🚨 Note! This does require a working TensorFlow installation, but does work in combination with any other backend as well. We are [working on migrating](https://github.com/tue-bmd/zea/pull/256) to [Grain](https://github.com/google/grain), which will provide a backend-agnostic dataloader in the near future.\n",
"\n",
"This dataloader is particularly useful for training models. It is important that there is some consistency in the dataset, which is not the case for [PICMUS](https://www.creatis.insa-lyon.fr/Challenge/IEEE_IUS_2016/home). Therefore in this example we will use a small part of the [CAMUS](https://www.creatis.insa-lyon.fr/Challenge/camus/) dataset."
Polish wording in the dataloader section for readability.
There’s a small double-space typo and slightly awkward phrasing in the warning line.
✏️ Suggested doc text cleanup
- In machine and deep learning workflows, we often want more features like batching, shuffling, and parallel data loading. The `zea.backend.tensorflow.make_dataloader` function provides a convenient way to create a TensorFlow data loader from a zea dataset.
+ In machine and deep learning workflows, we often want more features like batching, shuffling, and parallel data loading. The `zea.backend.tensorflow.make_dataloader` function provides a convenient way to create a TensorFlow data loader from a zea dataset.
- 🚨 Note! This does require a working TensorFlow installation, but does work in combination with any other backend as well. We are [working on migrating](https://github.com/tue-bmd/zea/pull/256) to [Grain](https://github.com/google/grain), which will provide a backend-agnostic dataloader in the near future.
+ 🚨 Note: This requires a working TensorFlow installation, but it can be used in combination with other backends. We are [working on migrating](https://github.com/tue-bmd/zea/pull/256) to [Grain](https://github.com/google/grain), which will provide a backend-agnostic dataloader in the near future.
This PR addresses the JOSS review comments and feedback by @tomelse. See a summary of the review here: openjournals/joss-reviews#9881 (comment)
And the related issues:
Please find an updated version of the docs right here: https://zea--283.org.readthedocs.build/en/283/