Conversation
… single-channel cytosol masks
…move list materialization)
There was a problem hiding this comment.
Pull request overview
This PR improves the segmentation workflow configuration and sharding mechanics, aiming to make segmentation more flexible (custom channel selection) and safer (replace eval/CSV sharding metadata with JSON), while also improving multiprocessing throughput and updating related documentation.
Changes:
- Add support for specifying nucleus/cytosol segmentation channels as either an
intor a list of indices in_BaseSegmentation. - Replace shard window + sharding plan persistence from
*.csv+eval(...)to JSON serialization/deserialization. - Switch multiprocessing execution to
imap_unorderedfor segmentation and extraction, and update docs/examples accordingly.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
src/scportrait/pipeline/segmentation/workflows/_base_segmentation_workflow.py |
Adds flexible channel selection inputs and updates how channels are extracted for nucleus/cytosol segmentation. |
src/scportrait/pipeline/segmentation/segmentation.py |
Introduces JSON sharding plan/window persistence, adjusts sharding plan computation, and changes multiprocessing mapping strategy. |
src/scportrait/pipeline/extraction.py |
Uses imap_unordered to stream extraction results instead of materializing them in memory. |
docs/pages/workflow/segmentation_workflow.rst |
Fixes YAML examples and documents new channel-selection config keys and sharded workflow examples. |
docs/pages/workflow/config.rst |
Expands config documentation introduction and provides a clearer example. |
Comments suppressed due to low confidence (3)
src/scportrait/pipeline/segmentation/workflows/_base_segmentation_workflow.py:138
input_imagehas already been subset toself.segmentation_channelsbySegmentation._select_relevant_channels(). Indexing it withself.nucleus_segmentation_channel/self.cytosol_segmentation_channel(which currently hold original channel IDs from the full image) will raise out-of-bounds errors or select the wrong channel order when users setsegmentation_channel_nuclei/segmentation_channel_cytosol. These channel selectors need to be remapped to indices within the selected subset (similar to howcombine_*_channelsare remapped) before being used here.
else:
nucleus_channel = input_image[self.nucleus_segmentation_channel]
if len(nucleus_channel.shape) == 2:
nucleus_channel = nucleus_channel[np.newaxis, ...]
if len(nucleus_channel.shape) == 4:
nucleus_channel = nucleus_channel.squeeze()
src/scportrait/pipeline/segmentation/workflows/_base_segmentation_workflow.py:94
self.segmentation_channels = list(set(self.segmentation_channels))does not “sort according to order” and produces a non-deterministic channel ordering. Since later logic relies on consistent channel positions after subsetting (e.g., mapping nucleus/cytosol channels within the subset), this can cause incorrect channel selection across runs. Use a deterministic de-duplication strategy that preserves order (or explicitly sort).
# remove any duplicate entries and sort according to order
self.segmentation_channels = list(set(self.segmentation_channels))
src/scportrait/pipeline/segmentation/segmentation.py:827
- Calling
Warning("Sharding plans do not match.")does not emit a warning (it just creates an exception object). Usewarnings.warn(...)to actually surface the mismatch, or raise an exception if continuing would be unsafe.
if window_local != window:
Warning("Sharding plans do not match.")
self.log("Sharding plans do not match.")
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
src/scportrait/pipeline/segmentation/workflows/_base_segmentation_workflow.py
Show resolved
Hide resolved
src/scportrait/pipeline/segmentation/workflows/_base_segmentation_workflow.py
Outdated
Show resolved
Hide resolved
…nnLabs/scPortrait into improve_segmentation_workflow
for more information, see https://pre-commit.ci
fixes #323