Skip to content

Improve segmentation workflow#385

Open
sophiamaedler wants to merge 29 commits intomainfrom
improve_segmentation_workflow
Open

Improve segmentation workflow#385
sophiamaedler wants to merge 29 commits intomainfrom
improve_segmentation_workflow

Conversation

@sophiamaedler
Copy link
Collaborator

@sophiamaedler sophiamaedler commented Mar 14, 2026

fixes #323

Copilot AI review requested due to automatic review settings March 14, 2026 16:29
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves the segmentation workflow configuration and sharding mechanics, aiming to make segmentation more flexible (custom channel selection) and safer (replace eval/CSV sharding metadata with JSON), while also improving multiprocessing throughput and updating related documentation.

Changes:

  • Add support for specifying nucleus/cytosol segmentation channels as either an int or a list of indices in _BaseSegmentation.
  • Replace shard window + sharding plan persistence from *.csv + eval(...) to JSON serialization/deserialization.
  • Switch multiprocessing execution to imap_unordered for segmentation and extraction, and update docs/examples accordingly.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
src/scportrait/pipeline/segmentation/workflows/_base_segmentation_workflow.py Adds flexible channel selection inputs and updates how channels are extracted for nucleus/cytosol segmentation.
src/scportrait/pipeline/segmentation/segmentation.py Introduces JSON sharding plan/window persistence, adjusts sharding plan computation, and changes multiprocessing mapping strategy.
src/scportrait/pipeline/extraction.py Uses imap_unordered to stream extraction results instead of materializing them in memory.
docs/pages/workflow/segmentation_workflow.rst Fixes YAML examples and documents new channel-selection config keys and sharded workflow examples.
docs/pages/workflow/config.rst Expands config documentation introduction and provides a clearer example.
Comments suppressed due to low confidence (3)

src/scportrait/pipeline/segmentation/workflows/_base_segmentation_workflow.py:138

  • input_image has already been subset to self.segmentation_channels by Segmentation._select_relevant_channels(). Indexing it with self.nucleus_segmentation_channel / self.cytosol_segmentation_channel (which currently hold original channel IDs from the full image) will raise out-of-bounds errors or select the wrong channel order when users set segmentation_channel_nuclei / segmentation_channel_cytosol. These channel selectors need to be remapped to indices within the selected subset (similar to how combine_*_channels are remapped) before being used here.
            else:
                nucleus_channel = input_image[self.nucleus_segmentation_channel]
                if len(nucleus_channel.shape) == 2:
                    nucleus_channel = nucleus_channel[np.newaxis, ...]
                if len(nucleus_channel.shape) == 4:
                    nucleus_channel = nucleus_channel.squeeze()

src/scportrait/pipeline/segmentation/workflows/_base_segmentation_workflow.py:94

  • self.segmentation_channels = list(set(self.segmentation_channels)) does not “sort according to order” and produces a non-deterministic channel ordering. Since later logic relies on consistent channel positions after subsetting (e.g., mapping nucleus/cytosol channels within the subset), this can cause incorrect channel selection across runs. Use a deterministic de-duplication strategy that preserves order (or explicitly sort).
        # remove any duplicate entries and sort according to order
        self.segmentation_channels = list(set(self.segmentation_channels))

src/scportrait/pipeline/segmentation/segmentation.py:827

  • Calling Warning("Sharding plans do not match.") does not emit a warning (it just creates an exception object). Use warnings.warn(...) to actually surface the mismatch, or raise an exception if continuing would be unsafe.
            if window_local != window:
                Warning("Sharding plans do not match.")
                self.log("Sharding plans do not match.")

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

differences between segmentation report and cell extraction in scportrait 1.4.0.

2 participants