
feat(examples): add segmentation & feature extraction examples#33

Merged
alxndrkalinin merged 36 commits into main from feat/segmentation-3d-monolayer-notebook on Mar 12, 2026

@alxndrkalinin alxndrkalinin commented Mar 11, 2026

Summary

  • Add example notebook reproducing the CellProfiler 3D monolayer segmentation tutorial using cubic's device-agnostic API
  • Add example notebook for GPU-accelerated 3D feature extraction with regionprops_table
  • Enhance segment_utils.py with CellProfiler-aligned parameters (dilate_seeds, mask, filter_mode, downscale_xy_only)
  • Add CI workflow to auto-convert notebooks to scripts on push to main
  • Fix cleanup_segmentation uint8 truncation (was silently corrupting labels >255)
  • Fix cucim spacing bug in regionprops/regionprops_table (list→tuple conversion)

The same code runs on both CPU and GPU without modification.

Segmentation results

| Metric | CellProfiler | cubic |
| --- | --- | --- |
| Nuclei count | 25 | 26 |
| Cell count | 24 | 22 |
| Nuclei mAP (IoU 0.5-1.0) | -- | 0.709 |
| Cells mAP (IoU 0.5-1.0) | -- | 0.588 |

CellProfiler alignment

Pipeline parameters cross-referenced against CellProfiler's source code:

  • Cube-window median filter (size=5) with mode="constant" (CP's MedianFilter)
  • Watershed seed dilation with ball(1) (CP's Watershed)
  • Plane-by-plane disk(17) closing (CP's Closing — 2D per Z-slice)
  • Multi-Otsu with nbins=128 (CP default)
  • RemoveHoles diameter-to-volume conversion (size=20 -> area_threshold=4189)
  • Seed erosion at 0.5x with vanished-object protection (CP's ErodeObjects)
  • EDT-based watershed landscape (CP's shape declumping)
  • Per-nucleus hole filling after upscale

Files changed

| File | Change |
| --- | --- |
| examples/notebooks/segmentation_3d_monolayer.ipynb | New: 3D segmentation notebook |
| examples/notebooks/feature_extraction_3d.ipynb | New: 3D feature extraction notebook |
| cubic/segmentation/segment_utils.py | Enhanced: segment_watershed (dilate_seeds, mask), downscale_and_filter (filter_mode, downscale_xy_only), cleanup_segmentation (uint8->uint16) |
| cubic/feature/voxel.py | Fix: spacing list->tuple for cucim compat |
| .github/workflows/notebooks.yml | New: auto-convert notebooks to scripts |
| pyproject.toml | New [examples] extra (jupyter, pooch) |
| README.md, examples/data/README.md | Add notebook links and data docs |

Data

Five files (3 raw channels + 2 CellProfiler reference labels) are uploaded as GitHub release assets on v0.7.0a1 and are auto-downloaded via pooch on first run.

Test plan

  • Segmentation notebook executes on CPU and GPU with identical results
  • Feature extraction notebook executes on GPU (18 objects, 136 features)
  • 108 pytest tests pass (1 pre-existing cellpose skip)
  • ruff check and ruff format --check pass
  • All Copilot review comments addressed
  • All imports use cubic.* wrappers

🤖 Generated with Claude Code

alxndrkalinin and others added 20 commits March 10, 2026 17:26
Add notebook reproducing CellProfiler 3D monolayer segmentation
(BBBC034v1, Thirstrup et al. 2018) using cubic. Includes nuclei
and cell segmentation pipelines with AP evaluation against
CellProfiler reference labels.

Data auto-downloaded via pooch from GitHub release v0.7.0a1 assets.

Results: 25 nuclei (mAP 0.433), 24 cells (mAP 0.282).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tation

Cross-referenced CellProfiler source code to identify and fix 4 discrepancies
that improve segmentation quality:

1. Median filter: use cubic window (scipy median_filter size=5) instead of
   spherical ball(5) footprint — matches CP's MedianFilter module
2. Monolayer closing: use plane-by-plane disk(17) instead of 3D ball(17) —
   matches CP's Closing module which applies 2D structuring elements per-slice
3. Multi-Otsu nbins: pass nbins=128 to match CellProfiler's default
4. Seed creation: erode downsized nuclei (ball(5) at 0.5x) with
   vanished-object protection — matches CP's ErodeObjects module

Results improved from baseline:
- Nuclei: 29 objects, mAP 0.468 (was 0.433)
- Cells: 25 objects, mAP 0.490 (was 0.282)

Discrepancies evaluated but not applied (hurt metrics):
- Cell watershed -EDT landscape (cells mAP -0.023)
- Cube footprint for nuclei watershed (nuc mAP -0.098)
- Nearest-neighbor downscale (hurt combined metrics)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
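
Item 4 above (seed erosion with vanished-object protection) can be illustrated with a minimal SciPy sketch. The function name `erode_labels_protected` is hypothetical, and falling back to the un-eroded object is a simplifying assumption; it is not necessarily how CellProfiler's ErodeObjects or the notebook handles objects that would disappear.

```python
import numpy as np
from scipy import ndimage as ndi


def erode_labels_protected(labels, footprint):
    """Erode each labeled object; keep the original if erosion erases it."""
    eroded = np.zeros_like(labels)
    for lab in np.unique(labels):
        if lab == 0:  # skip background
            continue
        obj = labels == lab
        shrunk = ndi.binary_erosion(obj, structure=footprint)
        # Vanished-object protection (assumed behavior): if nothing is left
        # after erosion, fall back to the un-eroded object.
        eroded[shrunk if shrunk.any() else obj] = lab
    return eroded


labels = np.zeros((1, 9, 9), dtype=np.int32)
labels[0, 1:6, 1:6] = 1  # 5x5 object survives a 3x3 erosion
labels[0, 7, 7] = 2      # single voxel would vanish without protection
footprint = np.ones((1, 3, 3), dtype=bool)
seeds = erode_labels_protected(labels, footprint)
```

Without the fallback, label 2 would be dropped and the corresponding cell would receive no watershed seed.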
…lProfiler

Switch cell watershed from membrane intensity landscape to negated distance
transform of binary cell mask, matching CellProfiler's shape-based declumping.

This fix was re-evaluated on top of the 4 previous CP-alignment changes and
now improves cells mAP (+0.016) without affecting nuclei.

Results: nuclei mAP 0.468, cells mAP 0.506 (was 0.490).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ummary

Replace single AP plot with side-by-side comparison showing old (cubic_paper)
vs new AP curves for both nuclei and cells. Update summary table with
concrete numbers for CellProfiler, old pipeline, and new pipeline.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…rsegmentation

Add binary_dilation(seeds, ball(1)) before labeling in nuclei watershed,
matching CellProfiler's Watershed module which dilates seeds to merge nearby
peaks before assigning labels.

This reduces nuclei oversegmentation (29 -> 27, closer to CP's 25) while
significantly improving mAP:
- Nuclei: 27 objects, mAP 0.561 (was 29 objects, mAP 0.468)
- Cells:  23 objects, mAP 0.558 (was 25 objects, mAP 0.506)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
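
The effect of dilating seeds before labeling can be sketched with plain SciPy. This is an illustration on CPU arrays, not the notebook's code; the 6-connected 3x3x3 cross from `generate_binary_structure` is used here as the equivalent of `ball(1)`.

```python
import numpy as np
from scipy import ndimage as ndi

# Two seed peaks one voxel apart plus one distant peak.
seeds = np.zeros((5, 5, 9), dtype=bool)
seeds[2, 2, 1] = True
seeds[2, 2, 3] = True
seeds[2, 2, 7] = True

# 6-connected 3x3x3 cross, the same shape as a radius-1 ball.
ball1 = ndi.generate_binary_structure(rank=3, connectivity=1)

# Labeling the raw peaks gives one marker per peak.
_, n_plain = ndi.label(seeds)

# Dilating first lets nearby peaks touch, so they share one marker.
_, n_dilated = ndi.label(ndi.binary_dilation(seeds, structure=ball1))
```

Here the two close peaks merge into a single marker while the distant one stays separate, which is the mechanism that reduces oversegmentation.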
…ching CP

CellProfiler's MedianFilter passes mode="constant" (zero-padding) to
scipy.ndimage.median_filter, while we were using the default "reflect".
Zero-padding darkens border pixels, making them fall below the Otsu
threshold and producing a cleaner binary mask with fewer edge artifacts.

Results: nuclei 26 (was 27), mAP 0.698 (was 0.561).
         cells 22, mAP 0.579 (was 0.558).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
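
The border effect described above can be reproduced on a toy volume with `scipy.ndimage.median_filter`. The all-ones input is an assumption chosen to make the difference easy to see; it is not the tutorial data.

```python
import numpy as np
from scipy import ndimage as ndi

# Uniformly bright volume: any change after filtering comes from padding.
volume = np.ones((6, 6, 6), dtype=np.float32)

# Default-style padding mirrors the image, so borders stay bright.
reflect = ndi.median_filter(volume, size=5, mode="reflect")

# Zero padding: near corners and edges most of the 5x5x5 window is padding,
# so the median drops to 0 there.
constant = ndi.median_filter(volume, size=5, mode="constant", cval=0.0)
```

Interior voxels are identical in both outputs; only voxels whose window is dominated by padding are darkened.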
…s per object

The old ListedColormap approach mapped labels 1-26 to the first ~10% of
tab20's range, causing most labels to share the same 2-3 colors. Replace
with a label_cmap() function that maps each label to a distinct tab20
color using modular arithmetic.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
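
The modular mapping can be sketched without any plotting dependency. The helper name `label_color_index` is hypothetical, and the notebook's `label_cmap()` may be implemented differently; only the index arithmetic is shown.

```python
def label_color_index(label, n_colors=20):
    """Map a label id to a palette slot; 0 is reserved for background."""
    if label == 0:
        return None
    # Labels 1..n_colors get distinct slots; later labels cycle around.
    return (label - 1) % n_colors


indices = [label_color_index(lab) for lab in range(1, 27)]
```

With a 20-color palette such as tab20, the first 20 labels get distinct colors and labels 21 to 26 reuse slots 0 to 5.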
…mbrane mask

CellProfiler's RemoveHoles(size=20) interprets 20 as diameter, converting
to sphere volume: pi * (4/3) * 10^3 = 4189 voxels. We were using
area_threshold=20 (only 20 voxels), leaving membrane fragments inside cell
interiors unfilled. This produced a noisier cell mask with more internal
gaps, leading to fragmented watershed results.

Results: cells mAP 0.588 (was 0.579), nuclei unchanged at 0.698/26.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
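
The diameter-to-volume conversion above is the sphere volume formula. A small sketch, with the hypothetical helper name `diameter_to_volume`:

```python
import math


def diameter_to_volume(diameter):
    """Volume of a sphere with the given diameter, in voxels."""
    radius = diameter / 2.0
    return (4.0 / 3.0) * math.pi * radius ** 3


# size=20 is read as a diameter, so the threshold is a sphere of radius 10.
area_threshold = round(diameter_to_volume(20))
```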
…isualization

rescale_xy only downscales XY (Z unchanged), so the downscaled binary and
watershed images should use z_mid (not z_mid//2) for the mid-slice.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…eferences

- Remove old/new AP comparison, replace with single clean AP curve plot
- Merge nuclei and cells slice comparisons into one 4-row figure
- Remove cubic_paper/old pipeline references from summary table
- Simplify summary to compare only cubic vs CellProfiler

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove unused ListedColormap import
- Replace direct skimage imports (peak_local_max, watershed, resize) with
  cubic.skimage equivalents (feature, segmentation, transform)
- Remove duplicate import of resize in cell 8
- All skimage functions now imported through cubic's device-agnostic proxy

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
After watershed + upscaling + cleanup, some nuclei have small internal
holes (up to 340 voxels) visible as gaps in the XY mid-slice. Add
per-nucleus remove_small_holes(area_threshold=500) to fill these.

Nuclei mAP: 0.709 (was 0.698), AP@IoU=1.0: 0.378 (was 0.308).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove standalone AP plot cell and add AP curves as 4th column in the
combined nuclei/cells comparison figure. Each AP curve sits next to its
corresponding XY/XZ mask comparisons.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ppers

- Replace scipy.ndimage.median_filter with cubic.scipy.ndimage.median_filter
- Replace scipy.ndimage.distance_transform_edt with cubic.image_utils.distance_transform_edt
- No direct scipy imports remain — all go through cubic's device-agnostic proxies

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… calls

Let cubic's device-agnostic proxies handle CPU/GPU routing instead of
manually converting arrays. Removed:
- asnumpy + to_device around ndimage.median_filter (proxy handles it)
- asnumpy(nuc_binary) before distance_transform_edt/peak_local_max
- to_device after segmentation.watershed (proxy returns on correct device)
- asnumpy(nuclei) before transform.resize (proxy handles it)
- asnumpy(cell_mask/seeds) before watershed (proxy handles it)
- asnumpy in hole-fill loop (boolean indexing works on both devices)

Kept: asnumpy(coords) for np.zeros seed creation (needs CPU fancy indexing),
asnumpy in planewise closing loop (np.zeros_like needs CPU), asnumpy for
matplotlib visualization, asnumpy(old_labels) for Python enumerate loop.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…portability

Reframe the notebook intro to highlight cubic's device-agnostic API as the
main point, with the CellProfiler reproduction as the demonstration use case.
Add links to the CellProfiler tutorial website and GitHub repository.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ootprints

- watershed is not in cucim — explicitly asnumpy inputs before calling
  segmentation.watershed, then to_device the result back
- morphology.ball() returns CPU arrays — use to_same_device for footprints
  passed to cucim morphology operations (binary_dilation, peak_local_max)
- Add get_device import for preserving device across CPU fallbacks

Verified: runs on both CPU and GPU with identical results.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…rence

The tutorial data (60x256x256, 3 channels: membrane/mito/DNA) does not
match BBBC034 (1024x1024x52, 4 channels: CellMask/GFP/DNA/brightfield).
The data is from the Allen Institute for Cell Science, provided with
the CellProfiler 3D monolayer tutorial.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Contributor

@sourcery-ai sourcery-ai bot left a comment


Sorry @alxndrkalinin, your pull request is larger than the review limit of 150000 diff characters

@alxndrkalinin alxndrkalinin changed the title from "Feat/segmentation 3d monolayer notebook" to "feat(examples): add 3D monolayer segmentation notebook" on Mar 11, 2026
@alxndrkalinin alxndrkalinin requested a review from Copilot March 11, 2026 21:52

Copilot AI left a comment


Pull request overview

Adds documentation entries for a new example notebook that reproduces CellProfiler’s 3D monolayer segmentation tutorial using cubic, including dataset notes and a link from the main examples table.

Changes:

  • Documented the 3D monolayer dataset files and provenance in examples/data/README.md
  • Added the new 3D monolayer segmentation notebook to the main README.md examples table

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| examples/data/README.md | Adds a new dataset section describing the 3D monolayer TIFF files and their source/download location |
| README.md | Adds the new segmentation notebook link to the examples table |


alxndrkalinin and others added 5 commits March 11, 2026 15:03
…comment

- Move perf_counter and get_device imports to cell 1 with other imports
- Add comment explaining why planewise closing requires CPU roundtrip

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Benchmark with proper warmup: nuclei 0.2s GPU vs 5.4s CPU (30x speedup),
cells ~3.2s on both (watershed/closing CPU-bound), total 3.3s vs 9.0s (2.7x).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
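
A timing helper with warmup runs might look like the sketch below. The name `benchmark` and its parameters are hypothetical and not taken from the notebook. On a GPU, the device would also need to be synchronized before the timer is stopped, because kernels are launched asynchronously; that step is omitted here.

```python
from time import perf_counter


def benchmark(fn, *args, warmup=1, repeats=3, **kwargs):
    """Return the best wall-clock time of fn after untimed warmup calls."""
    # Warmup calls absorb one-time costs (imports, JIT, memory pools).
    for _ in range(warmup):
        fn(*args, **kwargs)
    timings = []
    for _ in range(repeats):
        start = perf_counter()
        fn(*args, **kwargs)
        timings.append(perf_counter() - start)
    return min(timings)


calls = []
best = benchmark(lambda: calls.append(1), warmup=2, repeats=3)
```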
Watershed is not in cuCIM; planewise closing runs on CPU by design to
match CellProfiler's per-slice 2D behavior, not because of a fallback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Keep mono_ds on GPU and call morphology.closing per Z-slice with a GPU
disk footprint via to_same_device, instead of asnumpy → CPU loop → to_device.

Cell segmentation: 1.86s (was 3.49s on GPU, ~1.9x faster).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
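
Per-slice closing can be sketched on CPU with SciPy as a stand-in for the GPU `morphology.closing` call described above. The helpers `disk` and `closing_planewise` are hypothetical names, and `grey_closing` is swapped in for the cubic/cuCIM function.

```python
import numpy as np
from scipy import ndimage as ndi


def disk(radius):
    """Boolean 2D disk footprint of the given radius."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return x * x + y * y <= radius * radius


def closing_planewise(volume, radius):
    """Apply a 2D grayscale closing to each Z-slice independently."""
    footprint = disk(radius)
    out = np.empty_like(volume)
    for z in range(volume.shape[0]):
        out[z] = ndi.grey_closing(volume[z], footprint=footprint)
    return out


volume = np.zeros((3, 11, 11), dtype=np.float32)
volume[1] = 1.0
volume[1, 5, 5] = 0.0  # small dark hole in the middle slice
volume[2] = 1.0
closed = closing_planewise(volume, radius=2)
```

The hole in the middle slice is filled, while the empty first slice stays empty because no information crosses Z.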
Add dilate_seeds option that dilates seed points with ball(1) before
labeling, matching CellProfiler's watershed seed dilation behavior.
Also fix GPU compatibility: use to_same_device for footprints and
asnumpy for watershed inputs (not in cucim).

Simplifies notebook nuclei watershed from 12 lines of inline code to:
  segment_watershed(nuc_binary, ball_size=10, dilate_seeds=True)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
alxndrkalinin and others added 7 commits March 11, 2026 15:37
… rescale_xy

- Add filter_mode parameter for boundary handling (default "nearest");
  when filter_shape="square", uses scipy.ndimage.median_filter with the
  specified mode instead of skimage.filters.median
- Fix: use rescale_xy instead of transform.rescale to only downscale XY
  (preserving Z dimension for 3D images)
- Notebook nuclei step 2 simplified to:
    downscale_and_filter(dna_norm, filter_size=5, filter_mode="constant")

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Allow choosing between XY-only downscaling (default, preserves Z) and
uniform downscaling of all dimensions (downscale_xy_only=False).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
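
The difference between the two modes shows up in the output shape. The sketch below uses plain striding as a stand-in for interpolated rescaling, and the function `downscale` is hypothetical; only the `downscale_xy_only` flag name comes from the commit.

```python
import numpy as np


def downscale(volume, factor=2, downscale_xy_only=True):
    """Subsample a ZYX volume by striding (no interpolation)."""
    if downscale_xy_only:
        # Keep every Z-slice; shrink only Y and X.
        return volume[:, ::factor, ::factor]
    return volume[::factor, ::factor, ::factor]


volume = np.zeros((60, 256, 256), dtype=np.float32)
xy_only = downscale(volume)
uniform = downscale(volume, downscale_xy_only=False)
z_mid = volume.shape[0] // 2
```

With XY-only downscaling the mid-slice index is unchanged, so `z_mid` indexes both the original and the downscaled volume.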
… pipeline

When markers and mask are both provided, segment_watershed now computes
the negated EDT of the mask as the watershed landscape (shape-based
partitioning), matching CellProfiler's declumping behavior.

Notebook cell pipeline simplified from 6 lines of inline EDT+watershed to:
  segment_watershed(cell_mask, markers=seeds, mask=cell_mask)

Also removed unused cubic.skimage.segmentation import from notebook.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
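
The watershed landscape described above is the negated Euclidean distance transform of the mask. A minimal SciPy sketch of the landscape alone, on a toy 2D mask (an assumption for readability; the pipeline works in 3D):

```python
import numpy as np
from scipy import ndimage as ndi

mask = np.zeros((7, 15), dtype=bool)
mask[1:6, 1:14] = True

# Distance to the nearest background pixel, negated so that object centers
# become the deepest basins of the landscape.
landscape = -ndi.distance_transform_edt(mask)
```

A watershed run on this landscape with the seeds as markers and the same mask would split the foreground by shape rather than by membrane intensity.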
…ze param

Replace manual per-label hole-fill loop with the existing max_hole_size
parameter of cleanup_segmentation:
  cleanup_segmentation(nuclei, min_obj_size=50, max_hole_size=500)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cleanup_segmentation calls label() which splits disconnected pieces of
the same cell into separate objects (22 -> 29 cells, mAP 0.588 -> 0.465).
The manual relabeling preserves watershed label identity across
disconnected 3D pieces.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
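
The difference between connected-component labeling and identity-preserving relabeling can be shown on a toy label image. The helper `relabel_keep_identity` is a hypothetical sketch, not the notebook's exact loop.

```python
import numpy as np
from scipy import ndimage as ndi


def relabel_keep_identity(labels):
    """Renumber labels to 1..N without splitting disconnected pieces."""
    ids = np.unique(labels)
    ids = ids[ids != 0]  # drop background
    out = np.zeros_like(labels)
    for new_id, old_id in enumerate(ids, start=1):
        out[labels == old_id] = new_id
    return out


labels = np.zeros((1, 3, 7), dtype=np.int32)
labels[0, 1, 0:2] = 5
labels[0, 1, 3:5] = 5  # second, disconnected piece of the same cell
labels[0, 1, 6] = 9

relabeled = relabel_keep_identity(labels)

# Connected-component labeling counts every piece as its own object.
_, n_components = ndi.label(labels > 0)
```

Relabeling yields two objects, whereas connected-component labeling reports three.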
…rkflow

- Add examples/notebooks/feature_extraction_3d.ipynb: GPU-accelerated
  regionprops_table on cells3d (18 objects, 136 features)
- Add examples/scripts/generated CI workflow: auto-converts notebooks to
  scripts on push to main
- Add [examples] optional dependency group (jupyter, pandas, pooch)
- Fix cubic.feature.voxel: convert spacing list to tuple for cucim compat
- Remove redundant examples/scripts/regionprops_example.py

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace pandas DataFrame usage in feature_extraction_3d notebook with
plain numpy dict + formatted print. Remove pandas from [examples] extra.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
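
Printing a dict of NumPy arrays without pandas could be sketched as follows. The feature names and values are made up for illustration, and `format_value` is a hypothetical helper. The dtype check via `np.issubdtype` matters because `np.float32` scalars are not instances of Python's `float`.

```python
import numpy as np


def format_value(value, precision=3):
    """Format NumPy or Python scalars; floats get fixed precision."""
    dtype = np.asarray(value).dtype
    if np.issubdtype(dtype, np.floating):
        return f"{float(value):.{precision}f}"
    return str(value)


# Hypothetical feature table: one array per column.
features = {
    "label": np.array([1, 2], dtype=np.int32),
    "area": np.array([1500.0, 982.25], dtype=np.float32),
}

header = "  ".join(features)
rows = [
    "  ".join(format_value(column[i]) for column in features.values())
    for i in range(2)
]
```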

Copilot AI left a comment


Pull request overview

Copilot reviewed 7 out of 9 changed files in this pull request and generated 9 comments.


@alxndrkalinin
Owner Author

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

🤖 Generated with Claude Code


alxndrkalinin and others added 2 commits March 12, 2026 12:12
Code review fixes:
- Fix regionprops() missing spacing tuple conversion for cucim compat
- Remove unnecessary asnumpy(distance).shape — .shape works on both devices

Copilot feedback fixes:
- Make downscale_xy_only/filter_mode keyword-only in downscale_and_filter
  to preserve positional arg compatibility
- Make mask/dilate_seeds keyword-only in segment_watershed, keep ball_size
  as 3rd positional arg for backward compat
- Replace assert with ValueError for filter_shape validation
- Fix numpy scalar formatting in feature_extraction_3d notebook
  (np.issubdtype check instead of isinstance(x, float))
- Use uv sync + uv run in notebooks CI workflow (matches lint-format.yml)
- Remove dangling [tool.uv.sources] iohub entry from pyproject.toml

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
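
The keyword-only and ValueError changes can be illustrated with a stub. The signature below is hypothetical: the positional parameters, their defaults, and the set of allowed `filter_shape` values are assumptions, and the body only echoes its settings instead of filtering.

```python
def downscale_and_filter_sketch(
    image,
    downscale_factor=0.5,
    filter_size=3,
    filter_shape="square",
    *,  # everything after this marker must be passed by keyword
    downscale_xy_only=True,
    filter_mode="nearest",
):
    """Hypothetical stub; only the parameter handling is illustrated."""
    allowed_shapes = ("square", "ball")  # assumed set of valid values
    if filter_shape not in allowed_shapes:
        # Unlike assert, this check is not stripped by `python -O`.
        raise ValueError(
            f"filter_shape must be one of {allowed_shapes}, got {filter_shape!r}"
        )
    return {
        "filter_shape": filter_shape,
        "downscale_xy_only": downscale_xy_only,
        "filter_mode": filter_mode,
    }


settings = downscale_and_filter_sketch(None, 0.5, 5, filter_mode="constant")
```

Because the new options sit after `*`, existing positional calls keep their meaning and a stray fifth positional argument fails loudly with a TypeError.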
cleanup_segmentation was casting label() output to uint8, silently
truncating labels >255 to 0. Now returns the native int32/int64 dtype
from label(). Callers that need a specific dtype already cast explicitly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
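
How a narrow label dtype corrupts large label ids can be reproduced with NumPy. This assumes the cast was a plain `astype`, which is not confirmed by the commit; with that kind of cast, values above 255 wrap modulo 256, so 256 turns into background and 257 collides with label 1.

```python
import numpy as np

labels = np.array([0, 1, 255, 256, 257, 300], dtype=np.int32)

# Unsafe cast to 8 bits: values above 255 wrap around modulo 256.
as_uint8 = labels.astype(np.uint8)

# 16 bits hold label ids up to 65535 without loss.
as_uint16 = labels.astype(np.uint16)
```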
@alxndrkalinin alxndrkalinin force-pushed the feat/segmentation-3d-monolayer-notebook branch from 844a117 to 1db8308 on March 12, 2026 20:15
alxndrkalinin and others added 2 commits March 12, 2026 13:18
…leanup_segmentation

cleanup_segmentation now returns uint16 natively.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@alxndrkalinin alxndrkalinin changed the title from "feat(examples): add 3D monolayer segmentation notebook" to "feat(examples): add segmentation & feature extraction examples" on Mar 12, 2026
@alxndrkalinin alxndrkalinin merged commit aa41e7a into main Mar 12, 2026
9 checks passed
@alxndrkalinin alxndrkalinin deleted the feat/segmentation-3d-monolayer-notebook branch March 12, 2026 20:30
2 participants