Commit aa41e7a
feat(examples): add segmentation & feature extraction examples (#33)
* feat(examples): add 3D monolayer segmentation notebook
Add notebook reproducing CellProfiler 3D monolayer segmentation
(BBBC034v1, Thirstrup et al. 2018) using cubic. Includes nuclei
and cell segmentation pipelines with AP evaluation against
CellProfiler reference labels.
Data auto-downloaded via pooch from GitHub release v0.7.0a1 assets.
Results: 25 nuclei (mAP 0.433), 24 cells (mAP 0.282).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(segmentation notebook): align pipeline with CellProfiler implementation
Cross-referenced CellProfiler source code to identify and fix 4 discrepancies
that improve segmentation quality:
1. Median filter: use cubic window (scipy median_filter size=5) instead of
   spherical ball(5) footprint; matches CP's MedianFilter module
2. Monolayer closing: use plane-by-plane disk(17) instead of 3D ball(17);
   matches CP's Closing module, which applies 2D structuring elements per slice
3. Multi-Otsu nbins: pass nbins=128 to match CellProfiler's default
4. Seed creation: erode downsized nuclei (ball(5) at 0.5x) with
   vanished-object protection; matches CP's ErodeObjects module
Results improved from baseline:
- Nuclei: 29 objects, mAP 0.468 (was 0.433)
- Cells: 25 objects, mAP 0.490 (was 0.282)
Discrepancies evaluated but not applied (hurt metrics):
- Cell watershed -EDT landscape (cells mAP -0.023)
- Cube footprint for nuclei watershed (nuc mAP -0.098)
- Nearest-neighbor downscale (hurt combined metrics)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
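The per-slice closing described in item 2 above can be sketched as follows. This is a minimal illustration that uses scipy.ndimage directly rather than cubic's wrappers; `disk_footprint` and `closing_planewise` are hypothetical helper names, a grayscale closing is assumed, the radius is reduced from 17 to 3, and the input is random data.

```python
import numpy as np
from scipy import ndimage as ndi


def disk_footprint(radius):
    """Boolean 2D disk footprint (hypothetical helper, built with plain NumPy)."""
    yy, xx = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    return (yy**2 + xx**2) <= radius**2


def closing_planewise(volume, radius):
    """Apply a 2D grayscale closing to each Z-slice independently."""
    footprint = disk_footprint(radius)
    out = np.empty_like(volume)
    for z in range(volume.shape[0]):
        # Each plane is closed on its own, so no information leaks across Z.
        out[z] = ndi.grey_closing(volume[z], footprint=footprint)
    return out


rng = np.random.default_rng(0)
vol = rng.random((4, 32, 32))  # (Z, Y, X) toy volume
closed = closing_planewise(vol, radius=3)
```

Because every plane is processed separately, changing one Z-slice cannot alter the result in any other slice, which is not true of a closing with a 3D ball footprint.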
* fix(segmentation notebook): use EDT-based cell watershed matching CellProfiler
Switch cell watershed from membrane intensity landscape to negated distance
transform of binary cell mask, matching CellProfiler's shape-based declumping.
This fix was re-evaluated on top of the 4 previous CP-alignment changes and
now improves cells mAP (+0.016) without affecting nuclei.
Results: nuclei mAP 0.468, cells mAP 0.506 (was 0.490).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
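A minimal sketch of the negated distance-transform landscape described above, assuming plain scipy.ndimage instead of cubic's wrappers and a synthetic spherical mask. The watershed call itself is omitted; the point is only that the landscape is lowest at the object's center, so flooding starts there and depends on shape rather than membrane intensity.

```python
import numpy as np
from scipy import ndimage as ndi

# Synthetic binary "cell" mask: a ball of radius 8 centered in a 21^3 volume.
zz, yy, xx = np.ogrid[:21, :21, :21]
mask = (zz - 10) ** 2 + (yy - 10) ** 2 + (xx - 10) ** 2 <= 8 ** 2

# Distance of every foreground voxel to the nearest background voxel.
dist = ndi.distance_transform_edt(mask)

# Negate so that object centers become basins for a watershed.
landscape = -dist

# The deepest point of the landscape is the center of the ball.
center = np.unravel_index(np.argmin(landscape), landscape.shape)
```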
* docs(segmentation notebook): add old vs new AP comparison plots and summary
Replace single AP plot with side-by-side comparison showing old (cubic_paper)
vs new AP curves for both nuclei and cells. Update summary table with
concrete numbers for CellProfiler, old pipeline, and new pipeline.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(segmentation notebook): add watershed seed dilation to reduce oversegmentation
Add binary_dilation(seeds, ball(1)) before labeling in nuclei watershed,
matching CellProfiler's Watershed module which dilates seeds to merge nearby
peaks before assigning labels.
This reduces nuclei oversegmentation (29 -> 27, closer to CP's 25) while
significantly improving mAP:
- Nuclei: 27 objects, mAP 0.561 (was 29 objects, mAP 0.468)
- Cells: 23 objects, mAP 0.558 (was 25 objects, mAP 0.506)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
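The effect of dilating seeds before labeling can be sketched as below. This is an illustration under assumptions, not the notebook code: it uses scipy.ndimage, toy seed positions, and `generate_binary_structure(3, 1)`, which is the 6-connected cross and has the same shape as a radius-1 ball.

```python
import numpy as np
from scipy import ndimage as ndi

# Three single-voxel seeds along X: two close together, one far away.
seeds = np.zeros((11, 11, 16), dtype=bool)
seeds[5, 5, 5] = True
seeds[5, 5, 7] = True   # two voxels from the first peak
seeds[5, 5, 12] = True  # well separated from the others

# Labeling the raw seeds gives one marker per peak.
_, n_raw = ndi.label(seeds)

# Dilating with a radius-1 cross makes the two nearby peaks touch,
# so they receive a single label and seed one object instead of two.
cross = ndi.generate_binary_structure(3, 1)
dilated = ndi.binary_dilation(seeds, structure=cross)
markers, n_merged = ndi.label(dilated)
```

Here the raw seeds yield three markers and the dilated seeds yield two, which is the mechanism by which oversegmentation is reduced.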
* fix(segmentation notebook): use mode="constant" for median filter matching CP
CellProfiler's MedianFilter passes mode="constant" (zero-padding) to
scipy.ndimage.median_filter, while we were using the default "reflect".
Zero-padding darkens border pixels, making them fall below the Otsu
threshold and producing a cleaner binary mask with fewer edge artifacts.
Results: nuclei 26 (was 27), mAP 0.698 (was 0.561).
cells 22, mAP 0.579 (was 0.558).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
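The border behavior of the two modes can be illustrated on a uniform toy volume. This sketch calls scipy.ndimage.median_filter directly (the notebook may route it through cubic's proxy) and uses made-up data.

```python
import numpy as np
from scipy import ndimage as ndi

# Uniform volume: any change after filtering comes from boundary handling.
vol = np.ones((9, 9, 9), dtype=np.float32)

# Default "reflect" mirrors the image at the border, so nothing changes.
reflect = ndi.median_filter(vol, size=5, mode="reflect")

# "constant" pads with cval (zero here); near corners and edges most of the
# 5x5x5 window lies in the padding, so the median drops to zero.
constant = ndi.median_filter(vol, size=5, mode="constant", cval=0.0)

corner_reflect = float(reflect[0, 0, 0])
corner_constant = float(constant[0, 0, 0])
interior_constant = float(constant[4, 4, 4])
```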
* fix(segmentation notebook): fix label colormap to show distinct colors per object
The old ListedColormap approach mapped labels 1-26 to the first ~10% of
tab20's range, causing most labels to share the same 2-3 colors. Replace
with a label_cmap() function that maps each label to a distinct tab20
color using modular arithmetic.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(segmentation notebook): use CP-correct hole fill size for cell membrane mask
CellProfiler's RemoveHoles(size=20) interprets 20 as diameter, converting
to sphere volume: pi * (4/3) * 10^3 = 4189 voxels. We were using
area_threshold=20 (only 20 voxels), leaving membrane fragments inside cell
interiors unfilled. This produced a noisier cell mask with more internal
gaps, leading to fragmented watershed results.
Results: cells mAP 0.588 (was 0.579), nuclei unchanged at 0.698/26.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
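The diameter-to-volume conversion quoted above works out as follows. The helper name `diameter_to_sphere_volume` is hypothetical and only reproduces the arithmetic from the message.

```python
import math


def diameter_to_sphere_volume(diameter):
    """Volume (in voxels) of a sphere with the given diameter."""
    radius = diameter / 2
    return 4 / 3 * math.pi * radius ** 3


# RemoveHoles(size=20): 20 is a diameter, so the radius is 10.
area_threshold = round(diameter_to_sphere_volume(20))
```

The result is roughly 200 times larger than the value of 20 voxels used previously, which is why many interior membrane fragments were left unfilled before the fix.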
* fix(segmentation notebook): fix Z-slice index for downscaled nuclei visualization
rescale_xy only downscales XY (Z unchanged), so the downscaled binary and
watershed images should use z_mid (not z_mid//2) for the mid-slice.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
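Why the mid-slice index stays the same can be sketched with a stand-in for rescale_xy. `rescale_xy_sketch` is a hypothetical helper built on scipy.ndimage.zoom and is not cubic's implementation; it only mimics the shape behavior described above.

```python
import numpy as np
from scipy import ndimage as ndi


def rescale_xy_sketch(volume, factor):
    """Rescale Y and X by `factor`, leaving the Z axis untouched."""
    return ndi.zoom(volume, (1, factor, factor), order=1)


vol = np.zeros((6, 8, 8), dtype=np.float32)  # (Z, Y, X)
small = rescale_xy_sketch(vol, 0.5)

# Z is unchanged, so the same mid-slice index applies to both volumes.
z_mid = vol.shape[0] // 2
mid_full = vol[z_mid]
mid_small = small[z_mid]  # not small[z_mid // 2]
```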
* docs(segmentation notebook): clean up plots and remove old pipeline references
- Remove old/new AP comparison, replace with single clean AP curve plot
- Merge nuclei and cells slice comparisons into one 4-row figure
- Remove cubic_paper/old pipeline references from summary table
- Simplify summary to compare only cubic vs CellProfiler
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(segmentation notebook): clean up imports
- Remove unused ListedColormap import
- Replace direct skimage imports (peak_local_max, watershed, resize) with
cubic.skimage equivalents (feature, segmentation, transform)
- Remove duplicate import of resize in cell 8
- All skimage functions now imported through cubic's device-agnostic proxy
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(segmentation notebook): fill small holes per nucleus after cleanup
After watershed + upscaling + cleanup, some nuclei have small internal
holes (up to 340 voxels) visible as gaps in the XY mid-slice. Add
per-nucleus remove_small_holes(area_threshold=500) to fill these.
Nuclei mAP: 0.709 (was 0.698), AP@IoU=1.0: 0.378 (was 0.308).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs(segmentation notebook): merge AP curves into comparison figure
Remove standalone AP plot cell and add AP curves as 4th column in the
combined nuclei/cells comparison figure. Each AP curve sits next to its
corresponding XY/XZ mask comparisons.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(segmentation notebook): replace scipy imports with cubic wrappers
- Replace scipy.ndimage.median_filter with cubic.scipy.ndimage.median_filter
- Replace scipy.ndimage.distance_transform_edt with cubic.image_utils.distance_transform_edt
- No direct scipy imports remain — all go through cubic's device-agnostic proxies
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(segmentation notebook): remove unnecessary asnumpy/to_device calls
Let cubic's device-agnostic proxies handle CPU/GPU routing instead of
manually converting arrays. Removed:
- asnumpy + to_device around ndimage.median_filter (proxy handles it)
- asnumpy(nuc_binary) before distance_transform_edt/peak_local_max
- to_device after segmentation.watershed (proxy returns on correct device)
- asnumpy(nuclei) before transform.resize (proxy handles it)
- asnumpy(cell_mask/seeds) before watershed (proxy handles it)
- asnumpy in hole-fill loop (boolean indexing works on both devices)
Kept: asnumpy(coords) for np.zeros seed creation (needs CPU fancy indexing),
asnumpy in planewise closing loop (np.zeros_like needs CPU), asnumpy for
matplotlib visualization, asnumpy(old_labels) for Python enumerate loop.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs(segmentation notebook): update description to emphasize CPU/GPU portability
Reframe the notebook intro to highlight cubic's device-agnostic API as the
main point, with the CellProfiler reproduction as the demonstration use case.
Add links to the CellProfiler tutorial website and GitHub repository.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(segmentation notebook): correct CellProfiler tutorial URL
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(segmentation notebook): fix GPU compatibility for watershed and footprints
- watershed is not in cucim — explicitly asnumpy inputs before calling
segmentation.watershed, then to_device the result back
- morphology.ball() returns CPU arrays — use to_same_device for footprints
passed to cucim morphology operations (binary_dilation, peak_local_max)
- Add get_device import for preserving device across CPU fallbacks
Verified: runs on both CPU and GPU with identical results.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs(segmentation notebook): add CPU vs GPU runtime to summary
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(docs): correct dataset attribution, remove incorrect BBBC034 reference
The tutorial data (60x256x256, 3 channels: membrane/mito/DNA) does not
match BBBC034 (1024x1024x52, 4 channels: CellMask/GFP/DNA/brightfield).
The data is from the Allen Institute for Cell Science, provided with
the CellProfiler 3D monolayer tutorial.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(segmentation notebook): consolidate imports and add closing comment
- Move perf_counter and get_device imports to cell 1 with other imports
- Add comment explaining why planewise closing requires CPU roundtrip
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(segmentation notebook): correct CPU vs GPU runtime numbers
Benchmark with proper warmup: nuclei 0.2s GPU vs 5.4s CPU (30x speedup),
cells ~3.2s on both (watershed/closing CPU-bound), total 3.3s vs 9.0s (2.7x).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(docs): clarify why cell pipeline runs on CPU
Watershed is not in cuCIM; planewise closing runs on CPU by design to
match CellProfiler's per-slice 2D behavior, not because of a fallback.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(segmentation notebook): run planewise closing on GPU instead of CPU
Keep mono_ds on GPU and call morphology.closing per Z-slice with a GPU
disk footprint via to_same_device, instead of asnumpy → CPU loop → to_device.
Cell segmentation: 1.86s (was 3.49s on GPU, ~1.9x faster).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(segment_utils): add dilate_seeds parameter to segment_watershed
Add dilate_seeds option that dilates seed points with ball(1) before
labeling, matching CellProfiler's watershed seed dilation behavior.
Also fix GPU compatibility: use to_same_device for footprints and
asnumpy for watershed inputs (not in cucim).
Simplifies notebook nuclei watershed from 12 lines of inline code to:
segment_watershed(nuc_binary, ball_size=10, dilate_seeds=True)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(segment_utils): add filter_mode to downscale_and_filter, use rescale_xy
- Add filter_mode parameter for boundary handling (default "nearest");
when filter_shape="square", uses scipy.ndimage.median_filter with the
specified mode instead of skimage.filters.median
- Fix: use rescale_xy instead of transform.rescale to only downscale XY
(preserving Z dimension for 3D images)
- Notebook nuclei step 2 simplified to:
downscale_and_filter(dna_norm, filter_size=5, filter_mode="constant")
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(segment_utils): add downscale_xy_only param to downscale_and_filter
Allow choosing between XY-only downscaling (default, preserves Z) and
uniform downscaling of all dimensions (downscale_xy_only=False).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(segment_utils): add mask param to segment_watershed for cell pipeline
When markers and mask are both provided, segment_watershed now computes
the negated EDT of the mask as the watershed landscape (shape-based
partitioning), matching CellProfiler's declumping behavior.
Notebook cell pipeline simplified from 6 lines of inline EDT+watershed to:
segment_watershed(cell_mask, markers=seeds, mask=cell_mask)
Also removed unused cubic.skimage.segmentation import from notebook.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(segmentation notebook): use cleanup_segmentation max_hole_size param
Replace manual per-label hole-fill loop with the existing max_hole_size
parameter of cleanup_segmentation:
cleanup_segmentation(nuclei, min_obj_size=50, max_hole_size=500)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(segmentation notebook): revert cell cleanup to manual relabeling
cleanup_segmentation calls label() which splits disconnected pieces of
the same cell into separate objects (22 -> 29 cells, mAP 0.588 -> 0.465).
The manual relabeling preserves watershed label identity across
disconnected 3D pieces.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
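The difference between connected-component relabeling and identity-preserving relabeling can be sketched on a toy label image. This uses scipy.ndimage.label on the foreground mask as a stand-in for the label() call inside cleanup_segmentation, and np.unique as one possible way to relabel sequentially; neither is taken from the notebook.

```python
import numpy as np
from scipy import ndimage as ndi

# Toy label image: cell 7 has two disconnected pieces, cell 12 has one.
labels = np.zeros((1, 5, 9), dtype=np.int32)
labels[0, 1:4, 0:2] = 7    # piece A of cell 7
labels[0, 1:4, 7:9] = 7    # piece B of cell 7, not touching piece A
labels[0, 1:4, 3:6] = 12   # cell 12

# Connected-component labeling assigns a new ID to every piece.
_, n_components = ndi.label(labels > 0)

# Sequential relabeling by original ID keeps both pieces of cell 7 together.
ids, inverse = np.unique(labels, return_inverse=True)
relabeled = inverse.reshape(labels.shape)
n_objects = len(ids) - 1  # exclude background (0)
```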
* feat: add feature extraction notebook, examples extra, notebook CI workflow
- Add examples/notebooks/feature_extraction_3d.ipynb: GPU-accelerated
regionprops_table on cells3d (18 objects, 136 features)
- Add examples/scripts/generated CI workflow: auto-converts notebooks to
scripts on push to main
- Add [examples] optional dependency group (jupyter, pandas, pooch)
- Fix cubic.feature.voxel: convert spacing list to tuple for cucim compat
- Remove redundant examples/scripts/regionprops_example.py
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: remove pandas dependency
Replace pandas DataFrame usage in feature_extraction_3d notebook with
plain numpy dict + formatted print. Remove pandas from [examples] extra.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address code review and Copilot feedback on PR #33
Code review fixes:
- Fix regionprops() missing spacing tuple conversion for cucim compat
- Remove unnecessary asnumpy(distance).shape — .shape works on both devices
Copilot feedback fixes:
- Make downscale_xy_only/filter_mode keyword-only in downscale_and_filter
to preserve positional arg compatibility
- Make mask/dilate_seeds keyword-only in segment_watershed, keep ball_size
as 3rd positional arg for backward compat
- Replace assert with ValueError for filter_shape validation
- Fix numpy scalar formatting in feature_extraction_3d notebook
(np.issubdtype check instead of isinstance(x, float))
- Use uv sync + uv run in notebooks CI workflow (matches lint-format.yml)
- Remove dangling [tool.uv.sources] iohub entry from pyproject.toml
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
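The numpy scalar formatting issue mentioned above can be illustrated as follows. `format_value` is a hypothetical helper, not the notebook's function; it shows why an `isinstance(x, float)` check misses some NumPy scalars while an `np.issubdtype` check does not.

```python
import numpy as np


def format_value(x):
    """Format floating-point values to 3 decimals, everything else via str()."""
    if np.issubdtype(np.asarray(x).dtype, np.floating):
        return f"{float(x):.3f}"
    return str(x)


x32 = np.float32(1.5)

# np.float32 is not a subclass of Python's float, so isinstance misses it.
missed_by_isinstance = not isinstance(x32, float)

formatted_float32 = format_value(x32)
formatted_int = format_value(np.int64(3))
```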
* fix(segment_utils): remove uint8 truncation in cleanup_segmentation
cleanup_segmentation was casting label() output to uint8, silently
truncating labels >255 to 0. Now returns the native int32/int64 dtype
from label(). Callers that need a specific dtype already cast explicitly.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
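A small sketch of what a uint8 cast does to label IDs above 255, assuming the cast behaves like NumPy's `astype`. Under that assumption the values wrap modulo 256, so label 256 becomes background (0) and higher labels collide with existing ones.

```python
import numpy as np

# Label IDs as they might come out of label() on a crowded image.
labels = np.array([1, 255, 256, 300], dtype=np.int32)

# Casting to uint8 silently wraps values above 255.
as_uint8 = labels.astype(np.uint8)

# Keeping the native integer dtype preserves every ID.
preserved = labels.copy()
```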
* refactor(segmentation notebook): remove redundant uint16 cast after cleanup_segmentation
cleanup_segmentation now returns uint16 natively.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add feature extraction notebook to README examples table
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Alexandr Kalinin <alxndrkalinin@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 8c97f84 commit aa41e7a
File tree
9 files changed, +1392 -46 lines changed
- .github/workflows
- cubic
- feature
- segmentation
- examples
- data
- notebooks