Differences between segmentation report and cell extraction in scportrait 1.4.0 #323

@machth

Describe the bug
scportrait 1.4.0 reports different cell counts in the segmentation workflow and the extraction workflow: project.segment() reports 29127 filtered classes, while project.extract() reports 29546 unique cells to extract. In addition, the cell count is about 14% lower than with scportrait 1.3.5.

Segmentation log from running project.segment():
[...]
[04/09/2025 16:57:22] Stitching tile 8
WARNING:ome_zarr.io:version mismatch: detected: RasterFormatV02, requested: FormatV04
[04/09/2025 16:57:22] Time taken to cleanup overlapping shard regions for shard 8: 0.23852157592773438s
[04/09/2025 16:57:22] Number of classes contained in shard after processing: 1800
[04/09/2025 16:57:22] Number of Ids in filtered_classes after adding shard 8: 29127
[04/09/2025 16:57:22] Finished stitching tile 8 in 0.47019386291503906 seconds.
[04/09/2025 16:57:22] Number of filtered classes in Dataset: 29127
[04/09/2025 16:57:22] Filtering status for this segmentation is set to True.
[04/09/2025 16:57:22] Filtering has been performed during segmentation. Nucleus and Cytosol IDs match. No additional steps are required.
[04/09/2025 16:57:22] Saved cell_id classes to file /Users/mcthielert/Documents/github/ZZ_Notebooks/250224_Script_development/20250722temporar_LucasProcess_TestScript_DVP088_30610456_scene1/segmentation/classes.csv.
[04/09/2025 16:57:22] resolved sharding plan.
WARNING:ome_zarr.io:version mismatch: detected: RasterFormatV02, requested: FormatV04
[04/09/2025 16:57:29] Segmentation seg_all_cytosol written to sdata object.
WARNING:ome_zarr.io:version mismatch: detected: RasterFormatV02, requested: FormatV04
WARNING:ome_zarr.io:version mismatch: detected: RasterFormatV02, requested: FormatV04
WARNING:ome_zarr.io:version mismatch: detected: RasterFormatV02, requested: FormatV04
[04/09/2025 16:57:41] Points centers_seg_all_cytosol written to sdata object.
[04/09/2025 16:57:41] Cleaned up temporary directory at <TemporaryDirectory '/var/folders/nb/j6j3kq9x7yj_5mw8f5fjy60m0000gn/T/./ShardedCytosolOnlySegmentationCellpose_kj388scg'>
[04/09/2025 16:57:41] finished saving segmentation results to sdata object for sharded segmentation.
[04/09/2025 16:57:41] Deleting intermediate tile results to free up storage space
WARNING:ome_zarr.io:version mismatch: detected: RasterFormatV02, requested: FormatV04
WARNING:ome_zarr.io:version mismatch: detected: RasterFormatV02, requested: FormatV04
WARNING:ome_zarr.io:version mismatch: detected: RasterFormatV02, requested: FormatV04
[04/09/2025 16:57:42] Deleting intermediate tile results to free up storage space
[04/09/2025 16:57:42] === completed sharded segmentation ===

Extraction log from running project.extract():
[...]
WARNING:ome_zarr.io:version mismatch: detected: RasterFormatV02, requested: FormatV04
WARNING:ome_zarr.io:version mismatch: detected: RasterFormatV02, requested: FormatV04
WARNING:ome_zarr.io:version mismatch: detected: RasterFormatV02, requested: FormatV04
WARNING:ome_zarr.io:version mismatch: detected: RasterFormatV02, requested: FormatV04
[04/09/2025 16:57:44] Initialized temporary directory at /var/folders/nb/j6j3kq9x7yj_5mw8f5fjy60m0000gn/T/./HDF5CellExtraction_befbrj2q for HDF5CellExtraction
[04/09/2025 16:57:44] Created new directory for extraction results: /Users/mcthielert/Documents/github/ZZ_Notebooks/250224_Script_development/20250722temporar_LucasProcess_TestScript_DVP088_30610456_scene1/extraction/data
[04/09/2025 16:57:44] Setup output folder at /Users/mcthielert/Documents/github/ZZ_Notebooks/250224_Script_development/20250722temporar_LucasProcess_TestScript_DVP088_30610456_scene1/extraction/data
WARNING:ome_zarr.io:version mismatch: detected: RasterFormatV02, requested: FormatV04
WARNING:ome_zarr.io:version mismatch: detected: RasterFormatV02, requested: FormatV04
[04/09/2025 16:57:45] Found 1 segmentation masks for the given key in the sdata object. Will be extracting single-cell images based on these masks: ['seg_all_cytosol']
[04/09/2025 16:57:45] Using seg_all_cytosol as the main segmentation mask to determine cell centers.
WARNING:ome_zarr.io:version mismatch: detected: RasterFormatV02, requested: FormatV04
[04/09/2025 16:57:45] A total of 0 cells were too close to the image border to be extracted. Their cell_ids were saved to file /Users/mcthielert/Documents/github/ZZ_Notebooks/250224_Script_development/20250722temporar_LucasProcess_TestScript_DVP088_30610456_scene1/extraction/data/removed_classes.csv.
[04/09/2025 16:57:45] Container for single-cell data created.
[04/09/2025 16:57:45] Extraction Details:
[04/09/2025 16:57:45] --------------------------------
[04/09/2025 16:57:45] Number of input image channels: 6
[04/09/2025 16:57:45] Number of segmentation masks used during extraction: 1
[04/09/2025 16:57:45] Number of generated output images per cell: 7
[04/09/2025 16:57:45] Number of unique cells to extract: 29546
[04/09/2025 16:57:45] Extracted Image Dimensions: 128 x 128
[04/09/2025 16:57:45] Normalization of extracted images: False
[04/09/2025 16:57:45] Percentile normalization range for single-cell images: ('None', 'None')
[04/09/2025 16:57:45] Starting single-cell image extraction of 29546 cells...
[04/09/2025 16:57:45] Loading input images to memory mapped arrays...
WARNING:ome_zarr.io:version mismatch: detected: RasterFormatV02, requested: FormatV04
[04/09/2025 16:57:57] Finished transferring data to memory mapped arrays. Time taken: 11.77 seconds.
[04/09/2025 16:57:57] Using batch size of 370 for multiprocessing.
[04/09/2025 16:57:57] Running in multiprocessing mode with 80 threads.
Extracting cell batches: 100%|██████████| 80/80 [00:16<00:00, 4.96it/s]
[04/09/2025 16:58:21] Finished extraction in 24.34 seconds (1213.68 cells / second)
WARNING:ome_zarr.io:version mismatch: detected: RasterFormatV02, requested: FormatV04
WARNING:ome_zarr.io:version mismatch: detected: RasterFormatV02, requested: FormatV04
[04/09/2025 16:58:22] Benchmarking times saved to file.
[04/09/2025 16:58:22] Cleaned up temporary directory at <TemporaryDirectory '/var/folders/nb/j6j3kq9x7yj_5mw8f5fjy60m0000gn/T/./HDF5CellExtraction_befbrj2q'>
WARNING:ome_zarr.io:version mismatch: detected: RasterFormatV02, requested: FormatV04
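
To make the mismatch explicit, here is a minimal sketch that pulls the two counts out of the log lines quoted above and computes their difference. The helper name `trailing_count` is hypothetical and not part of scportrait; the two input strings are copied verbatim from the logs.

```python
import re

# Two lines copied verbatim from the logs above.
SEGMENTATION_LINE = "[04/09/2025 16:57:22] Number of filtered classes in Dataset: 29127"
EXTRACTION_LINE = "[04/09/2025 16:57:45] Number of unique cells to extract: 29546"


def trailing_count(line: str) -> int:
    """Return the integer that follows the last colon at the end of a log line.

    Hypothetical helper for this report; it is not a scportrait function.
    """
    match = re.search(r":\s*(\d+)\s*$", line)
    if match is None:
        raise ValueError(f"no trailing count in line: {line!r}")
    return int(match.group(1))


n_segmented = trailing_count(SEGMENTATION_LINE)
n_extracted = trailing_count(EXTRACTION_LINE)

# Positive value: extraction lists more cells than segmentation reported.
difference = n_extracted - n_segmented
relative = 100 * difference / n_segmented

print(f"segmentation: {n_segmented}, extraction: {n_extracted}")
print(f"difference: {difference} cells ({relative:.2f}% of the segmentation count)")
```

With the numbers from this run, extraction lists 419 more cells than the segmentation step reports.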

To Reproduce

  • OS: macOS

Test case 1: scportrait 1.4.0
Test case 2: scportrait 1.3.5

Installed packages in the environment (the same for both test cases, with either scportrait==1.4.0 or scportrait==1.3.5):
aiobotocore==2.24.1
aiohappyeyeballs==2.6.1
aiohttp==3.12.15
aioitertools==0.12.0
aiosignal==1.4.0
alabaster==1.0.0
alphabase==1.6.2
anndata==0.11.4
annotated-types==0.7.0
app-model==0.4.0
appdirs==1.4.4
appnope @ file:///home/conda/feedstock_root/build_artifacts/appnope_1733332318622/work
array-api-compat==1.12.0
asciitree==0.3.3
asttokens @ file:///home/conda/feedstock_root/build_artifacts/asttokens_1733250440834/work
attrs==25.3.0
babel==2.17.0
biopython==1.85
botocore==1.39.11
build==1.3.0
cachey==0.2.1
cellpose==3.1.1.2
certifi==2025.8.3
charset-normalizer==3.4.3
click==8.2.1
cloudpickle==3.1.1
cmake==4.1.0
colorcet==3.1.0
comm @ file:///home/conda/feedstock_root/build_artifacts/bld/rattler-build_comm_1753453984/work
contextlib2==21.6.0
contourpy==1.3.3
cycler==0.12.1
dask==2024.11.2
dask-expr==1.1.19
dask-image==2024.5.3
datashader==0.18.2
debugpy @ file:///Users/runner/miniforge3/conda-bld/bld/rattler-build_debugpy_1754523486/work
decorator @ file:///home/conda/feedstock_root/build_artifacts/decorator_1740384970518/work
Deprecated==1.2.18
docstring_parser==0.17.0
docutils==0.21.2
dvp-io==0.3.0
exceptiongroup @ file:///home/conda/feedstock_root/build_artifacts/exceptiongroup_1746947292760/work
executing @ file:///home/conda/feedstock_root/build_artifacts/executing_1745502089858/work
fasteners==0.20
fastremap==1.17.2
filelock==3.19.1
fill_voids==2.1.0
flexcache==0.3
flexparser==0.4
fonttools==4.59.1
freetype-py==2.5.1
frozenlist==1.7.0
fsspec==2025.7.0
geopandas==1.1.1
h5py==3.14.0
HeapDict==1.0.1
hf-xet==1.1.8
hilbertcurve==2.0.5
hsluv==5.0.4
huggingface-hub==0.34.4
idna==3.10
imagecodecs==2025.8.2
imageio==2.37.0
imagesize==1.4.1
importlib_metadata @ file:///home/conda/feedstock_root/build_artifacts/bld/rattler-build_importlib-metadata_1747934053/work
in-n-out==0.2.1
ipykernel @ file:///Users/runner/miniforge3/conda-bld/ipykernel_1754352890318/work
ipython @ file:///home/conda/feedstock_root/build_artifacts/bld/rattler-build_ipython_1751465044/work
ipython_pygments_lexers @ file:///home/conda/feedstock_root/build_artifacts/ipython_pygments_lexers_1737123620466/work
jedi @ file:///home/conda/feedstock_root/build_artifacts/jedi_1733300866624/work
Jinja2==3.1.6
jmespath==1.0.1
joblib==1.5.1
jsonschema==4.25.1
jsonschema-specifications==2025.4.1
jupyter_client @ file:///home/conda/feedstock_root/build_artifacts/jupyter_client_1733440914442/work
jupyter_core @ file:///home/conda/feedstock_root/build_artifacts/jupyter_core_1748333051527/work
kiwisolver==1.4.9
lazy_loader==0.4
legacy-api-wrap==1.4.1
lightning-utilities==0.15.2
llvmlite==0.44.0
locket==1.0.0
loguru==0.7.3
lxml==6.0.0
magicgui==0.10.1
mahotas==1.4.18
markdown-it-py==4.0.0
MarkupSafe==3.0.2
matplotlib==3.10.5
matplotlib-inline @ file:///home/conda/feedstock_root/build_artifacts/matplotlib-inline_1733416936468/work
matplotlib-scalebar==0.9.0
mdurl==0.1.2
mpmath==1.3.0
multidict==6.6.4
multipledispatch==1.0.0
multiscale_spatial_image==2.0.3
napari==0.6.4
napari-console==0.1.3
napari-matplotlib==3.0.0
napari-plugin-engine==0.2.0
napari-spatialdata==0.5.7
napari-svg==0.2.1
natsort==8.4.0
nest_asyncio @ file:///home/conda/feedstock_root/build_artifacts/nest-asyncio_1733325553580/work
networkx==3.5
npe2==0.7.9
numba==0.61.2
numcodecs==0.15.1
numpy==2.0.2
numpydoc==1.9.0
ome-zarr==0.11.1
opencv-python-headless==4.12.0.88
openslide-bin==4.0.0.8
openslide-python==1.4.2
packaging @ file:///home/conda/feedstock_root/build_artifacts/bld/rattler-build_packaging_1745345660/work
pandas==2.3.1
param==2.2.1
parso @ file:///home/conda/feedstock_root/build_artifacts/parso_1733271261340/work
partd==1.4.2
patsy==1.0.1
pexpect @ file:///home/conda/feedstock_root/build_artifacts/pexpect_1733301927746/work
pickleshare @ file:///home/conda/feedstock_root/build_artifacts/pickleshare_1733327343728/work
pillow==11.3.0
PIMS==0.7
Pint==0.25
platformdirs @ file:///home/conda/feedstock_root/build_artifacts/bld/rattler-build_platformdirs_1746710438/work
pooch==1.8.2
prompt_toolkit @ file:///home/conda/feedstock_root/build_artifacts/prompt-toolkit_1744724089886/work
propcache==0.3.2
psutil @ file:///Users/runner/miniforge3/conda-bld/psutil_1740663164053/work
psygnal==0.14.1
ptyprocess @ file:///home/conda/feedstock_root/build_artifacts/ptyprocess_1733302279685/work/dist/ptyprocess-0.7.0-py2.py3-none-any.whl#sha256=92c32ff62b5fd8cf325bec5ab90d7be3d2a8ca8c8a3813ff487a8d2002630d1f
pure_eval @ file:///home/conda/feedstock_root/build_artifacts/pure_eval_1733569405015/work
py-lmd==1.3.2
pyahocorasick==2.2.0
pyarrow==21.0.0
pyconify==0.2.1
pyct==0.5.0
pydantic==2.11.7
pydantic-compat==0.1.2
pydantic_core==2.33.2
Pygments @ file:///home/conda/feedstock_root/build_artifacts/pygments_1750615794071/work
pylibCZIrw==5.0.0
pynndescent==0.5.13
pyogrio==0.11.1
PyOpenGL==3.1.10
pyparsing==3.2.3
pyproj==3.7.2
pyproject_hooks==1.2.0
PyQt5==5.15.11
PyQt5-Qt5==5.15.17
PyQt5_sip==12.17.0
pyqtgraph==0.13.7
pyteomics==4.7.5
python-dateutil @ file:///home/conda/feedstock_root/build_artifacts/bld/rattler-build_python-dateutil_1751104122/work
pytorch-lightning==2.5.3
pytz==2025.2
PyYAML==6.0.2
pyzmq @ file:///Users/runner/miniforge3/conda-bld/pyzmq_1754238162672/work
qtconsole==5.6.1
QtPy==2.4.3
rdkit==2025.3.5
rdp==0.8
referencing==0.36.2
regex==2025.7.34
requests==2.32.5
rich==14.1.0
roifile==2025.5.10
roman-numerals-py==3.1.0
rpds-py==0.27.0
s3fs==2025.7.0
scanpy==1.11.4
scikit-fmm==2025.6.23
scikit-image==0.25.2
scikit-learn==1.7.1
scipy==1.16.1
scportrait==1.4.0
seaborn==0.13.2
session-info2==0.2
shapely==2.1.1
shellingham==1.5.4
six @ file:///home/conda/feedstock_root/build_artifacts/bld/rattler-build_six_1753199211/work
slicerator==1.1.0
snowballstemmer==3.0.1
spatial_image==1.2.3
spatialdata==0.5.0
spatialdata-plot==0.2.11
Sphinx==8.2.3
sphinxcontrib-applehelp==2.0.0
sphinxcontrib-devhelp==2.0.0
sphinxcontrib-htmlhelp==2.1.0
sphinxcontrib-jsmath==1.0.1
sphinxcontrib-qthelp==2.0.0
sphinxcontrib-serializinghtml==2.0.0
stack_data @ file:///home/conda/feedstock_root/build_artifacts/stack_data_1733569443808/work
statsmodels==0.14.5
superqt==0.7.6
svgelements==1.9.6
sympy==1.14.0
threadpoolctl==3.6.0
tifffile==2025.6.11
tinycss2==1.4.0
tokenizers==0.13.3
tomli_w==1.2.0
toolz==1.0.0
torch==2.8.0
torchmetrics==1.8.1
torchvision==0.23.0
tornado @ file:///Users/runner/miniforge3/conda-bld/tornado_1754732056163/work
tqdm==4.67.1
traitlets @ file:///home/conda/feedstock_root/build_artifacts/traitlets_1733367359838/work
transformers==4.26.0
typer==0.16.1
typing-inspection==0.4.1
typing_extensions @ file:///home/conda/feedstock_root/build_artifacts/bld/rattler-build_typing_extensions_1751643513/work
tzdata==2025.2
umap-learn==0.5.9.post2
urllib3==2.5.0
validators==0.35.0
vispy==0.15.2
wcwidth @ file:///home/conda/feedstock_root/build_artifacts/wcwidth_1733231326287/work
webencodings==0.5.1
wrapt==1.17.3
xarray==2025.8.0
xarray-dataclass==3.0.0
xarray-datatree==0.0.14
xarray-schema==0.0.3
xarray-spatial==0.4.0
xmltodict==0.14.2
xxhash==3.5.0
yarl==1.20.1
zarr==2.18.7

Expected behavior

  1. The same number of cells in cell segmentation and cell extraction.
  2. A cell count similar to scportrait 1.3.5 (about 34000 cells).
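
One way to narrow down the first point would be to compare the IDs saved in classes.csv with the labels actually present in the seg_all_cytosol mask. The sketch below only shows the comparison itself on toy data. It assumes that classes.csv holds one cell ID per row and that label 0 is background; both are assumptions that have not been checked against scportrait. The function name `compare_cell_ids` is hypothetical, and loading the real file and mask is left out.

```python
import numpy as np


def compare_cell_ids(class_ids, label_mask, background=0):
    """Return the IDs that appear in only one of the two sources.

    class_ids:  iterable of cell IDs, e.g. the rows of classes.csv (assumed format)
    label_mask: integer array in which each pixel holds a cell ID
    background: label value treated as "no cell" (assumed to be 0)
    """
    listed = {int(i) for i in class_ids}
    in_mask = {int(i) for i in np.unique(label_mask)} - {background}
    return {
        "only_in_classes": sorted(listed - in_mask),
        "only_in_mask": sorted(in_mask - listed),
    }


# Toy data, not taken from the project: ID 4 is in the mask but not in the list.
toy_mask = np.array([[0, 1, 1], [2, 0, 4], [3, 3, 4]])
toy_class_ids = [1, 2, 3]

result = compare_cell_ids(toy_class_ids, toy_mask)
print(result)
```

If the real data behaved like this toy case, the IDs under "only_in_mask" would be the candidates for the extra cells seen during extraction.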
