Contents:
3. Downscaling and preparing images
4. Generating single-cell data with CellProfiler
In the conda prompt, activate the wholetumorseg environment
conda activate wholetumorseg
Make sure you are still in the WholeTumorSeg directory C:\YOUR\PATH_TO\WholeTumorSeg
Launch Jupyter Lab with
jupyter lab
Open the pipeline folder and run 00_download_example_data.ipynb
Downloading files...
File downloaded successfully: ../raw_data\oscc_data.zip
Extracting files...
Files extracted to: ../raw_data
Open the 01_crete_image_tiles.ipynb
We recommend starting with the provided example data. You can later change the parameters to fit your own data. Make sure to include all the necessary files with the correct extensions, and use the same file naming structure as in the example data. Each sample should be located in a separate folder.
Run the code from all cells
Crops saved successfully!
Overview with grid saved successfully!
Crops saved successfully!
Overview with grid saved successfully!
All done!
Tiles and the overview image with grid are saved in the provided output folder
WholeTumorSeg/analysis/example
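The tile names used throughout this guide (for example patient_1_A1) combine the sample name with a grid position. The sketch below shows how such a grid could be laid out, assuming rows are lettered and columns are numbered, as the example tile IDs suggest. The function name, the tile size and the row-major ordering are hypothetical and are not taken from the notebook.

```python
from string import ascii_uppercase


def tile_grid(width, height, tile_size, sample):
    """Return a list of (tile_name, (left, upper, right, lower)) crop boxes.

    Assumption: rows are labeled A, B, C, ... and columns 1, 2, 3, ...
    Tiles on the right and bottom edges are clipped to the image size.
    """
    n_rows = -(-height // tile_size)  # ceiling division
    n_cols = -(-width // tile_size)
    if n_rows > len(ascii_uppercase):
        raise ValueError("more rows than single-letter labels")

    tiles = []
    for r in range(n_rows):
        for c in range(n_cols):
            left, upper = c * tile_size, r * tile_size
            box = (left, upper,
                   min(left + tile_size, width),
                   min(upper + tile_size, height))
            tiles.append((f"{sample}_{ascii_uppercase[r]}{c + 1}", box))
    return tiles


if __name__ == "__main__":
    # Hypothetical 100 x 50 pixel image cut into 40-pixel tiles.
    for name, box in tile_grid(100, 50, 40, "patient_1"):
        print(name, box)
```

The overview image with the grid lets you match each tile name to its position in the tissue.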
Open a new Anaconda prompt and change directory to WholeTumorSeg
cd WholeTumorSeg
Install Cellpose with conda
Skip this step if you already have a Cellpose environment installed. You can also delete the existing environment with conda remove --name cellpose --all and redo the installation.
Install Cellpose by creating a new environment
conda create --name cellpose python=3.10
conda activate cellpose
python -m pip install cellpose[gui]
Start Cellpose
Activate cellpose conda environment
conda activate cellpose
Run Cellpose segmentation
For best results, you will have to train a new model on your own images. See the Cellpose documentation for how to train a model for your data.
For sample 2, change this to set SAMPLE=patient_2
set SAMPLE=patient_1
python -m cellpose --verbose --dir ./analysis/example/%SAMPLE%/tiles_2x/ --pretrained_model cellpose_OSCCIF --chan 2 --chan2 3 --diameter 25 --save_png --dir_above --no_npy --savedir ./analysis/example/%SAMPLE%/masks_2x/
--dir = path to tiled images
--pretrained_model = path to trained cellpose model
--chan = channel used for nuclear segmentation
--chan2 = channel used for cytoplasm segmentation
--diameter = estimated cell diameter
--save_png = saves masks as .png files
--dir_above = save the output folder next to the image folder instead of inside it
--no_npy = do not save .npy files (we won't need them)
--savedir = directory for the masks
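If you prefer not to edit SAMPLE by hand for every sample, the same call can be assembled in a loop. This is a minimal sketch that reuses only the flags from the command above. The helper names and the dry_run switch are hypothetical, and it assumes you run it from the WholeTumorSeg directory inside the cellpose environment.

```python
import subprocess
import sys
from pathlib import Path


def build_cellpose_command(sample, base_dir="./analysis/example",
                           model="cellpose_OSCCIF"):
    """Return the Cellpose command-line call for one sample as a list."""
    sample_dir = Path(base_dir) / sample
    return [
        sys.executable, "-m", "cellpose",
        "--verbose",
        "--dir", str(sample_dir / "tiles_2x"),
        "--pretrained_model", model,
        "--chan", "2",
        "--chan2", "3",
        "--diameter", "25",
        "--save_png",
        "--dir_above",
        "--no_npy",
        "--savedir", str(sample_dir / "masks_2x"),
    ]


def run_all(samples, dry_run=True):
    """Print each command; run it only when dry_run is False."""
    commands = [build_cellpose_command(sample) for sample in samples]
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_all(["patient_1", "patient_2"], dry_run=True)
```

With dry_run=True the commands are only printed, so you can check the paths before starting a long segmentation run.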
You may see some error messages since some image tiles are empty, but the process should still run normally.
2025-03-05 13:23:06,337 [INFO] 11%|#1 | 2/18 [00:58<07:33, 28.36s/it]
2025-03-05 13:23:28,936 [WARNING] no masks found, will not save PNG or outlines
2025-03-05 13:23:28,937 [INFO] 17%|#6 | 3/18 [01:20<06:25, 25.73s/it]
2025-03-05 13:24:02,069 [INFO] 22%|##2 | 4/18 [01:54<06:41, 28.65s/it]
2025-03-05 13:24:36,362 [INFO] 28%|##7 | 5/18 [02:28<06:38, 30.69s/it]
2025-03-05 13:25:01,245 [INFO] 33%|###3 | 6/18 [02:53<05:44, 28.71s/it]
Masks are saved in the masks_2x folder. After segmentation is finished, you can close the Cellpose conda prompt with exit
Open Jupyter Lab. Open the pipeline folder and 03_resize_and_separate.ipynb
Run the code from all cells
All images resized, separated by channel, and saved successfully!
Deleting folder: ../analysis/example/patient_1/tiles\patient_1_A3
Deleting folder: ../analysis/example/patient_1/tiles\patient_1_A4
Deleting folder: ../analysis/example/patient_1/tiles\patient_1_B4
Excess tiles removed!
All images resized, separated by channel, and saved successfully!
Deleting folder: ../analysis/example/patient_2/tiles\patient_2_F1
Deleting folder: ../analysis/example/patient_2/tiles\patient_2_F4
Excess tiles removed!
All done
Tiles are now downscaled and separated into single-channel images in the tiles folder. All empty tiles have been deleted.
Install CellProfiler if not already installed.
Create a new folder WholeTumorSeg/rdata/example for data output and launch CellProfiler.exe
Open Preferences and change the default output folder to the new folder WholeTumorSeg/rdata/example
Import the pipeline 04_measure.cppipe into CellProfiler: File -> Import -> import pipeline from WholeTumorSeg/pipeline/04_measure.cppipe
Load data
- Drag and drop the processed data folder analysis/example/ onto Images in CellProfiler
Define channels and measurements
If you are using your own data:
NamesAndTypes -> adjust the channels and names accordingly.
ResizeObjects -> choose your nuclear channel in the last step.
FilterObjects -> choose the minimum value for max radius to filter out very small cells (2.0 by default).
MeasureObjectNeighbors -> select the distance that is used to detect neighboring cells (12 by default).
MeasureObjectIntensity and MeasureObjectIntensityDistribution -> select the channels you want to measure.
Calculate intensity Zernikes -> Disable to shorten the run time.
GrayToColor -> adjust the channels and names accordingly.
Run the pipeline
Press Analyze Images to start the pipeline. The full_stacks and masks folders are created in the output folder.
After the measurement is finished, you should have the following data in the output folder:
full_stacks
masks
WholeTumor_Cells.csv
WholeTumor_Image.csv
WholeTumor_Object_relationships.csv
If you are analysing very large samples, we recommend running CellProfiler separately for each patient (sample). In this case, drag in the analysis data from one sample at a time.
Jupyter Lab
Open the pipeline folder and 05_line_annotations.ipynb
input_folder is the folder with raw unprocessed data including the patient_1_annotations.png
The output folder should be the same folder where you saved the CellProfiler output data.
Guide for making custom line annotations (see the example data):
- Annotations should be drawn on the overview image.
- Use red (rgb: 255,0,0) for structure A (e.g. tumor front); this is labeled as redline.
- Use blue (rgb: 0,0,255) for structure B (e.g. healthy epithelium); this is labeled as blueline.
- Export annotations in 8-bit .tif format.
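To illustrate why the exact colors matter, here is a minimal sketch of how pure red and pure blue pixels can be separated into two line types. It assumes the annotation image has already been read into rows of (R, G, B) tuples. The function name and the in-memory format are hypothetical, and the notebook's own implementation may differ.

```python
RED = (255, 0, 0)
BLUE = (0, 0, 255)
WHITE = (255, 255, 255)


def extract_line_pixels(image):
    """Return (y, x, line_type) for every pure red or pure blue pixel.

    `image` is a list of rows; each row is a list of (R, G, B) tuples.
    Pixels of any other color are ignored.
    """
    labels = {RED: "red", BLUE: "blue"}
    rows = []
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            line_type = labels.get(tuple(pixel[:3]))
            if line_type is not None:
                rows.append((y, x, line_type))
    return rows


# Tiny 2 x 3 example image: one red pixel, one blue pixel, the rest white.
example = [
    [WHITE, RED, WHITE],
    [BLUE, WHITE, WHITE],
]
print(extract_line_pixels(example))  # [(0, 1, 'red'), (1, 0, 'blue')]
```

Because this kind of matching is exact, anti-aliased brushes or lossy image compression would shift the pixel values and the lines would be missed, so draw with hard-edged pure colors.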
Run the code
Combined scaling factor for x: 5
Combined scaling factor for y: 5
Dimensions saved to ../rdata/example/patient_1/WholeTumor_Dimensions.csv
Line coordinates saved to ../rdata/example/patient_1/WholeTumor_Annotations.csv
Combined scaling factor for x: 5
Combined scaling factor for y: 5
Dimensions saved to ../rdata/example/patient_2/WholeTumor_Dimensions.csv
Line coordinates saved to ../rdata/example/patient_2/WholeTumor_Annotations.csv
All done!
The combined scaling factor is 5 because the overview image is 1/10 the size of the original image and the tiles are downscaled 2x, so the overview is 1/5 the size of the tiles (1/10 : 1/2 = 1/5).
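The same arithmetic as a short sketch. The function names are hypothetical, and it assumes the factor is used to map coordinates drawn on the overview image onto the 2x downscaled tile coordinates.

```python
from fractions import Fraction


def combined_scaling_factor(overview_scale, tile_scale):
    """Factor that maps overview-image coordinates onto tile coordinates."""
    return tile_scale / overview_scale


def scale_point(x, y, factor):
    """Scale one (x, y) coordinate by the combined factor."""
    return x * factor, y * factor


overview_scale = Fraction(1, 10)  # overview is 1/10 of the original image
tile_scale = Fraction(1, 2)       # tiles are downscaled 2x

factor = combined_scaling_factor(overview_scale, tile_scale)
print(factor)  # 5
```

If your overview image or tiles use a different downscaling, the factor changes accordingly. For example, a 1/20 overview with 2x downscaled tiles gives a factor of 10.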
All the data needed to build a spatial single-cell dataset in R has now been generated:
full_stacks
masks
WholeTumor_Annotations.csv
WholeTumor_Cells.csv
WholeTumor_Dimensions.csv
WholeTumor_Image.csv
WholeTumor_Object_relationships.csv
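Before moving on to R, you may want to confirm that every sample folder contains the items listed above. The sketch below is a small optional check, not part of the pipeline. It assumes each sample has its own folder under rdata/example (as in the example data), and the function name is hypothetical.

```python
from pathlib import Path

EXPECTED_FOLDERS = ["full_stacks", "masks"]
EXPECTED_FILES = [
    "WholeTumor_Annotations.csv",
    "WholeTumor_Cells.csv",
    "WholeTumor_Dimensions.csv",
    "WholeTumor_Image.csv",
    "WholeTumor_Object_relationships.csv",
]


def missing_outputs(sample_dir):
    """Return the names of expected folders and files that are absent."""
    sample_dir = Path(sample_dir)
    missing = [n for n in EXPECTED_FOLDERS if not (sample_dir / n).is_dir()]
    missing += [n for n in EXPECTED_FILES if not (sample_dir / n).is_file()]
    return missing


if __name__ == "__main__":
    # Run from the WholeTumorSeg directory.
    for sample in ["patient_1", "patient_2"]:
        gaps = missing_outputs(Path("rdata/example") / sample)
        print(sample, "OK" if not gaps else f"missing: {gaps}")
```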
Install R and RStudio
Start RStudio -> File -> New Project... -> Existing Directory
Choose the folder with all output data WholeTumorSeg/rdata/example
Install dplyr, SpatialExperiment, stringr and cytomapper packages if they are not already installed.
Load packages and tools
file should point to RforWholeTumor.R in the WholeTumorSeg main directory.
# Load libraries
library(dplyr)
library(SpatialExperiment)
library(stringr)
# Load functions from RforWholeTumor.R
source(file = "../../RforWholeTumor.R")
SpatialExperiment object
Use the readWholeTdata function to generate a SpatialExperiment object
path should point to the folder with all CellProfiler .csv data. In this example: /patient_1/
markers_order should contain all CellProfiler channels from first to last.
# Load data to create spatial experiment
spe1 <- readWholeTdata(path = "patient_1/", markers_order = c("PROX1", "DAPI", "CK", "KI67"))
# inspect rowdata
rowData(spe1)
# inspect coldata
colData(spe1) %>% as_tibble() %>% head()
# inspect metadata
metadata(spe1)$WholeTissueCoords %>% head() # scaled coordinates
metadata(spe1)$LineAnnotations %>% as_tibble() %>% head() # annotations
metadata(spe1)$OriginalImageDimensions %>% as_tibble() %>% head() # original image dimensions
Output:
DataFrame with 4 rows and 2 columns
marker_name channel
<character> <character>
PROX1 PROX1 ch01
DAPI DAPI ch02
CK CK ch03
KI67 KI67 ch04
# A tibble: 6 × 9
ObjectNumber cell_id sample_id image image_id tile_id tile_rows tile_columns area
<int> <chr> <chr> <int> <chr> <chr> <chr> <chr> <int>
1 1 1_1 patient_1 1 patient_1_A1 A1 A 1 83
2 2 1_2 patient_1 1 patient_1_A1 A1 A 1 256
3 3 1_3 patient_1 1 patient_1_A1 A1 A 1 283
4 4 1_4 patient_1 1 patient_1_A1 A1 A 1 184
5 5 1_5 patient_1 1 patient_1_A1 A1 A 1 150
6 6 1_6 patient_1 1 patient_1_A1 A1 A 1 97
x y
[1,] 157.7349 2.867470
[2,] 186.1133 7.371094
[3,] 317.8657 7.431095
[4,] 354.1685 9.657609
[5,] 389.8867 4.620000
[6,] 412.4639 5.463918
# A tibble: 6 × 3
y x line_type
<int> <int> <chr>
1 1927 1832 red
2 1927 1833 red
3 1927 1834 red
4 1927 1835 red
5 1927 1836 red
6 1927 1837 red
# A tibble: 2 × 3
Img Width Height
<chr> <int> <int>
1 full_size 5876 9223
2 half_size 2938 4611
Load images and masks with cytomapper
We use cytomapper to load and visualize images with segmentation masks.
# Load cytomapper library
library(cytomapper)
# read in images and masks
images1 <- loadImages("patient_1/full_stacks/", pattern = "tiff")
masks1 <- loadImages("patient_1/masks/", pattern = "tiff", as.is = TRUE)
# Insert channel names to images
channelNames(images1) <- rownames(spe1)
# clean empty images and create mcols with cleanImages function
images1 <- cleanImages(images1, spe1)
masks1 <- cleanImages(masks1, spe1)
Let's inspect the segmentation masks with plotPixels
# Tiles to plot
to_plot_1 <- c("patient_1_A1")
# Tiles to plot
to_plot_2 <- c("patient_1_A1","patient_1_A2","patient_1_B1","patient_1_B2")
# Plot 1 tile with masks
plotPixels(spe1,
image = images1[names(images1) %in% to_plot_1],
mask = masks1[names(masks1) %in% to_plot_1],
colour_by = c("DAPI","CK"),
colour = list("DAPI" = c("black", "blue"),"CK" = c("black","green")),
bcg = list("DAPI" = c(0, 6, 1), "CK" = c(0, 6, 1)),
img_id = "image_id",
cell_id = "ObjectNumber",
image_title = NULL,
legend = NULL,
thick = TRUE)
# Plot 4 tiles with masks
plotPixels(spe1,
image = images1[names(images1) %in% to_plot_2],
mask = masks1[names(masks1) %in% to_plot_2],
colour_by = c("DAPI","CK"),
colour = list("DAPI" = c("black", "blue"),"CK" = c("black","green")),
bcg = list("DAPI" = c(0, 6, 1), "CK" = c(0, 6, 1)),
img_id = "image_id",
cell_id = "ObjectNumber",
image_title = NULL,
legend = NULL,
thick = TRUE)
Load sample 2 and merge data
The mergeSpe function also combines metadata with unique identifiers
# Load all data to create spatial experiment
spe2 <- readWholeTdata(path = "patient_2/", markers_order = c("PROX1", "DAPI", "CK", "KI67"))
# Combine samples with mergeSpe
spe_merged <- mergeSpe(spe1 = spe1, spe2_list = spe2)
Load additional libraries
# Load libraries
# Visualization and celltypes
library(tidyr)
library(ggplot2)
library(RColorBrewer)
library(ggpubr)
Here we use the fromAssay function to get marker expression data and perform k-means clustering. We visualize the clusters with vlnPlot.
# Simple clustering with kmeans
use_markers <- c("CK", "PROX1")
set.seed(1306)
spe_merged <- fromAssay(spe = spe_merged,
markers = use_markers,
assay = "counts",
kmeans_centers = 4,
nstart = 50)
# kmeans visualization
vlnPlot(spe = spe_merged,
markers = use_markers,
cluster_col = "kmeans",
show.legend = FALSE)
Label cell types based on these clusters.
# assigning cell types
spe_merged$celltype <- NA
spe_merged$celltype[spe_merged$kmeans == "1"] <- "LEC"
spe_merged$celltype[spe_merged$kmeans %in% c("2","4")] <- "CK+"
spe_merged$celltype[is.na(spe_merged$celltype)] <- "other"
Visualize cell types at whole-tumor scale using wholeTplot.
# Whole tumor visualization
color_vector <- c("CK+" = "darkgreen", "LEC" = "black", "other" = "gray40")
size_vector <- c("CK+" = 1, "LEC" = 2, "other" = 0.001)
alpha_vector <- c("CK+" = 0.7, "LEC" = 0.9, "other" = 0.1)
lst1 <- wholeTplot(spe = spe_merged,
annotation_metadata = "celltype",
colors = color_vector,
size = size_vector,
alpha = alpha_vector)
lst2 <- wholeTplot(spe = spe_merged,
annotation_metadata = "celltype",
colors = color_vector,
size = size_vector,
alpha = alpha_vector,
draw_line_annotations = TRUE)
ggarrange(plotlist = lst1, align = "hv", common.legend = TRUE, labels = "AUTO")
ggarrange(plotlist = lst2, align = "hv", common.legend = TRUE, labels = "AUTO")
There are more LECs in sample B, but the LECs in sample A form more distinct chain-like structures.
Here we have added our annotation lines to the plot; they highlight the borders of different tissue margins.
More analysis examples will be added later!




