
Commit 1a6dc5e

Start preparing release for new models (#1061)
* Start preparing release for new models
* Add dataset and bioimageio docs and bump model version
* Bump diplomatic-bug and noisy-ox checksums
* Bump idealistic-rat and humorous-crab checksums
* Bump faithful-chicken and greedy-whale checksums
* Bump diplomatic-bug model version
* Bump more models
* Bump all models and add model download in info cli
* Update model download function
* Revert vit_t models to fix CI
* Update doc/bioimageio/em_organelles_v4.md

Co-authored-by: Constantin Pape <[email protected]>
1 parent 87ce846 commit 1a6dc5e

8 files changed (+222 −53 lines)

doc/bioimageio/em_organelles_v4.md

Lines changed: 19 additions & 0 deletions

@@ -0,0 +1,19 @@
+# Segment Anything for Electron Microscopy
+
+This is a [Segment Anything](https://segment-anything.com/) model that was specialized for segmenting mitochondria and nuclei in electron microscopy with [micro_sam](https://github.com/computational-cell-analytics/micro-sam).
+This model uses a %s vision transformer as image encoder.
+
+Segment Anything is a model for interactive and automatic instance segmentation.
+We improve it for electron microscopy by finetuning on a large and diverse microscopy dataset.
+It should perform well for segmenting mitochondria and nuclei in electron microscopy. It can also work well for other organelles, but was not explicitly trained for this purpose. You may get better results for other organelles (e.g. ER or Golgi) with the default Segment Anything models.
+
+See [the dataset overview](https://github.com/computational-cell-analytics/micro-sam/blob/master/doc/datasets/em_organelles_v%i.md) for further information on the training data and the [micro_sam documentation](https://computational-cell-analytics.github.io/micro-sam/micro_sam.html) for details on how to use the model for interactive and automatic segmentation.
+
+NOTE: The model's automatic instance segmentation quality has improved, as the latest version updates the segmentation decoder architecture by replacing transposed convolutions with upsampling.
+
+
+## Validation
+
+The easiest way to validate the model is to visually check the segmentation quality for your data.
+If you have annotations you can use for validation, you can also validate quantitatively; see [here for details](https://computational-cell-analytics.github.io/micro-sam/micro_sam.html#9-how-can-i-evaluate-a-model-i-have-finetuned).
+Please note that the required segmentation quality always depends on the analysis task you want to solve.
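The quantitative validation the new doc refers to is done with the micro_sam evaluation tooling; a minimal stand-alone sketch of the idea in plain Python (an illustrative IoU score, not the micro_sam API; `iou` is a hypothetical helper):

```python
def iou(pred, gt):
    """Intersection over union of two binary masks, given as flat 0/1 lists."""
    inter = sum(1 for p, g in zip(pred, gt) if p and g)
    union = sum(1 for p, g in zip(pred, gt) if p or g)
    # Two empty masks agree perfectly by convention.
    return inter / union if union else 1.0

# Toy 1D masks standing in for flattened segmentation masks.
prediction   = [0, 1, 1, 1, 0, 0]
ground_truth = [0, 1, 1, 0, 0, 0]
print(iou(prediction, ground_truth))  # 2 overlapping / 3 in union ≈ 0.667
```

Scores like this only become meaningful relative to the analysis task, which is exactly the doc's closing caveat.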

doc/bioimageio/lm_v4.md

Lines changed: 19 additions & 0 deletions

@@ -0,0 +1,19 @@
+# Segment Anything for Light Microscopy
+
+This is a [Segment Anything](https://segment-anything.com/) model that was specialized for light microscopy with [micro_sam](https://github.com/computational-cell-analytics/micro-sam).
+This model uses a %s vision transformer as image encoder.
+
+Segment Anything is a model for interactive and automatic instance segmentation.
+We improve it for light microscopy by finetuning on a large and diverse microscopy dataset.
+It should perform well for cell and nucleus segmentation in fluorescence, label-free and other light microscopy datasets.
+
+See [the dataset overview](https://github.com/computational-cell-analytics/micro-sam/blob/master/doc/datasets/lm_v%i.md) for further information on the training data and the [micro_sam documentation](https://computational-cell-analytics.github.io/micro-sam/micro_sam.html) for details on how to use the model for interactive and automatic segmentation.
+
+NOTE: The model's automatic instance segmentation quality has improved, as the latest version updates the segmentation decoder architecture by replacing transposed convolutions with upsampling.
+
+
+## Validation
+
+The easiest way to validate the model is to visually check the segmentation quality for your data.
+If you have annotations you can use for validation, you can also validate quantitatively; see [here for details](https://computational-cell-analytics.github.io/micro-sam/micro_sam.html#9-how-can-i-evaluate-a-model-i-have-finetuned).
+Please note that the required segmentation quality always depends on the analysis task you want to solve.
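The `%s` and `%i` placeholders in these doc templates are presumably filled in at export time (the diff below touches `create_doc` in `scripts/model_export/export_models.py`). Schematically, with a shortened stand-in template and illustrative values:

```python
# Shortened stand-in for the real doc template above; values are illustrative.
template = (
    "This model uses a %s vision transformer as image encoder.\n"
    "See the dataset overview in doc/datasets/lm_v%i.md."
)

encoder_size = "base"  # hypothetical: derived from the model type, e.g. vit_b
version = 4            # the generalist model generation

doc = template % (encoder_size, version)
print(doc)
```

`%i` is a standard printf-style integer conversion in Python, so a single `%` formatting pass fills both placeholders.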

doc/datasets/em_organelles_v4.md

Lines changed: 7 additions & 0 deletions

@@ -0,0 +1,7 @@
+# Electron Microscopy Datasets
+
+The `EM Organelle v4` model was trained on three different electron microscopy datasets with segmentation annotations for mitochondria and nuclei:
+
+1. [MitoEM](https://mitoem.grand-challenge.org/): containing segmentation annotations for mitochondria in volume EM of human and rat cortex.
+2. [MitoLab](https://www.ebi.ac.uk/empiar/EMPIAR-11037/): containing segmentation annotations for mitochondria in different EM modalities.
+3. [Platynereis (Nuclei)](https://zenodo.org/records/3675220): containing segmentation annotations for nuclei in a blockface EM volume of *P. dumerilii*.

doc/datasets/lm_v4.md

Lines changed: 18 additions & 0 deletions

@@ -0,0 +1,18 @@
+# Light Microscopy Datasets
+
+The `LM Generalist v4` model was trained on 14 different light microscopy datasets with segmentation annotations for cells and nuclei:
+
+1. [LIVECell](https://sartorius-research.github.io/LIVECell/): containing cell segmentation annotations for phase-contrast microscopy.
+2. [DeepBacs](https://github.com/HenriquesLab/DeepBacs): containing segmentation annotations for bacterial cells in different label-free microscopy modalities.
+3. [TissueNet](https://datasets.deepcell.org/): containing cell segmentation annotations in tissues imaged with fluorescence light microscopy.
+4. [PlantSeg (Root)](https://osf.io/2rszy/): containing cell segmentation annotations in plant roots imaged with fluorescence lightsheet microscopy.
+5. [NeurIPS CellSeg](https://neurips22-cellseg.grand-challenge.org/): containing cell segmentation annotations in phase-contrast, brightfield, DIC and fluorescence microscopy.
+6. [CTC (Cell Tracking Challenge)](https://celltrackingchallenge.net/2d-datasets/): containing cell segmentation annotations in different label-free and fluorescence microscopy settings. We make use of the following CTC datasets: `BF-C2DL-HSC`, `BF-C2DL-MuSC`, `DIC-C2DH-HeLa`, `Fluo-C2DL-Huh7`, `Fluo-C2DL-MSC`, `Fluo-N2DH-SIM+`, `PhC-C2DH-U373`, `PhC-C2DL-PSC`.
+7. [DSB Nucleus Segmentation](https://www.kaggle.com/c/data-science-bowl-2018): containing nucleus segmentation annotations in fluorescence microscopy. We make use of [this subset](https://github.com/stardist/stardist/releases/download/0.1.0/dsb2018.zip) of the data.
+8. [EmbedSeg](https://github.com/juglab/EmbedSeg): containing cell and nucleus annotations in fluorescence microscopy.
+9. [YeaZ](https://www.epfl.ch/labs/lpbs/data-and-software): containing segmentation annotations for yeast cells in phase-contrast and brightfield microscopy.
+10. [CVZ Fluo](https://www.synapse.org/Synapse:syn27624812/): containing cell and nucleus annotations in fluorescence microscopy.
+11. [DynamicNuclearNet](https://datasets.deepcell.org/): containing nucleus annotations in fluorescence microscopy.
+12. [CellPose](https://www.cellpose.org/): containing cell annotations in fluorescence microscopy.
+13. [OmniPose](https://osf.io/xmury/): containing segmentation annotations for bacterial cells in phase-contrast and fluorescence microscopy, and worms in brightfield microscopy.
+14. [OrgaSegment](https://zenodo.org/records/10278229): containing segmentation annotations for organoids in brightfield microscopy.

micro_sam/util.py

Lines changed: 89 additions & 36 deletions

@@ -105,12 +105,12 @@ def models():
         "vit_t": "xxh128:8eadbc88aeb9d8c7e0b4b60c3db48bd0",
         # The current version of our models in the modelzoo.
         # LM generalist models:
-        "vit_l_lm": "xxh128:fc32ea6f7fcc7eb02737d1304f81f5f2",
-        "vit_b_lm": "xxh128:8fd5806be3c3ba213e19a709d6d1495f",
+        "vit_l_lm": "xxh128:017f20677997d628426dec80a8018f9d",
+        "vit_b_lm": "xxh128:fe9252a29f3f4ea53c15a06de471e186",
         "vit_t_lm": "xxh128:72ec5074774761a6e5c05a08942f981e",
         # EM models:
-        "vit_l_em_organelles": "xxh128:096c9695966803ca6fde24f4c1e3c3fb",
-        "vit_b_em_organelles": "xxh128:f6f6593aeecd0e15a07bdac86360b6cc",
+        "vit_l_em_organelles": "xxh128:810b084b6e51acdbf760a993d8619f2d",
+        "vit_b_em_organelles": "xxh128:f3bf2ed83d691456bae2c3f9a05fb438",
         "vit_t_em_organelles": "xxh128:253474720c497cce605e57c9b1d18fd9",
         # Histopathology models:
         "vit_b_histopathology": "xxh128:ffd1a2cd84570458b257bd95fdd8f974",
@@ -122,12 +122,12 @@ def models():
     # Additional decoders for instance segmentation.
     decoder_registry = {
         # LM generalist models:
-        "vit_l_lm_decoder": "xxh128:779b5a50ecc6d46d495753fba8717f2f",
-        "vit_b_lm_decoder": "xxh128:9f580a96984b3085389ced5d9a4ae75d",
+        "vit_l_lm_decoder": "xxh128:2faeafa03819dfe03e7c46a44aaac64a",
+        "vit_b_lm_decoder": "xxh128:708b15ac620e235f90bb38612c4929ba",
         "vit_t_lm_decoder": "xxh128:3e914a5f397b0312cdd36813031f8823",
         # EM models:
-        "vit_l_em_organelles_decoder": "xxh128:d60fd96bd6060856f6430f29e42568fb",
-        "vit_b_em_organelles_decoder": "xxh128:b2d4dcffb99f76d83497d39ee500088f",
+        "vit_l_em_organelles_decoder": "xxh128:334877640bfdaaabce533e3252a17294",
+        "vit_b_em_organelles_decoder": "xxh128:bb6398956a6b0132c26b631c14f95ce2",
         "vit_t_em_organelles_decoder": "xxh128:8f897c7bb93174a4d1638827c4dd6f44",
         # Histopathology models:
         "vit_b_histopathology_decoder": "xxh128:6a66194dcb6e36199cbee2214ecf7213",
@@ -141,11 +141,11 @@ def models():
         "vit_h": "https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth",
         "vit_b": "https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth",
         "vit_t": "https://owncloud.gwdg.de/index.php/s/TuDzuwVDHd1ZDnQ/download",
-        "vit_l_lm": "https://uk1s3.embassy.ebi.ac.uk/public-datasets/bioimage.io/idealistic-rat/1.1/files/vit_l.pt",
-        "vit_b_lm": "https://uk1s3.embassy.ebi.ac.uk/public-datasets/bioimage.io/diplomatic-bug/1.1/files/vit_b.pt",
+        "vit_l_lm": "https://uk1s3.embassy.ebi.ac.uk/public-datasets/bioimage.io/idealistic-rat/1.2/files/vit_l.pt",
+        "vit_b_lm": "https://uk1s3.embassy.ebi.ac.uk/public-datasets/bioimage.io/diplomatic-bug/1.2/files/vit_b.pt",
         "vit_t_lm": "https://uk1s3.embassy.ebi.ac.uk/public-datasets/bioimage.io/faithful-chicken/1.1/files/vit_t.pt",
-        "vit_l_em_organelles": "https://uk1s3.embassy.ebi.ac.uk/public-datasets/bioimage.io/humorous-crab/1/files/vit_l.pt",  # noqa
-        "vit_b_em_organelles": "https://uk1s3.embassy.ebi.ac.uk/public-datasets/bioimage.io/noisy-ox/1/files/vit_b.pt",
+        "vit_l_em_organelles": "https://uk1s3.embassy.ebi.ac.uk/public-datasets/bioimage.io/humorous-crab/1.2/files/vit_l.pt",  # noqa
+        "vit_b_em_organelles": "https://uk1s3.embassy.ebi.ac.uk/public-datasets/bioimage.io/noisy-ox/1.2/files/vit_b.pt",  # noqa
         "vit_t_em_organelles": "https://uk1s3.embassy.ebi.ac.uk/public-datasets/bioimage.io/greedy-whale/1/files/vit_t.pt",  # noqa
         "vit_b_histopathology": "https://owncloud.gwdg.de/index.php/s/sBB4H8CTmIoBZsQ/download",
         "vit_l_histopathology": "https://owncloud.gwdg.de/index.php/s/IZgnn1cpBq2PHod/download",
@@ -154,11 +154,11 @@ def models():
     }

     decoder_urls = {
-        "vit_l_lm_decoder": "https://uk1s3.embassy.ebi.ac.uk/public-datasets/bioimage.io/idealistic-rat/1.1/files/vit_l_decoder.pt",  # noqa
-        "vit_b_lm_decoder": "https://uk1s3.embassy.ebi.ac.uk/public-datasets/bioimage.io/diplomatic-bug/1.1/files/vit_b_decoder.pt",  # noqa
+        "vit_l_lm_decoder": "https://uk1s3.embassy.ebi.ac.uk/public-datasets/bioimage.io/idealistic-rat/1.2/files/vit_l_decoder.pt",  # noqa
+        "vit_b_lm_decoder": "https://uk1s3.embassy.ebi.ac.uk/public-datasets/bioimage.io/diplomatic-bug/1.2/files/vit_b_decoder.pt",  # noqa
         "vit_t_lm_decoder": "https://uk1s3.embassy.ebi.ac.uk/public-datasets/bioimage.io/faithful-chicken/1.1/files/vit_t_decoder.pt",  # noqa
-        "vit_l_em_organelles_decoder": "https://uk1s3.embassy.ebi.ac.uk/public-datasets/bioimage.io/humorous-crab/1/files/vit_l_decoder.pt",  # noqa
-        "vit_b_em_organelles_decoder": "https://uk1s3.embassy.ebi.ac.uk/public-datasets/bioimage.io/noisy-ox/1/files/vit_b_decoder.pt",  # noqa
+        "vit_l_em_organelles_decoder": "https://uk1s3.embassy.ebi.ac.uk/public-datasets/bioimage.io/humorous-crab/1.2/files/vit_l_decoder.pt",  # noqa
+        "vit_b_em_organelles_decoder": "https://uk1s3.embassy.ebi.ac.uk/public-datasets/bioimage.io/noisy-ox/1.2/files/vit_b_decoder.pt",  # noqa
         "vit_t_em_organelles_decoder": "https://uk1s3.embassy.ebi.ac.uk/public-datasets/bioimage.io/greedy-whale/1/files/vit_t_decoder.pt",  # noqa
         "vit_b_histopathology_decoder": "https://owncloud.gwdg.de/index.php/s/KO9AWqynI7SFOBj/download",
         "vit_l_histopathology_decoder": "https://owncloud.gwdg.de/index.php/s/oIs6VSmkOp7XrKF/download",
@@ -283,6 +283,31 @@ def _load_checkpoint(checkpoint_path):
     return state, model_state


+def _download_sam_model(model_type, progress_bar_factory=None):
+    model_registry = models()
+
+    progress_bar = True
+    # Check if we have to download the model.
+    # If we do, and we have a progress bar factory, then we overwrite the progress bar.
+    if not os.path.exists(os.path.join(get_cache_directory(), model_type)) and progress_bar_factory is not None:
+        progress_bar = progress_bar_factory(model_type)
+
+    checkpoint_path = model_registry.fetch(model_type, progressbar=progress_bar)
+    if not isinstance(progress_bar, bool):  # Close the progress bar when the task finishes.
+        progress_bar.close()
+
+    model_hash = model_registry.registry[model_type]
+
+    # If we have a custom model, then we may also have a decoder checkpoint.
+    # Download it here, so that we can add it to the state.
+    decoder_name = f"{model_type}_decoder"
+    decoder_path = model_registry.fetch(
+        decoder_name, progressbar=True
+    ) if decoder_name in model_registry.registry else None
+
+    return checkpoint_path, model_hash, decoder_path
+
+
 def get_sam_model(
     model_type: str = _DEFAULT_MODEL,
     device: Optional[Union[str, torch.device]] = None,
@@ -345,26 +370,7 @@
     # URL from the model_type. If the model_type is invalid pooch will raise an error.
     _provided_checkpoint_path = checkpoint_path is not None
     if checkpoint_path is None:
-        model_registry = models()
-
-        progress_bar = True
-        # Check if we have to download the model.
-        # If we do and have a progress bar factory, then we over-write the progress bar.
-        if not os.path.exists(os.path.join(get_cache_directory(), model_type)) and progress_bar_factory is not None:
-            progress_bar = progress_bar_factory(model_type)
-
-        checkpoint_path = model_registry.fetch(model_type, progressbar=progress_bar)
-        if not isinstance(progress_bar, bool):  # Close the progress bar when the task finishes.
-            progress_bar.close()
-
-        model_hash = model_registry.registry[model_type]
-
-        # If we have a custom model then we may also have a decoder checkpoint.
-        # Download it here, so that we can add it to the state.
-        decoder_name = f"{model_type}_decoder"
-        decoder_path = model_registry.fetch(
-            decoder_name, progressbar=True
-        ) if decoder_name in model_registry.registry else None
+        checkpoint_path, model_hash, decoder_path = _download_sam_model(model_type, progress_bar_factory)

     # checkpoint_path has been passed, we use it instead of downloading a model.
     else:
@@ -1259,13 +1265,25 @@ def micro_sam_info():
     """Display μSAM information using a rich console."""
     import psutil
     import platform
+    import argparse
+    from rich import progress
     from rich.panel import Panel
     from rich.table import Table
     from rich.console import Console

     import torch
     import micro_sam

+    parser = argparse.ArgumentParser(description="μSAM Information Booth")
+    parser.add_argument(
+        "--download", nargs="+", metavar=("WHAT", "KIND"),
+        help="Downloads the pretrained SAM models. "
+             "'--download models' -> downloads all pretrained models; "
+             "'--download models vit_b_lm vit_b_em_organelles' -> downloads the listed models; "
+             "'--download model/models vit_b_lm' -> downloads a single specified model."
+    )
+    args = parser.parse_args()
+
     # Open up a new console.
     console = Console()

@@ -1339,3 +1357,38 @@ def micro_sam_info():
             title="Device Information"
         )
     )
+
+    # The section that allows downloading models.
+    # NOTE: In the future, this can be extended to download sample data.
+    if args.download:
+        download_provided_args = [t.lower() for t in args.download]
+        mode, *model_types = download_provided_args
+
+        if mode not in {"models", "model"}:
+            console.print(f"[red]Unknown option for --download: {mode}[/]")
+            return
+
+        if mode in ["model", "models"] and not model_types:  # If the user did not specify models, download all of them.
+            download_list = available_models
+        else:
+            download_list = model_types
+            incorrect_models = [m for m in download_list if m not in available_models]
+            if incorrect_models:
+                console.print(Panel("[red]Unknown model(s):[/] " + ", ".join(incorrect_models), title="Download Error"))
+                return
+
+        with progress.Progress(
+            progress.SpinnerColumn(),
+            progress.TextColumn("[progress.description]{task.description}"),
+            progress.BarColumn(bar_width=None),
+            "[progress.percentage]{task.percentage:>3.0f}%",
+            progress.TimeRemainingColumn(),
+            console=console,
+        ) as prog:
+            task = prog.add_task("[green]Downloading μSAM models…", total=len(download_list))
+            for model_type in download_list:
+                prog.update(task, description=f"Downloading [cyan]{model_type}[/]…")
+                _download_sam_model(model_type=model_type)
+                prog.advance(task)
+
+        console.print(Panel("[bold green] Downloads complete![/]", title="Finished"))
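The `--download` parsing added to `micro_sam_info` above can be exercised in isolation. A minimal sketch of the same control flow, where `AVAILABLE_MODELS` is a stand-in for the real registry and `parse_download` is a hypothetical helper:

```python
import argparse

# Stand-in for the real model registry keys.
AVAILABLE_MODELS = ["vit_b_lm", "vit_l_lm", "vit_b_em_organelles"]


def parse_download(argv):
    """Return the list of models to download for the given CLI arguments."""
    parser = argparse.ArgumentParser(description="μSAM Information Booth")
    parser.add_argument("--download", nargs="+", metavar=("WHAT", "KIND"))
    args = parser.parse_args(argv)
    if not args.download:
        return []

    # First token selects the mode, the rest are model names.
    mode, *model_types = [t.lower() for t in args.download]
    if mode not in {"model", "models"}:
        raise ValueError(f"Unknown option for --download: {mode}")

    if not model_types:  # No models listed: download everything.
        return AVAILABLE_MODELS

    unknown = [m for m in model_types if m not in AVAILABLE_MODELS]
    if unknown:
        raise ValueError("Unknown model(s): " + ", ".join(unknown))
    return model_types


print(parse_download(["--download", "models"]))              # all registered models
print(parse_download(["--download", "model", "vit_b_lm"]))   # just the listed model
```

The tuple `metavar=("WHAT", "KIND")` mirrors the diff: with `nargs="+"`, argparse renders the usage line as `--download WHAT [KIND ...]`.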

scripts/model_export/export_models.py

Lines changed: 14 additions & 8 deletions

@@ -22,7 +22,7 @@

 INPUT_FOLDER = "/media/anwai/ANWAI/models/micro_sam"
 OUTPUT_FOLDER = "./exported_models"
-BIOIMAGEIO_VERSION = 1.1  # version marked for v3 (LM) Generalist Models
+BIOIMAGEIO_VERSION = 1.2  # version marked for v4 LM and EM-Organelles Generalist Models


 def create_doc(model_type, modality, version):
@@ -131,13 +131,18 @@ def export_model(model_path, model_type, modality, version, email):
     print("Decoder:")
     print(f"{model_name}_decoder", f"xxh128:{decoder_checksum}")

+    breakpoint()

-def export_all_models(email, version):
-    models = glob(os.path.join(INPUT_FOLDER, f"v{version}/**/vit*"), recursive=True)
+
+def export_all_models(email, version, model_type):
+    if model_type is None:
+        model_type = "vit*"
+
+    models = glob(os.path.join(INPUT_FOLDER, f"v{version}/**/{model_type}"), recursive=True)
     for path in models:
-        modality, _, model_type = path.split("/")[-3:]  # current expected structure: v3/lm/generalist/vit_b/best.pt
-        # print(model_path, modality, model_type)
+        modality, _, model_type = path.split("/")[-3:]  # current expected structure: v4/lm/generalist/vit_b/best.pt
         model_path = os.path.join(path, "best.pt")
+        print(model_path, modality, model_type)
         assert os.path.exists(model_path), model_path
         export_model(model_path, model_type, modality, version=version, email=email)

@@ -146,16 +151,17 @@ def export_all_models(email, version):
 def export_vit_t_lm(email):
     model_type = "vit_t"
     model_path = os.path.join(INPUT_FOLDER, "lm", "generalist", model_type, "best.pt")
-    export_model(model_path, model_type, "lm", version=3, email=email)
+    export_model(model_path, model_type, "lm", version=4, email=email)


 def main():
     parser = argparse.ArgumentParser()
     parser.add_argument("-e", "--email", required=True)
-    parser.add_argument("-v", "--version", default=3, type=int)
+    parser.add_argument("-v", "--version", default=4, type=int)
+    parser.add_argument("-m", "--model_type", type=str, default=None)
     args = parser.parse_args()

-    export_all_models(args.email, args.version)
+    export_all_models(args.email, args.version, args.model_type)


 if __name__ == "__main__":
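The directory layout `export_all_models` relies on (`v<version>/<modality>/generalist/<model_type>/best.pt`) can be checked on its own; `parse_export_path` is a hypothetical helper mirroring the `path.split("/")[-3:]` logic above, on a made-up path:

```python
import os


def parse_export_path(path):
    """Recover modality and model type from a checkpoint folder path.

    Expected structure (as in the export script): .../v4/<modality>/generalist/<model_type>
    """
    modality, _, model_type = path.split("/")[-3:]
    model_path = os.path.join(path, "best.pt")
    return modality, model_path, model_type


modality, model_path, model_type = parse_export_path("v4/lm/generalist/vit_b")
print(modality, model_type)  # the middle component ("generalist") is discarded
print(model_path)
```

Globbing for `vit*` yields exactly such directories, so the three trailing path components are always modality, the literal `generalist`, and the model type.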

scripts/model_export/models.py

Lines changed: 1 addition & 0 deletions

@@ -5,6 +5,7 @@
 import requests
 import yaml

+
 ADDJECTIVE_URL = "https://raw.githubusercontent.com/bioimage-io/collection-bioimage-io/main/adjectives.txt"
 ANIMAL_URL = "https://raw.githubusercontent.com/bioimage-io/collection-bioimage-io/main/animals.yaml"
 COLLECTION_URL = "https://raw.githubusercontent.com/bioimage-io/collection-bioimage-io/gh-pages/collection.json"
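Most of this commit bumps `xxh128:` registry checksums. As an aside, a sketch of how such a registry entry can be computed for a released file; this uses the stdlib `sha256` as a stand-in, since the real entries use the xxh128 algorithm (computed via the third-party `xxhash` package, as used by pooch):

```python
import hashlib
import os
import tempfile


def file_checksum(path, algorithm="sha256"):
    """Stream a file and return '<algorithm>:<hexdigest>' in the registry's style.

    NOTE: the registries in this commit use xxh128; sha256 stands in here
    only because it ships with the standard library.
    """
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # read in 1 MiB chunks
            h.update(chunk)
    return f"{algorithm}:{h.hexdigest()}"


# Demo on a throwaway file standing in for a model checkpoint.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello")
    tmp_path = f.name
print(file_checksum(tmp_path))
os.remove(tmp_path)
```

Whenever a model file is re-exported, its checksum changes, which is why every URL bump in `micro_sam/util.py` is paired with a registry checksum bump.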
