Commit 102c4a4

Update to new finetuned models (#326)
1 parent 3fda1d9 commit 102c4a4

File tree

10 files changed: +157 −60 lines changed

doc/finetuned_models.md

Lines changed: 26 additions & 10 deletions

@@ -1,13 +1,14 @@
 # Finetuned models
 
-We provide models that were finetuned on microscopy data using `micro_sam.training`. They are hosted on zenodo. We currently offer the following models:
+In addition to the original Segment Anything models, we provide models that were finetuned on microscopy data using the functionality from `micro_sam.training`.
+The models are hosted on zenodo. We currently offer the following models:
 - `vit_h`: Default Segment Anything model with vit-h backbone.
 - `vit_l`: Default Segment Anything model with vit-l backbone.
 - `vit_b`: Default Segment Anything model with vit-b backbone.
-- `vit_h_lm`: Finetuned Segment Anything model for cells and nuclei in light microscopy data with vit-h backbone.
+- `vit_t`: Segment Anything model with vit-tiny backbone. From the [Mobile SAM publication](https://arxiv.org/abs/2306.14289).
 - `vit_b_lm`: Finetuned Segment Anything model for cells and nuclei in light microscopy data with vit-b backbone.
-- `vit_h_em`: Finetuned Segment Anything model for neurites and cells in electron microscopy data with vit-h backbone.
-- `vit_b_em`: Finetuned Segment Anything model for neurites and cells in electron microscopy data with vit-b backbone.
+- `vit_b_em_organelles`: Finetuned Segment Anything model for mitochondria and nuclei in electron microscopy data with vit-b backbone.
+- `vit_b_em_boundaries`: Finetuned Segment Anything model for neurites and cells in electron microscopy data with vit-b backbone.
 
 See the two figures below for the improvements that the finetuned models bring for LM and EM data.
 

@@ -20,17 +21,32 @@ You can select which of the models is used in the annotation tools by selecting
 <img src="https://raw.githubusercontent.com/computational-cell-analytics/micro-sam/master/doc/images/model-type-selector.png" width="256">
 
 To use a specific model in the python library you need to pass the corresponding name as value to the `model_type` parameter exposed by all relevant functions.
-See for example the [2d annotator example](https://github.com/computational-cell-analytics/micro-sam/blob/master/examples/annotator_2d.py#L62) where `use_finetuned_model` can be set to `True` to use the `vit_h_lm` model.
+See for example the [2d annotator example](https://github.com/computational-cell-analytics/micro-sam/blob/master/examples/annotator_2d.py#L62) where `use_finetuned_model` can be set to `True` to use the `vit_b_lm` model.
+
+Note that we are still working on improving these models and may update them from time to time. All older models will stay available for download on zenodo, see [model sources](#model-sources) below.
+
 
 ## Which model should I choose?
 
 As a rule of thumb:
-- Use the `_lm` models for segmenting cells or nuclei in light microscopy.
-- Use the `_em` models for segmenting cells or neurites in electron microscopy.
-  - Note that this model does not work well for segmenting mitochondria or other organelles because it is biased towards segmenting the full cell / cellular compartment.
-- For other cases use the default models.
+- Use the `vit_b_lm` model for segmenting cells or nuclei in light microscopy.
+- Use the `vit_b_em_organelles` model for segmenting mitochondria, nuclei or other organelles in electron microscopy.
+- Use the `vit_b_em_boundaries` model for segmenting cells or neurites in electron microscopy.
+- For other use cases use one of the default models.
 
 See also the figures above for examples where the finetuned models work better than the vanilla models.
 Currently the model `vit_h` is used by default.
 
-We are working on releasing more fine-tuned models, in particular for mitochondria and other organelles in EM.
+We are working on further improving these models and adding new models for other biomedical imaging domains.
+
+
+## Model Sources
+
+Here is an overview of all finetuned models we have released to zenodo so far:
+- [vit_b_em_boundaries](https://zenodo.org/records/10524894): for segmenting compartments delineated by boundaries such as cells or neurites in EM.
+- [vit_b_em_organelles](https://zenodo.org/records/10524828): for segmenting mitochondria, nuclei or other organelles in EM.
+- [vit_b_lm](https://zenodo.org/records/10524791): for segmenting cells and nuclei in LM.
+- [vit_h_em](https://zenodo.org/records/8250291): this model is outdated.
+- [vit_h_lm](https://zenodo.org/records/8250299): this model is outdated.
+
+Some of these models contain multiple versions.
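For orientation, choosing one of these models from python comes down to passing its name as `model_type`. A minimal sketch, assuming `micro_sam.util.get_sam_model` fetches the named model from the registry when no explicit checkpoint is given (the commit only shows the function with an explicit `checkpoint_path`, in examples/annotator_with_custom_model.py below):

    from micro_sam.util import get_sam_model

    # any registry name works here, e.g. "vit_h", "vit_t",
    # "vit_b_lm", "vit_b_em_organelles" or "vit_b_em_boundaries"
    predictor = get_sam_model(model_type="vit_b_lm")

The returned predictor can then be passed on to the annotation tools.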

doc/images/model-type-selector.png

68 KB

examples/annotator_2d.py

Lines changed: 10 additions & 11 deletions

@@ -19,8 +19,8 @@ def livecell_annotator(use_finetuned_model):
     image = imageio.imread(example_data)
 
     if use_finetuned_model:
-        embedding_path = os.path.join(EMBEDDING_CACHE, "embeddings-livecell-vit_h_lm.zarr")
-        model_type = "vit_h_lm"
+        embedding_path = os.path.join(EMBEDDING_CACHE, "embeddings-livecell-vit_b_lm.zarr")
+        model_type = "vit_b_lm"
     else:
         embedding_path = os.path.join(EMBEDDING_CACHE, "embeddings-livecell.zarr")
         model_type = "vit_h"

@@ -35,8 +35,8 @@ def hela_2d_annotator(use_finetuned_model):
     image = imageio.imread(example_data)
 
     if use_finetuned_model:
-        embedding_path = os.path.join(EMBEDDING_CACHE, "embeddings-hela2d-vit_h_lm.zarr")
-        model_type = "vit_h_lm"
+        embedding_path = os.path.join(EMBEDDING_CACHE, "embeddings-hela2d-vit_b_lm.zarr")
+        model_type = "vit_b_lm"
     else:
         embedding_path = os.path.join(EMBEDDING_CACHE, "embeddings-hela2d.zarr")
         model_type = "vit_h"

@@ -54,8 +54,8 @@ def wholeslide_annotator(use_finetuned_model):
     image = imageio.imread(example_data)
 
     if use_finetuned_model:
-        embedding_path = os.path.join(EMBEDDING_CACHE, "whole-slide-embeddings-vit_h_lm.zarr")
-        model_type = "vit_h_lm"
+        embedding_path = os.path.join(EMBEDDING_CACHE, "whole-slide-embeddings-vit_b_lm.zarr")
+        model_type = "vit_b_lm"
     else:
         embedding_path = os.path.join(EMBEDDING_CACHE, "whole-slide-embeddings.zarr")
         model_type = "vit_h"

@@ -64,15 +64,14 @@ def wholeslide_annotator(use_finetuned_model):
 
 
 def main():
-    # whether to use the fine-tuned SAM model
-    # this feature is still experimental!
-    use_finetuned_model = False
+    # Whether to use the fine-tuned SAM model for light microscopy data.
+    use_finetuned_model = True
 
     # 2d annotator for livecell data
-    # livecell_annotator(use_finetuned_model)
+    livecell_annotator(use_finetuned_model)
 
     # 2d annotator for cell tracking challenge hela data
-    hela_2d_annotator(use_finetuned_model)
+    # hela_2d_annotator(use_finetuned_model)
 
     # 2d annotator for a whole slide image
     # wholeslide_annotator(use_finetuned_model)
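All three helpers above reduce to the same call pattern; a condensed sketch, assuming `annotator_2d` accepts `embedding_path` and `model_type` keywords like the `annotator_tracking` and `annotator_3d` calls further down in this commit (file paths are illustrative):

    import imageio
    from micro_sam.sam_annotator import annotator_2d

    image = imageio.imread("my_image.tif")  # hypothetical input image
    # embeddings are computed once and re-used from this cache path
    annotator_2d(image, embedding_path="./embeddings/my_image-vit_b_lm.zarr", model_type="vit_b_lm")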

examples/annotator_3d.py

Lines changed: 16 additions & 11 deletions

@@ -10,31 +10,36 @@
 os.makedirs(EMBEDDING_CACHE, exist_ok=True)
 
 
-def em_3d_annotator(use_finetuned_model):
+def em_3d_annotator(finetuned_model):
     """Run the 3d annotator for an example EM volume."""
     # download the example data
     example_data = fetch_3d_example_data(DATA_CACHE)
     # load the example data (load the sequence of png files as 3d volume)
     with open_file(example_data) as f:
         raw = f["*.png"][:]
 
-    if use_finetuned_model:
-        embedding_path = os.path.join(EMBEDDING_CACHE, "embeddings-lucchi-vit_h_em.zarr")
-        model_type = "vit_h_em"
-    else:
+    if not finetuned_model:
         embedding_path = os.path.join(EMBEDDING_CACHE, "embeddings-lucchi.zarr")
         model_type = "vit_h"
+    else:
+        assert finetuned_model in ("organelles", "boundaries")
+        embedding_path = os.path.join(EMBEDDING_CACHE, f"embeddings-lucchi-vit_b_em_{finetuned_model}.zarr")
+        model_type = f"vit_b_em_{finetuned_model}"
+    print(embedding_path)
 
     # start the annotator, cache the embeddings
-    annotator_3d(raw, embedding_path, model_type=model_type, show_embeddings=False)
+    annotator_3d(raw, embedding_path, model_type=model_type)
 
 
 def main():
-    # whether to use the fine-tuned SAM model
-    # this feature is still experimental!
-    use_finetuned_model = False
-
-    em_3d_annotator(use_finetuned_model)
+    # Whether to use the fine-tuned SAM model for mitochondria (organelles) or boundaries.
+    # Valid choices are:
+    # - None / False (will use the vanilla model)
+    # - "organelles": will use the model for mitochondria and other organelles
+    # - "boundaries": will use the model for boundary based structures
+    finetuned_model = "boundaries"
+
+    em_3d_annotator(finetuned_model)
 
 
 if __name__ == "__main__":

examples/annotator_tracking.py

Lines changed: 5 additions & 6 deletions

@@ -20,20 +20,19 @@ def track_ctc_data(use_finetuned_model):
     timeseries = f["*.tif"]
 
     if use_finetuned_model:
-        embedding_path = os.path.join(EMBEDDING_CACHE, "embeddings-ctc-vit_h_lm.zarr")
-        model_type = "vit_h_lm"
+        embedding_path = os.path.join(EMBEDDING_CACHE, "embeddings-ctc-vit_b_lm.zarr")
+        model_type = "vit_b_lm"
     else:
         embedding_path = os.path.join(EMBEDDING_CACHE, "embeddings-ctc.zarr")
         model_type = "vit_h"
 
     # start the annotator with cached embeddings
-    annotator_tracking(timeseries, embedding_path=embedding_path, show_embeddings=False, model_type=model_type)
+    annotator_tracking(timeseries, embedding_path=embedding_path, model_type=model_type)
 
 
 def main():
-    # whether to use the fine-tuned SAM model
-    # this feature is still experimental!
-    use_finetuned_model = False
+    # Whether to use the fine-tuned SAM model.
+    use_finetuned_model = True
     track_ctc_data(use_finetuned_model)
 
examples/annotator_with_custom_model.py

Lines changed: 19 additions & 2 deletions

@@ -1,8 +1,24 @@
+import os
+
+import imageio
 import h5py
 import micro_sam.sam_annotator as annotator
+
 from micro_sam.util import get_sam_model
+from micro_sam.util import get_cache_directory
+from micro_sam.sample_data import fetch_hela_2d_example_data
+
+
+DATA_CACHE = os.path.join(get_cache_directory(), "sample_data")
+
+
+def annotator_2d_with_custom_model():
+    example_data = fetch_hela_2d_example_data(DATA_CACHE)
+    image = imageio.imread(example_data)
 
-# TODO add an example for the 2d annotator with a custom model
+    custom_model = "/home/pape/Downloads/exported_models/vit_b_lm.pth"
+    predictor = get_sam_model(checkpoint_path=custom_model, model_type="vit_b")
+    annotator.annotator_2d(image, predictor=predictor)
 
 
 def annotator_3d_with_custom_model():

@@ -16,7 +32,8 @@ def annotator_3d_with_custom_model():
 
 
 def main():
-    annotator_3d_with_custom_model()
+    annotator_2d_with_custom_model()
+    # annotator_3d_with_custom_model()
 
 
 if __name__ == "__main__":
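This custom-model path pairs with the export helper added at the end of this commit: a checkpoint produced by `micro_sam.training` is first exported to a plain SAM checkpoint and then loaded via `get_sam_model`. A round-trip sketch with hypothetical paths, using only the two calls that appear verbatim elsewhere in this commit:

    from micro_sam.util import export_custom_sam_model, get_sam_model

    # export a training checkpoint to a plain SAM checkpoint (paths are examples)
    export_custom_sam_model(
        checkpoint_path="./checkpoints/vit_b_lm/best.pt",
        model_type="vit_b",
        save_path="./exported_models/vit_b_lm.pth",
    )

    # load the exported weights for annotation
    predictor = get_sam_model(checkpoint_path="./exported_models/vit_b_lm.pth", model_type="vit_b")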

examples/image_series_annotator.py

Lines changed: 3 additions & 4 deletions

@@ -14,8 +14,8 @@ def series_annotation(use_finetuned_model):
     """
 
     if use_finetuned_model:
-        embedding_path = os.path.join(EMBEDDING_CACHE, "series-embeddings-vit_h_lm")
-        model_type = "vit_h_lm"
+        embedding_path = os.path.join(EMBEDDING_CACHE, "series-embeddings-vit_b_lm")
+        model_type = "vit_b_lm"
     else:
         embedding_path = os.path.join(EMBEDDING_CACHE, "series-embeddings")
         model_type = "vit_h"

@@ -29,8 +29,7 @@ def series_annotation(use_finetuned_model):
 
 
 def main():
-    # whether to use the fine-tuned SAM model
-    # this feature is still experimental!
+    # Whether to use the fine-tuned SAM model.
     use_finetuned_model = False
     series_annotation(use_finetuned_model)
 
micro_sam/util.py

Lines changed: 14 additions & 16 deletions

@@ -60,24 +60,25 @@ def get_cache_directory() -> None:
 
     Users can set the MICROSAM_CACHEDIR environment variable for a custom cache directory.
     """
-    default_cache_directory = os.path.expanduser(pooch.os_cache('micro_sam'))
-    cache_directory = Path(os.environ.get('MICROSAM_CACHEDIR', default_cache_directory))
+    default_cache_directory = os.path.expanduser(pooch.os_cache("micro_sam"))
+    cache_directory = Path(os.environ.get("MICROSAM_CACHEDIR", default_cache_directory))
     return cache_directory
 
 
 #
 # Functionality for model download and export
 #
 
-def microsam_cachedir():
+
+def microsam_cachedir() -> None:
     """Return the micro-sam cache directory.
 
     Returns the top level cache directory for micro-sam models and sample data.
 
     Every time this function is called, we check for any user updates made to
     the MICROSAM_CACHEDIR os environment variable since the last time.
     """
-    cache_directory = os.environ.get('MICROSAM_CACHEDIR') or pooch.os_cache('micro_sam')
+    cache_directory = os.environ.get("MICROSAM_CACHEDIR") or pooch.os_cache("micro_sam")
     return cache_directory

@@ -106,10 +107,9 @@ def models():
         # the model with vit tiny backend from https://github.com/ChaoningZhang/MobileSAM
         "vit_t": "sha256:6dbb90523a35330fedd7f1d3dfc66f995213d81b29a5ca8108dbcdd4e37d6c2f",
         # first version of finetuned models on zenodo
-        "vit_h_lm": "sha256:9a65ee0cddc05a98d60469a12a058859c89dc3ea3ba39fed9b90d786253fbf26",
-        "vit_b_lm": "sha256:5a59cc4064092d54cd4d92cd967e39168f3760905431e868e474d60fe5464ecd",
-        "vit_h_em": "sha256:ae3798a0646c8df1d4db147998a2d37e402ff57d3aa4e571792fbb911d8a979c",
-        "vit_b_em": "sha256:c04a714a4e14a110f0eec055a65f7409d54e6bf733164d2933a0ce556f7d6f81",
+        "vit_b_lm": "sha256:e8f5feb1ad837a7507935409c7f83f7c8af11c6e39cfe3df03f8d3bd4a358449",
+        "vit_b_em_organelles": "sha256:8fabbe38a427a0c91bbe6518a5c0f103f36b73e6ee6c86fbacd32b4fc66294b4",
+        "vit_b_em_boundaries": "sha256:d87348b2adef30ab427fb787d458643300eb30624a0e808bf36af21764705f4f",
     }
     registry_xxh128 = {
         # the default segment anything models

@@ -119,10 +119,9 @@ def models():
         # the model with vit tiny backend from https://github.com/ChaoningZhang/MobileSAM
         "vit_t": "xxh128:8eadbc88aeb9d8c7e0b4b60c3db48bd0",
         # first version of finetuned models on zenodo
-        "vit_h_lm": "xxh128:e113adac6a0a21514bb2d73de16b921b",
-        "vit_b_lm": "xxh128:5fc0851abf8a209dcbed4e95634d9e27",
-        "vit_h_em": "xxh128:64b6eb2d32ac9c5d9b022b1ac57f1cc6",
-        "vit_b_em": "xxh128:f50d499db5bf54dc9849c3dbd271d5c9",
+        "vit_b_lm": "xxh128:6b061eb8684d9d5f55545330d6dce50d",
+        "vit_b_em_organelles": "xxh128:3919c2b761beba7d3f4ece342c9f5369",
+        "vit_b_em_boundaries": "xxh128:3099fe6339f5be91ca84db889db1909f",
     }
 
     models = pooch.create(

@@ -138,10 +137,9 @@ def models():
             # the model with vit tiny backend from https://github.com/ChaoningZhang/MobileSAM
             "vit_t": "https://owncloud.gwdg.de/index.php/s/TuDzuwVDHd1ZDnQ/download",
             # first version of finetuned models on zenodo
-            "vit_h_lm": "https://zenodo.org/record/8250299/files/vit_h_lm.pth?download=1",
-            "vit_b_lm": "https://zenodo.org/record/8250281/files/vit_b_lm.pth?download=1",
-            "vit_h_em": "https://zenodo.org/record/8250291/files/vit_h_em.pth?download=1",
-            "vit_b_em": "https://zenodo.org/record/8250260/files/vit_b_em.pth?download=1",
+            "vit_b_lm": "https://zenodo.org/records/10524791/files/vit_b_lm.pth?download=1",
+            "vit_b_em_organelles": "https://zenodo.org/records/10524828/files/vit_b_em_organelles.pth?download=1",
+            "vit_b_em_boundaries": "https://zenodo.org/records/10524894/files/vit_b_em_boundaries.pth?download=1",
         },
     )
     return models
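As the docstrings above note, both cache functions honor MICROSAM_CACHEDIR, and `microsam_cachedir` re-reads the variable on every call, so the cache can be redirected per process. A small usage sketch (the directory path is hypothetical):

    import os
    from micro_sam.util import microsam_cachedir

    print(microsam_cachedir())  # default: pooch's per-OS cache directory for "micro_sam"

    os.environ["MICROSAM_CACHEDIR"] = "/tmp/my_microsam_cache"  # hypothetical override
    print(microsam_cachedir())  # the variable is re-read on each call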

scripts/.gitignore

Lines changed: 2 additions & 0 deletions

@@ -0,0 +1,2 @@
+new_models/
+exported_models/
Lines changed: 62 additions & 0 deletions

@@ -0,0 +1,62 @@
+"""Helper scripts to export models for upload to zenodo.
+"""
+
+import hashlib
+import os
+import warnings
+from glob import glob
+
+import xxhash
+from micro_sam.util import export_custom_sam_model
+
+BUF_SIZE = 65536  # lets read stuff in 64kb chunks!
+
+
+def export_model(model_path, model_type, export_name):
+    output_folder = "./exported_models"
+    os.makedirs(output_folder, exist_ok=True)
+
+    output_path = os.path.join(output_folder, export_name)
+    if os.path.exists(output_path):
+        print("The model", export_name, "has already been exported.")
+        return
+
+    with warnings.catch_warnings():
+        warnings.simplefilter("ignore")
+        export_custom_sam_model(
+            checkpoint_path=model_path,
+            model_type=model_type,
+            save_path=output_path,
+        )
+
+    print("Exported", export_name)
+
+    sha_checksum = hashlib.sha256()
+    xxh_checksum = xxhash.xxh128()
+
+    with open(output_path, "rb") as f:
+        while True:
+            data = f.read(BUF_SIZE)
+            if not data:
+                break
+            sha_checksum.update(data)
+            xxh_checksum.update(data)
+
+    print("sha256:", f"sha256:{sha_checksum.hexdigest()}")
+    print("xxh128:", f"xxh128:{xxh_checksum.hexdigest()}")
+
+
+def export_all_models():
+    models = glob(os.path.join("./new_models/*.pt"))
+    model_type = "vit_b"
+    for model_path in models:
+        export_name = os.path.basename(model_path).replace(".pt", ".pth")
+        export_model(model_path, model_type, export_name)
+
+
+def main():
+    export_all_models()
+
+
+if __name__ == "__main__":
+    main()
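The printed digests are what ends up in the `registry_sha256` and `registry_xxh128` dicts in micro_sam/util.py above. To double-check a downloaded model against a registry entry, the same chunked-hashing pattern works in reverse; a sketch with the expected digest copied from the registry (the local file path is hypothetical):

    import hashlib

    def verify_sha256(path, expected, buf_size=65536):
        # compare a file's sha256 against a registry entry of the form "sha256:<hex>"
        checksum = hashlib.sha256()
        with open(path, "rb") as f:
            while True:
                data = f.read(buf_size)
                if not data:
                    break
                checksum.update(data)
        return f"sha256:{checksum.hexdigest()}" == expected

    # e.g. against the vit_b_lm entry from the registry above
    print(verify_sha256(
        "vit_b_lm.pth",
        "sha256:e8f5feb1ad837a7507935409c7f83f7c8af11c6e39cfe3df03f8d3bd4a358449",
    ))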
