Commit 545bd2c

Merge branch 'main' into bb_query_napari_plots
2 parents e847ba3 + 02b61f4 commit 545bd2c

189 files changed: +7069 -18 lines changed


.pre-commit-config.yaml

Lines changed: 4 additions & 4 deletions
@@ -14,11 +14,11 @@ repos:
       - id: nbqa-black
       - id: nbqa-isort
   - repo: https://github.com/psf/black
-    rev: 23.7.0
+    rev: 23.9.1
     hooks:
       - id: black
   - repo: https://github.com/pre-commit/mirrors-prettier
-    rev: v3.0.2
+    rev: v3.0.3
     hooks:
       - id: prettier
   - repo: https://github.com/asottile/blacken-docs
@@ -51,7 +51,7 @@ repos:
       - id: trailing-whitespace
       - id: check-case-conflict
   - repo: https://github.com/PyCQA/autoflake
-    rev: v2.2.0
+    rev: v2.2.1
     hooks:
       - id: autoflake
         args:
@@ -71,7 +71,7 @@ repos:
           - flake8-bugbear
           - flake8-blind-except
   - repo: https://github.com/asottile/pyupgrade
-    rev: v3.10.1
+    rev: v3.13.0
     hooks:
       - id: pyupgrade
         args: [--py3-plus, --py38-plus, --keep-runtime-typing]

conf.py

Lines changed: 2 additions & 0 deletions
@@ -107,6 +107,8 @@
     "notebooks/paper_reproducibility",
     "notebooks/examples/*.zarr" "references.md",
     "Readme.md",  # hack cause git his acting up
+    "notebooks/developers_resources/storage_format/*.ipynb",
+    "notebooks/developers_resources/storage_format/Readme.md",
 ]
 # Ignore warnings.
 nitpicky = False  # TODO: solve upstream.

datasets/README.md

Lines changed: 15 additions & 14 deletions
@@ -4,20 +4,21 @@ The example notebooks operate on a set of spatial omics datasets that can be dow
 
 Here you can find the dataset hosted in S3 object storage.
 
-| Dataset | .zarr.zip | S3 (see note below!) |
-| :-------------------------: | :----------------------------------------------------------------------------------: | :------------------------------------------------------------------------------------: |
-| cosmx_io | <https://s3.embl.de/spatialdata/spatialdata-sandbox/cosmx_io.zip> | <https://s3.embl.de/spatialdata/spatialdata-sandbox/cosmx_io.zarr/> |
-| mcmicro_io | <https://s3.embl.de/spatialdata/spatialdata-sandbox/mcmicro_io.zip> | <https://s3.embl.de/spatialdata/spatialdata-sandbox/mcmicro_io.zarr/> |
-| merfish | <https://s3.embl.de/spatialdata/spatialdata-sandbox/merfish.zip> | <https://s3.embl.de/spatialdata/spatialdata-sandbox/merfish.zarr/> |
-| mibitof | <https://s3.embl.de/spatialdata/spatialdata-sandbox/mibitof.zip> | <https://s3.embl.de/spatialdata/spatialdata-sandbox/mibitof.zarr/> |
-| steinbock_io | <https://s3.embl.de/spatialdata/spatialdata-sandbox/steinbock_io.zip> | <https://s3.embl.de/spatialdata/spatialdata-sandbox/steinbock_io.zarr/> |
-| toy | <https://s3.embl.de/spatialdata/spatialdata-sandbox/toy.zip> | <https://s3.embl.de/spatialdata/spatialdata-sandbox/toy.zarr/> |
-| visium | <https://s3.embl.de/spatialdata/spatialdata-sandbox/visium.zip> | <https://s3.embl.de/spatialdata/spatialdata-sandbox/visium.zarr/> |
-| visium_io | <https://s3.embl.de/spatialdata/spatialdata-sandbox/visium_io.zip> | <https://s3.embl.de/spatialdata/spatialdata-sandbox/visium_io.zarr/> |
-| visium_associated_xenium_io | <https://s3.embl.de/spatialdata/spatialdata-sandbox/visium_associated_xenium_io.zip> | <https://s3.embl.de/spatialdata/spatialdata-sandbox/visium_associated_xenium_io.zarr/> |
-| xenium_rep1_io | <https://s3.embl.de/spatialdata/spatialdata-sandbox/xenium_rep1_io.zip> | <https://s3.embl.de/spatialdata/spatialdata-sandbox/xenium_rep1_io.zarr/> |
-| xenium_rep2_io | <https://s3.embl.de/spatialdata/spatialdata-sandbox/xenium_rep2_io.zip> | <https://s3.embl.de/spatialdata/spatialdata-sandbox/xenium_rep2_io.zarr/> |
+| Dataset | .zarr.zip | S3 (see note below!) |
+| :------------------------------------------------------------------------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------: | :-----------------------------------------------------------------------------------: |
+| cosmx_io | <https://s3.embl.de/spatialdata/spatialdata-sandbox/cosmx_io.zip> | <https://s3.embl.de/spatialdata/spatialdata-sandbox/cosmx_io.zarr> |
+| mcmicro_io | <https://s3.embl.de/spatialdata/spatialdata-sandbox/mcmicro_io.zip> | <https://s3.embl.de/spatialdata/spatialdata-sandbox/mcmicro_io.zarr> |
+| merfish | <https://s3.embl.de/spatialdata/spatialdata-sandbox/merfish.zip> | <https://s3.embl.de/spatialdata/spatialdata-sandbox/merfish.zarr> |
+| mibitof | <https://s3.embl.de/spatialdata/spatialdata-sandbox/mibitof.zip> | <https://s3.embl.de/spatialdata/spatialdata-sandbox/mibitof.zarr> |
+| steinbock_io | <https://s3.embl.de/spatialdata/spatialdata-sandbox/steinbock_io.zip> | <https://s3.embl.de/spatialdata/spatialdata-sandbox/steinbock_io.zarr> |
+| toy | <https://s3.embl.de/spatialdata/spatialdata-sandbox/toy.zip> | <https://s3.embl.de/spatialdata/spatialdata-sandbox/toy.zarr> |
+| visium | <https://s3.embl.de/spatialdata/spatialdata-sandbox/visium.zip> | <https://s3.embl.de/spatialdata/spatialdata-sandbox/visium.zarr> |
+| visium_io | <https://s3.embl.de/spatialdata/spatialdata-sandbox/visium_io.zip> | <https://s3.embl.de/spatialdata/spatialdata-sandbox/visium_io.zarr> |
+| visium_associated_xenium_io | <https://s3.embl.de/spatialdata/spatialdata-sandbox/visium_associated_xenium_io.zip> | <https://s3.embl.de/spatialdata/spatialdata-sandbox/visium_associated_xenium_io.zarr> |
+| xenium_rep1_io | <https://s3.embl.de/spatialdata/spatialdata-sandbox/xenium_rep1_io.zip> | <https://s3.embl.de/spatialdata/spatialdata-sandbox/xenium_rep1_io.zarr> |
+| xenium_rep2_io | <https://s3.embl.de/spatialdata/spatialdata-sandbox/xenium_rep2_io.zip> | <https://s3.embl.de/spatialdata/spatialdata-sandbox/xenium_rep2_io.zarr> |
+| [additional resources for methods developers](https://github.com/scverse/spatialdata-notebooks/blob/main/notebooks/developers_resources/storage_format/) | - | - |
 
 ## Note
 
-Opening the above URLs in a web browser would not work, you need to treat the URLs as Zarr stores. For example if you append `.zgroup` to any of the URLs above you will be able to see that file.
+Opening the above URLs in a web browser would not work, you need to treat the URLs as Zarr stores. For example if you append `/.zgroup` to any of the `.zarr` URLs above you will be able to see that file.
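
The note above can be illustrated with a minimal sketch that builds the `.zgroup` URL for one of the `.zarr` stores listed in the table. The helper names `zgroup_url` and `fetch_zgroup` are hypothetical and not part of this repository; the sketch only uses the Python standard library and assumes the store is reachable over plain HTTPS.

```python
import json
from urllib.request import urlopen


def zgroup_url(store_url: str) -> str:
    """Return the URL of the `.zgroup` file at the root of a Zarr store.

    Hypothetical helper: tolerates both the old (trailing slash) and the
    new (no trailing slash) form of the store URLs.
    """
    return store_url.rstrip("/") + "/.zgroup"


def fetch_zgroup(store_url: str) -> dict:
    """Download and parse the root `.zgroup` file (needs network access)."""
    with urlopen(zgroup_url(store_url)) as response:
        return json.load(response)


if __name__ == "__main__":
    url = "https://s3.embl.de/spatialdata/spatialdata-sandbox/toy.zarr"
    # Only builds the URL; call fetch_zgroup(url) to actually download it.
    print(zgroup_url(url))
```

For real work a Zarr-aware reader is likely a better fit than fetching individual files; the sketch is only meant to show why the plain store URL does not open in a browser while `<store>/.zgroup` does.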
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+!data/
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+!*.zarr
Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+# Examples covering the whole storage specification
+
+This directory offers comprehensive resources for developers who want to interface their methods with the SpatialData format in a robust way.
+
+## Why this repository
+
+The file storage format adopted by SpatialData is built on top of the latest version of the well-documented [OME-NGFF specification](https://ngff.openmicroscopy.org/latest/index.html), but it also uses _some_ less-documented features of the OME-NGFF specification that are still [under review](https://github.com/ome/ngff/pulls?q=is%3Apr+is%3Aopen+sort%3Aupdated-desc), as well as experimental storage strategies that will eventually be discussed with the NGFF community.
+This repository addresses the need to communicate the storage specification to other developers in a complete and robust way.
+
+## What this repository contains
+
+This directory contains notebooks that operate on lightweight datasets.
+
+- Each notebook covers a particular aspect of the storage specification and ~~all the~~ _the main (work in progress)_ edge cases of the specification are covered in at least one of the notebooks.
+- All the notebooks are run every 24h (work in progress, automatic run temporarily disabled) against the `main` branch of the `spatialdata` repository. Each notebook creates a dataset, writes it to disk, reloads it in memory, rewrites it to disk to check for consistency, reloads it again in memory and plots it.
+- The disk storage is committed to GitHub so that the output of each daily run is associated with a commit; the commit message is "autorun: storage format; spatialdata from <commit hash> <optional (commit tag)>". Examples of commit messages are:
+  - `autorun: storage format; spatialdata from al29fak`
+  - `autorun: storage format; spatialdata from fa096da (v0.0.12)`
+- The `.zarr` data produced by every run is available in the current directory, in the commit corresponding to the run.
+- The data is also [uploaded to S3](https://refined-github-html-preview.kidonng.workers.dev/scverse/spatialdata-notebooks/raw/dev_notebooks/notebooks/developers_resources/storage_format/index.html), both as Zarr directories and as zipped files.
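
A third-party tool may want to recover the `spatialdata` commit (and optional tag) from an autorun commit message. The following is a minimal sketch; the regular expression is inferred from the two example messages in the README and is an assumption, not a documented format guarantee. The names `AUTORUN_RE`, `AutorunInfo` and `parse_autorun_message` are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern, inferred from the README examples. The hash is matched
# loosely (any non-space run) because the examples are illustrative.
AUTORUN_RE = re.compile(
    r"^autorun: storage format; spatialdata from (?P<commit>\S+)"
    r"(?: \((?P<tag>[^)]+)\))?$"
)


class AutorunInfo(NamedTuple):
    commit: str
    tag: Optional[str]


def parse_autorun_message(message: str) -> Optional[AutorunInfo]:
    """Return the commit hash and optional tag, or None if it is not an autorun message."""
    match = AUTORUN_RE.match(message.strip())
    if match is None:
        return None
    return AutorunInfo(commit=match["commit"], tag=match["tag"])


if __name__ == "__main__":
    print(parse_autorun_message("autorun: storage format; spatialdata from fa096da (v0.0.12)"))
```

Returning `None` for non-matching messages lets a caller skip ordinary commits when scanning the history.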
+
+## How to use this repository
+
+In practice, a third-party tool (e.g. R reader, format converter, JavaScript data visualizer, etc.) that runs correctly on the lightweight datasets from this repository should be guaranteed to run correctly on any SpatialData dataset.
+
+We recommend the following.
+
+- Implement your readers on the data from the latest run available (look for the latest commit with message `autorun: storage format; ...`).
+- Set up an automated test (e.g. daily) that gets the latest converted data (you can use a `git pull` or download the data from S3) and runs your code on it.
+- If your reader fails, you can inspect the corresponding commit in this repository to see what has changed in the storage specification; in particular, you may find it useful to compare different commits using the GitHub compare function, accessible with the following syntax: https://github.com/scverse/spatialdata-notebooks/compare/267adb1..5847084
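
The recommendations above could be wired into an automated check roughly as follows. This is a sketch under stated assumptions: `compare_url` and `latest_autorun_commit` are hypothetical helpers, the repository is assumed to be cloned locally, and `git` is assumed to be on the `PATH`. Only the compare URL syntax is taken from the README.

```python
import subprocess

REPO = "scverse/spatialdata-notebooks"


def compare_url(base: str, head: str, repo: str = REPO) -> str:
    """Build a GitHub compare URL using the `base..head` syntax shown in the README."""
    return f"https://github.com/{repo}/compare/{base}..{head}"


def latest_autorun_commit(repo_dir: str = ".") -> str:
    """Return the hash of the most recent autorun commit in a local clone.

    Runs `git log` with `--grep`; returns an empty string if no commit matches.
    """
    result = subprocess.run(
        ["git", "-C", repo_dir, "log", "-1", "--format=%H",
         "--grep=^autorun: storage format"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Commit hashes taken from the README example.
    print(compare_url("267adb1", "5847084"))
```

A daily job could store the last commit it tested and, on failure, print the compare URL between that commit and the latest autorun commit.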
+
+## Important technical notes
+
+- The most crucial part of the metadata is stored, for each spatial element, in the `.zattrs` file. [Example](transformation_identity.zarr/images/blobs_image/.zattrs).
+- The `zmetadata` in the root folder stores redundant information and is used for storage systems that do not support `ls` operations (e.g. S3). [Example](transformation_identity.zarr/zmetadata).
+- Please keep in mind that the data we generate daily are produced against the latest `main` and not the latest release. This means that a format change (which should happen less and less frequently as the frameworks become more mature) does not immediately translate into a bug for the user. In fact, the user will still be using the latest release for a while, giving developers time to update their tools before users are affected.
+- When the format becomes more mature, we will provide converters between previous versions of the format. Luckily, heavy data like images and labels are stable as of NGFF v0.4, so the converters will mostly perform lightweight conversions of the metadata and relatively small conversions of the geometries.
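
To inspect the per-element metadata mentioned in the first note, a reader can walk a local copy of a store and load every `.zattrs` file as JSON. The sketch below uses only the standard library; `collect_zattrs` is a hypothetical helper, and the store is assumed to be a plain Zarr directory on disk (as committed in this directory).

```python
import json
from pathlib import Path
from typing import Dict


def collect_zattrs(store: Path) -> Dict[str, dict]:
    """Map each group path (relative to the store root) to its parsed `.zattrs`."""
    result = {}
    # pathlib's rglob matches dotfiles, so `.zattrs` files are found directly.
    for attrs_file in sorted(store.rglob(".zattrs")):
        group = attrs_file.parent.relative_to(store).as_posix()
        with attrs_file.open() as handle:
            result[group] = json.load(handle)
    return result


if __name__ == "__main__":
    # Store name taken from the README example links.
    for group, attrs in collect_zattrs(Path("transformation_identity.zarr")).items():
        print(group, sorted(attrs))
```

Comparing the output of this function between two autorun commits is one simple way to spot metadata changes in the format.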
Lines changed: 121 additions & 0 deletions
@@ -0,0 +1,121 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "7bf33a06-009a-487a-991b-715a28b1069b",
+   "metadata": {},
+   "source": [
+    "# Scope and description"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "d96031b8-ee40-43ca-9001-011f4751d4c1",
+   "metadata": {},
+   "source": [
+    "<One line description>\n",
+    "\n",
+    "Elements contained:\n",
+    "- <Element>\n",
+    "- <Another element>\n",
+    "\n",
+    "Annotations contained:\n",
+    "- <table annotating...>\n",
+    "\n",
+    "<Additional notes>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "5a4c6566-882a-48d4-a4cd-0777332a5a99",
+   "metadata": {},
+   "source": [
+    "# Prepare the data"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "094509a1-86fb-4aa6-b920-11e34ed43e1c",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "NAME = \"name_of_the_notebook\""
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "332c9c62-7451-4d8e-ba9e-13c12131c312",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import spatialdata as sd\n",
+    "import spatialdata_plot\n",
+    "from spatialdata.datasets import blobs\n",
+    "from io_utils import delete_old_data, write_sdata_and_check_consistency\n",
+    "\n",
+    "delete_old_data(name=NAME)\n",
+    "sdata = <create sdata here>\n",
+    "sdata"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "800a540b-2906-46c9-8245-99def219692f",
+   "metadata": {},
+   "source": [
+    "# Read-write and IO validation"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b6c01345-27de-42f3-af07-75a43e84808a",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "write_sdata_and_check_consistency(sdata=sdata, name=NAME)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "06806f13-f346-410f-ab04-fdd61fd77a10",
+   "metadata": {},
+   "source": [
+    "# Plot the data"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "af8d7108-8b7c-461e-871e-77d82de82af6",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "<plot the data>"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.12"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
