Commit ff16fc1

Merge pull request #70 from haniffalab/dev
Preparation for WebAtlas manuscript
2 parents 4b33b98 + 12fbe3e commit ff16fc1

73 files changed, 3797 additions and 2797 deletions

.github/workflows/tests-python.yml

Lines changed: 4 additions & 0 deletions
@@ -11,6 +11,10 @@ jobs:
   run:
     runs-on: ubuntu-latest
     steps:
+      - name: Install libvips
+        run: |
+          sudo apt-get update
+          sudo apt-get install -y --no-install-recommends libvips
       - name: Checkout
         uses: actions/checkout@v2
       - name: Set up Python 3.8

CITATION.cff

Lines changed: 3 additions & 3 deletions
@@ -1,8 +1,8 @@
 cff-version: 1.2.0
 type: software
 message: "If you use this repo, please cite it"
-title: "Vitessce Pipeline"
-url: "https://github.com/haniffalab/vitessce-pipeline"
+title: "WebAtlas Pipeline"
+url: "https://github.com/haniffalab/webatlas-pipeline"
 doi: 10.5281/zenodo.7405818
 authors:
   - family-names: "Li"
@@ -11,7 +11,7 @@ authors:
   - family-names: "Horsfall"
     given-names: "Dave"
     orcid: "https://orcid.org/0000-0002-8086-812X"
-  - family-names: "Basurto Lozada"
+  - family-names: "Basurto-Lozada"
     given-names: "Daniela"
     orcid: "https://orcid.org/0000-0003-3943-8424"
   - family-names: "Prete"

README.md

Lines changed: 56 additions & 6 deletions
@@ -1,10 +1,60 @@
-[![python-tests](https://github.com/haniffalab/vitessce-pipeline/actions/workflows/tests-python.yml/badge.svg)](https://github.com/haniffalab/vitessce-pipeline/actions/workflows/tests-python.yml)
-[![codecov](https://codecov.io/gh/haniffalab/vitessce-pipeline/branch/main/graph/badge.svg?token=7HQVFH08WJ)](https://codecov.io/gh/haniffalab/vitessce-pipeline/branch/main)
+[![python-tests](https://github.com/haniffalab/webatlas-pipeline/actions/workflows/tests-python.yml/badge.svg)](https://github.com/haniffalab/webatlas-pipeline/actions/workflows/tests-python.yml)
+[![codecov](https://codecov.io/gh/haniffalab/webatlas-pipeline/branch/main/graph/badge.svg?token=7HQVFH08WJ)](https://codecov.io/gh/haniffalab/webatlas-pipeline/branch/main)
 
-# Vitessce Pipeline
+# WebAtlas Pipeline
 
-[![docs](https://img.shields.io/badge/Documentation-online-blue)](https://haniffalab.github.io/vitessce-pipeline)
-[![demo](https://img.shields.io/badge/Demos-view-blue)](https://haniffalab.github.io/vitessce-pipeline/demos.html)
+[![docs](https://img.shields.io/badge/Documentation-online-blue)](https://haniffalab.github.io/webatlas-pipeline)
+[![demo](https://img.shields.io/badge/Demos-view-blue)](https://haniffalab.github.io/webatlas-pipeline/demos.html)
 [![doi](https://zenodo.org/badge/DOI/10.5281/zenodo.7405818.svg)](https://doi.org/10.5281/zenodo.7405818)
 
-This Nextflow pipeline processes spatial and single-cell experiment data for visualisation in [vitessce-app](https://github.com/haniffalab/vitessce-app). The pipeline generates data files for [supported data types](http://vitessce.io/docs/data-types-file-types/), and builds a [view config](http://vitessce.io/docs/view-config-json/).
+This Nextflow pipeline processes spatial and single-cell experiment data for visualisation in [webatlas-app](https://github.com/haniffalab/webatlas-app). The pipeline generates data files for [supported data types](http://vitessce.io/docs/data-types-file-types/), and builds a [view config](http://vitessce.io/docs/view-config-json/).
+
+
+## Usage
+
+The pipeline can handle data from `h5ad` files, image `tif` files, SpaceRanger output, Xenium output and MERSCOPE output. It can also generate image files from data files.
+
+Running the pipeline requires a parameters file that defines configuration options and the data to be processed.
+Full instructions and parameters definitions for this files are available in the [documentation](https://haniffalab.com/webatlas-pipeline/setup.html)
+
+A parameters file looks like
+
+```yaml
+outdir: "/path/to/output/"
+
+args:
+  h5ad:
+    compute_embeddings: "True"
+
+projects:
+  - project: project_1
+    datasets:
+      - dataset: dataset_1
+        data:
+          -
+            data_type: h5ad
+            data_path: /path/to/project_1/dataset_1/anndata.h5ad
+          -
+            data_type: raw_image
+            data_path: /path/to/project_1/dataset_1/raw_image.tif
+          -
+            data_type: label_image
+            data_path: /path/to/project_1/dataset_1/label_image.tif
+
+vitessce_options:
+  spatial:
+    xy: "obsm/spatial"
+  mappings:
+    obsm/X_umap: [0,1]
+layout: "simple"
+```
+
+
+The pipeline can then be run like
+
+```sh
+nextflow run main.nf -params-file /path/to/run-params.yaml -entry Full_pipeline
+```
+
+
+Parameters file templates are available in the `templates` directory.
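
Review note: to see how the `projects` → `datasets` → `data` nesting in the README example resolves to individual inputs, the structure can be walked in a few lines of Python. This is only a sketch. The pipeline itself is driven by Nextflow, the nesting is inferred from the README example (this diff view does not preserve indentation), and `iter_data_entries` is a hypothetical helper that does not exist in the repository.

```python
from __future__ import annotations

from typing import Any, Iterator

# The README example expressed as a Python dict (nesting assumed, see note above).
params: dict[str, Any] = {
    "outdir": "/path/to/output/",
    "projects": [
        {
            "project": "project_1",
            "datasets": [
                {
                    "dataset": "dataset_1",
                    "data": [
                        {
                            "data_type": "h5ad",
                            "data_path": "/path/to/project_1/dataset_1/anndata.h5ad",
                        },
                        {
                            "data_type": "raw_image",
                            "data_path": "/path/to/project_1/dataset_1/raw_image.tif",
                        },
                        {
                            "data_type": "label_image",
                            "data_path": "/path/to/project_1/dataset_1/label_image.tif",
                        },
                    ],
                }
            ],
        }
    ],
}


def iter_data_entries(params: dict[str, Any]) -> Iterator[tuple[str, str, str, str]]:
    """Yield (project, dataset, data_type, data_path) for every data entry."""
    for project in params.get("projects", []):
        for dataset in project.get("datasets", []):
            for entry in dataset.get("data", []):
                yield (
                    project["project"],
                    dataset["dataset"],
                    entry["data_type"],
                    entry["data_path"],
                )


if __name__ == "__main__":
    for row in iter_data_entries(params):
        print(*row, sep="\t")
```

Each yielded tuple corresponds to one `data_type`/`data_path` pair in the parameters file, grouped by its project and dataset names.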

bin/build_config.py

Lines changed: 65 additions & 56 deletions
@@ -42,7 +42,7 @@ def build_options(
         file_type (str): Type of file supported by Vitessce.
         file_path (str): Path to file.
         file_options (dict[str, T.Any]): Dictionary defining the options.
-        check_exist (bool, optional): Whether to check the given path to confirm the file exists.
+        check_exist (bool, optional): Whether to check the given path to confirm the file exists.
             Defaults to False.
 
     Returns:
@@ -120,89 +120,95 @@ def build_options(
 
 
 def build_raster_options(
-    image_zarr: dict[str, dict[str, str]], url: str
+    images: dict[str, list[dict[str, T.Any]]], url: str
 ) -> dict[str, T.Any]:
     """Function that creates the View config's options for image files
 
     Args:
-        image_zarr (dict[str, dict[str, str]]): Dictionary containing a metadata dictionary
-            for each image in Zarr format.
-        url (str): URL to prepend to each file in the config file.
+        images (dict[str, list[dict[str, T.Any]]], optional): Dictionary containing for each image type key (raw and label)
+            a list of dictionaries (one per image of that type) with the corresponding path and metadata for that image.
+            Defaults to {}.
+        url (str): URL to prepend to each file in the config file.
             The URL to the local or remote server that will serve the files
 
     Returns:
         dict[str, T.Any]: Options dictionary for View config file
     """
     raster_options = {"renderLayers": [], "schemaVersion": "0.0.2", "images": []}
-    for image in image_zarr.keys():
-        image_name = os.path.splitext(image)[0]
-        channel_names = (
-            image_zarr[image]["channel_names"]
-            if "channel_names" in image_zarr[image]
-            else []
-        )
-        channel_names, isBitmask = (
-            (["Labels"], True)
-            if image_name.split("_")[-1] == "label" and not len(channel_names)
-            else (channel_names, False)
-        )
-        raster_options["renderLayers"].append(image_name)
-        raster_options["images"].append(
-            {
-                "name": image_name,
-                "url": os.path.join(url, image),
-                "type": "zarr",
-                "metadata": {
-                    "isBitmask": isBitmask,
-                    "dimensions": [
-                        {"field": "t", "type": "quantitative", "values": None},
-                        {
-                            "field": "channel",
-                            "type": "nominal",
-                            "values": channel_names,
-                        },
-                        {"field": "y", "type": "quantitative", "values": None},
-                        {"field": "x", "type": "quantitative", "values": None},
-                    ],
-                    "isPyramid": True,
-                    "transform": {"translate": {"y": 0, "x": 0}, "scale": 1},
-                },
-            }
-        )
+    for img_type in images.keys(): # raw, label
+        for img in images[img_type]:
+            image_name = os.path.splitext(os.path.basename(img["path"]))[0]
+            channel_names = (
+                img["md"]["channel_names"]
+                if "channel_names" in img["md"] and len(img["md"]["channel_names"])
+                else (
+                    ["Labels"]
+                    if img_type == "label"
+                    else [f"Channel {x}" for x in range(int(img["md"]["C"]))]
+                )
+            )
+            isBitmask = img_type == "label"
+            raster_options["renderLayers"].append(image_name)
+            raster_options["images"].append(
+                {
+                    "name": image_name,
+                    "url": os.path.join(url, os.path.basename(img["path"])),
+                    "type": "zarr",
+                    "metadata": {
+                        "isBitmask": isBitmask,
+                        "dimensions": [
+                            {"field": "t", "type": "quantitative", "values": None},
+                            {
+                                "field": "channel",
+                                "type": "nominal",
+                                "values": channel_names,
+                            },
+                            {"field": "y", "type": "quantitative", "values": None},
+                            {"field": "x", "type": "quantitative", "values": None},
+                        ],
+                        "isPyramid": True,
+                        "transform": {"translate": {"y": 0, "x": 0}, "scale": 1},
+                    },
+                }
+            )
     return raster_options
 
 
 def write_json(
-    title: str = "",
+    project: str = "",
     dataset: str = "",
     file_paths: list[str] = [],
-    image_zarr: dict[str, dict[str, str]] = {},
+    images: dict[str, list[dict[str, T.Any]]] = {},
     url: str = "",
-    outdir: str = "./",
-    config_filename_suffix: str = "config.json",
     options: dict[str, T.Any] = None,
     layout: str = "minimal",
     custom_layout: str = None,
+    title: str = "",
+    description: str = "",
+    config_filename_suffix: str = "config.json",
+    outdir: str = "./",
 ) -> None:
     """This function writes a Vitessce View config JSON file
 
     Args:
-        title (str, optional): Title to use in the config file. Defaults to "".
+        project (str, optional): Project name. Defaults to "".
         dataset (str, optional): Dataset name. Defaults to "".
         file_paths (list[str], optional): Paths to files that will be included in the config file. Defaults to [].
-        image_zarr (dict[str, dict[str, str]], optional): Dictionary containing a metadata dictionary
-            for each image in Zarr format. Defaults to {}.
-        url (str, optional): URL to prepend to each file in the config file.
+        images (dict[str, list[dict[str, T.Any]]], optional): Dictionary containing for each image type key (raw and label)
+            a list of dictionaries (one per image of that type) with the corresponding path and metadata for that image.
+            Defaults to {}.
+        url (str, optional): URL to prepend to each file in the config file.
             The URL to the local or remote server that will serve the files.
             Defaults to "".
-        outdir (str, optional): Directory in which the config file will be written to. Defaults to "./".
-        config_filename_suffix (str, optional): Config filename suffix. Defaults to "config.json".
         options (dict[str, T.Any], optional): Dictionary with Vitessce config file `options`. Defaults to None.
         layout (str, optional): Type of predefined layout to use. Defaults to "minimal".
         custom_layout (str, optional): String defining a Vitessce layout following its alternative syntax.
             https://vitessce.github.io/vitessce-python/api_config.html#vitessce.config.VitessceConfig.layout
             https://github.com/vitessce/vitessce-python/blob/1e100e4f3f6b2389a899552dffe90716ffafc6d5/vitessce/config.py#L855
             Defaults to None.
+        title (str, optional): Data title to show in the visualization. Defaults to "".
+        config_filename_suffix (str, optional): Config filename suffix. Defaults to "config.json".
+        outdir (str, optional): Directory in which the config file will be written to. Defaults to "./".
 
     Raises:
         SystemExit: If no valid files have been input
@@ -211,17 +217,20 @@ def write_json(
 
     has_files = False
 
-    config = VitessceConfig(name=str(title))
-    config_dataset = config.add_dataset(str(title), str(dataset))
+    config = VitessceConfig(
+        name=str(title) if len(title) else str(project),
+        description=description,
+    )
+    config_dataset = config.add_dataset(str(dataset), str(dataset))
 
     coordination_types = defaultdict(lambda: cycle(iter([])))
-    file_paths_names = {x.split("_")[-1]: x for x in file_paths}
+    file_paths_names = {x.split("-")[-1]: x for x in file_paths}
     dts = set([])
 
-    if len(image_zarr.items()):
+    if images.keys() and any([len(images[k]) for k in images.keys()]):
         has_files = True
         config_dataset.add_file(
-            dt.RASTER, ft.RASTER_JSON, options=build_raster_options(image_zarr, url)
+            dt.RASTER, ft.RASTER_JSON, options=build_raster_options(images, url)
         )
         dts.add(dt.RASTER)
 
@@ -346,7 +355,7 @@ def write_json(
     if outdir and not os.path.isdir(outdir):
         os.mkdir(outdir)
     with open(
-        os.path.join(outdir or "", f"{title}_{dataset}_{config_filename_suffix}"), "w"
+        os.path.join(outdir or "", f"{project}-{dataset}-{config_filename_suffix}"), "w"
     ) as out_file:
         json.dump(config_json, out_file, indent=2)
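
Review note: the new channel-name fallback in `build_raster_options` is easier to check in isolation. The sketch below restates that conditional as a standalone function. `resolve_channel_names` is a hypothetical name used only for illustration, and the `md` dictionary is assumed to carry the same `channel_names` and `C` keys the diff reads from `img["md"]`.

```python
from __future__ import annotations

from typing import Any


def resolve_channel_names(img_type: str, md: dict[str, Any]) -> list[str]:
    """Pick channel names using the same precedence as the diff.

    1. Non-empty ``channel_names`` from the image metadata.
    2. ``["Labels"]`` for label images.
    3. Generated ``"Channel <i>"`` names, one per channel in ``md["C"]``.
    """
    names = md.get("channel_names")
    if names:
        return list(names)
    if img_type == "label":
        return ["Labels"]
    return [f"Channel {i}" for i in range(int(md["C"]))]


if __name__ == "__main__":
    print(resolve_channel_names("raw", {"channel_names": ["DAPI", "GFP"], "C": 2}))
    print(resolve_channel_names("label", {"channel_names": []}))
    print(resolve_channel_names("raw", {"C": "3"}))
```

Compared with the removed code, the image type now comes from the `images` dictionary key rather than from a `_label` filename suffix, and raw images without channel names get generated names instead of an empty list.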

bin/consolidate_md.py

Lines changed: 2 additions & 2 deletions
@@ -11,7 +11,7 @@
 from pathlib import Path
 
 
-def main(file_in: str) -> None:
+def consolidate(file_in: str) -> None:
     """Function to consolidate the metadata of a Zarr file
 
     Args:
@@ -26,4 +26,4 @@ def main(file_in: str) -> None:
 
 
 if __name__ == "__main__":
-    fire.Fire(main)
+    fire.Fire(consolidate)

bin/generate_image.py

Lines changed: 57 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,57 @@
1+
#!/usr/bin/env python3
2+
"""
3+
generate_image.py
4+
====================================
5+
Generates raw/label images from spatial data
6+
"""
7+
8+
from __future__ import annotations
9+
import fire
10+
import typing as T
11+
import tifffile as tf
12+
from process_spaceranger import visium_label
13+
from process_xenium import xenium_label
14+
from process_merscope import merscope_label, merscope_raw
15+
16+
17+
def create_img(
18+
stem: str,
19+
img_type: str,
20+
file_type: str,
21+
file_path: str,
22+
ref_img: str = None,
23+
args: dict[str, T.Any] = {},
24+
) -> None:
25+
"""This function calls the corresponding function
26+
to write a label image given the metadata provided.
27+
It also obtains the image shape of a reference image if specified.
28+
29+
Args:
30+
stem (str): Prefix for the output image filename.
31+
file_type (str): Type of file containing the metadata from which to
32+
generate the label image.
33+
file_path (str): Path to the metadata file.
34+
ref_img (str, optional): Path to reference image from which to get the
35+
shape for the label image. Defaults to None.
36+
args (dict[str,T.Any], optional): Args to be passed to the appropriate processing function.
37+
Defaults to {}.
38+
"""
39+
40+
if ref_img:
41+
tif_img = tf.TiffFile(ref_img)
42+
args["shape"] = tif_img.pages[0].shape[:2]
43+
44+
if img_type == "label":
45+
if file_type == "visium":
46+
visium_label(stem, file_path, **args)
47+
elif file_type == "merscope":
48+
merscope_label(stem, file_path, **args)
49+
elif file_type == "xenium":
50+
xenium_label(stem, file_path, **args)
51+
elif img_type == "raw":
52+
if file_type == "merscope":
53+
merscope_raw(stem, file_path, **args)
54+
55+
56+
if __name__ == "__main__":
57+
fire.Fire(create_img)
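
Review note: as committed, `create_img` returns silently when the `img_type`/`file_type` pair matches no branch, and it writes `shape` into the shared default `args` dictionary. One possible alternative is a lookup table keyed on the pair, sketched below. This is not the committed behaviour. The handlers here are stand-in stubs rather than the real `visium_label`, `merscope_label`, `xenium_label` and `merscope_raw`, and `dispatch`, `HANDLERS`, `_stub` and `calls` are hypothetical names.

```python
from __future__ import annotations

from typing import Any, Callable

# Records which stub was called, so the example is observable without real data.
calls: list[tuple[str, str, str]] = []


def _stub(name: str) -> Callable[..., None]:
    """Build a stand-in for one of the real processing functions."""

    def handler(stem: str, file_path: str, **kwargs: Any) -> None:
        calls.append((name, stem, file_path))

    return handler


# (img_type, file_type) -> handler; same four combinations as in the diff.
HANDLERS: dict[tuple[str, str], Callable[..., None]] = {
    ("label", "visium"): _stub("visium_label"),
    ("label", "merscope"): _stub("merscope_label"),
    ("label", "xenium"): _stub("xenium_label"),
    ("raw", "merscope"): _stub("merscope_raw"),
}


def dispatch(
    stem: str,
    img_type: str,
    file_type: str,
    file_path: str,
    args: dict[str, Any] | None = None,
) -> None:
    """Call the handler for the given pair, or fail loudly if none exists."""
    handler = HANDLERS.get((img_type, file_type))
    if handler is None:
        raise ValueError(f"Unsupported combination: {img_type!r} / {file_type!r}")
    # Copy so a caller's dict (or a shared default) is never mutated.
    handler(stem, file_path, **dict(args or {}))


if __name__ == "__main__":
    dispatch("sample", "raw", "merscope", "/path/to/merscope")
    print(calls)
```

Whether raising on an unsupported pair is preferable to the current silent no-op depends on how the Nextflow process invokes the script, which this diff does not show.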
