Commit ac8ad74

Quality metric example notebook (#449)
* Add notebook. Upgrade js pkg version. Add support for obsFeatureColumns in anndata wrapper
* Bump version
* Lint
1 parent 8fbb0f9 commit ac8ad74

5 files changed (+271, -5 lines changed)
Lines changed: 251 additions & 0 deletions
@@ -0,0 +1,251 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "nbsphinx": "hidden"
+   },
+   "source": [
+    "# Vitessce Widget Tutorial"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Visualization of single-cell RNA seq data"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 1. Import dependencies\n",
+    "\n",
+    "We need to import the classes and functions that we will be using from the corresponding packages."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import os\n",
+    "from os.path import join, isfile, isdir\n",
+    "from urllib.request import urlretrieve\n",
+    "from anndata import read_h5ad\n",
+    "import scanpy as sc\n",
+    "\n",
+    "from vitessce import (\n",
+    "    VitessceConfig,\n",
+    "    Component as cm,\n",
+    "    CoordinationType as ct,\n",
+    "    AnnDataWrapper,\n",
+    ")\n",
+    "from vitessce.data_utils import (\n",
+    "    optimize_adata,\n",
+    "    VAR_CHUNK_SIZE,\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 2. Download the data\n",
+    "\n",
+    "For this example, we need to download a dataset from the COVID-19 Cell Atlas https://www.covid19cellatlas.org/index.healthy.html#habib17."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "adata_filepath = join(\"data\", \"habib17.processed.h5ad\")\n",
+    "if not isfile(adata_filepath):\n",
+    "    os.makedirs(\"data\", exist_ok=True)\n",
+    "    urlretrieve('https://covid19.cog.sanger.ac.uk/habib17.processed.h5ad', adata_filepath)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 3. Load the data\n",
+    "\n",
+    "Note: this function may print a `FutureWarning`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "adata = read_h5ad(adata_filepath)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "## 3.1. Preprocess the Data For Visualization\n",
+    "\n",
+    "This dataset contains 25,587 genes. We prepare to visualize the top 50 highly variable genes for the heatmap as ranked by dispersion norm, although one may use any boolean array filter for the heatmap."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "top_dispersion = adata.var[\"dispersions_norm\"][\n",
+    "    sorted(\n",
+    "        range(len(adata.var[\"dispersions_norm\"])),\n",
+    "        key=lambda k: adata.var[\"dispersions_norm\"][k],\n",
+    "    )[-51:][0]\n",
+    "]\n",
+    "adata.var[\"top_highly_variable\"] = (\n",
+    "    adata.var[\"dispersions_norm\"] > top_dispersion\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 3.2 Save the Data to Zarr store\n",
+    "\n",
+    "We want to convert the original `h5ad` file to a [Zarr](https://zarr.readthedocs.io/en/stable/) store, which Vitessce is able to load. We can use the `optimize_adata` function to ensure that all arrays and dataframe columns that we intend to use in our visualization are in the optimal format to be loaded by Vitessce. This function will cast arrays to numerical data types that take up less space (as long as the values allow). Note: unused arrays and columns (i.e., not specified in any of the parameters to `optimize_adata`) will not be copied into the new AnnData object."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "zarr_filepath = join(\"data\", \"habib17.h5ad.zarr\")\n",
+    "if not isdir(zarr_filepath):\n",
+    "    adata.write_zarr(zarr_filepath, chunks=[adata.shape[0], VAR_CHUNK_SIZE])"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "vc = VitessceConfig(\n",
+    "    schema_version=\"1.0.17\",\n",
+    "    name='Habib et al',\n",
+    "    description='COVID-19 Healthy Donor Brain'\n",
+    ")\n",
+    "\n",
+    "# Add data.\n",
+    "dataset = vc.add_dataset(name='Brain').add_object(AnnDataWrapper(\n",
+    "    adata_path=zarr_filepath,\n",
+    "    obs_embedding_paths=[\"obsm/X_umap\"],\n",
+    "    obs_embedding_names=[\"UMAP\"],\n",
+    "    obs_set_paths=[\"obs/CellType\"],\n",
+    "    obs_set_names=[\"Cell Type\"],\n",
+    "    obs_feature_matrix_path=\"X\",\n",
+    "    initial_feature_filter_path=\"var/top_highly_variable\",\n",
+    "    coordination_values={\n",
+    "        \"obsType\": 'cell',\n",
+    "        \"featureType\": 'gene',\n",
+    "        \"featureValueType\": 'expression',\n",
+    "    },\n",
+    ")).add_object(AnnDataWrapper(\n",
+    "    adata_path=zarr_filepath,\n",
+    "    obs_feature_column_paths=[\"obs/percent_mito\"],\n",
+    "    coordination_values={\n",
+    "        \"obsType\": 'cell',\n",
+    "        \"featureType\": 'qualityMetric',\n",
+    "        \"featureValueType\": 'value',\n",
+    "    }\n",
+    "))\n",
+    "\n",
+    "# Add views.\n",
+    "scatterplot = vc.add_view(cm.SCATTERPLOT, dataset=dataset, mapping=\"UMAP\")\n",
+    "scatterplot_2 = vc.add_view(cm.SCATTERPLOT, dataset=dataset, mapping=\"UMAP\")\n",
+    "cell_sets = vc.add_view(cm.OBS_SETS, dataset=dataset)\n",
+    "genes = vc.add_view(cm.FEATURE_LIST, dataset=dataset)\n",
+    "histogram = vc.add_view(cm.FEATURE_VALUE_HISTOGRAM, dataset=dataset)\n",
+    "\n",
+    "# Link views.\n",
+    "\n",
+    "# Color one of the two scatterplots by the percent_mito quality metric.\n",
+    "# Also use this quality metric for the histogram values.\n",
+    "vc.link_views_by_dict([histogram, scatterplot_2], {\n",
+    "    \"obsType\": 'cell',\n",
+    "    \"featureType\": 'qualityMetric',\n",
+    "    \"featureValueType\": 'value',\n",
+    "    \"featureSelection\": [\"percent_mito\"],\n",
+    "    \"obsColorEncoding\": \"geneSelection\",\n",
+    "}, meta=False)\n",
+    "\n",
+    "# Synchronize the zooming and panning of the two scatterplots\n",
+    "vc.link_views_by_dict([scatterplot, scatterplot_2], {\n",
+    "    \"embeddingZoom\": None,\n",
+    "    \"embeddingTargetX\": None,\n",
+    "    \"embeddingTargetY\": None,\n",
+    "}, meta=False)\n",
+    "\n",
+    "# Define the layout.\n",
+    "vc.layout((scatterplot | (cell_sets / genes)) / (scatterplot_2 | histogram));"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 5. Create the widget\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "vw = vc.widget()\n",
+    "vw"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.14"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
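
Review note on the notebook's section 3.1: the preprocessing cell looks up the 51st-largest `dispersions_norm` value and keeps the genes strictly above it. The same selection could probably be written more directly. The following is a minimal sketch, assuming pandas is available and the index labels are unique. The helper `top_n_mask` and the toy series are hypothetical and not part of this commit.

```python
import pandas as pd


def top_n_mask(values: pd.Series, n: int = 50) -> pd.Series:
    """Return a boolean Series marking the n largest non-NaN values."""
    # nlargest skips NaN and returns exactly n rows when enough values exist.
    top_index = values.nlargest(n).index
    return pd.Series(values.index.isin(top_index), index=values.index)


# Toy stand-in for adata.var["dispersions_norm"] (hypothetical values).
dispersions = pd.Series(
    [float(i) for i in range(100)],
    index=[f"gene_{i}" for i in range(100)],
)
mask = top_n_mask(dispersions, n=50)
print(int(mask.sum()))
```

In the notebook this would correspond to something like `adata.var["top_highly_variable"] = top_n_mask(adata.var["dispersions_norm"])`. Unlike the strict `>` threshold, this rank-based form still returns exactly `n` genes when several genes tie at the cutoff value.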

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 
 [project]
 name = "vitessce"
-version = "3.6.2"
+version = "3.6.3"
 authors = [
   { name="Mark Keller", email="[email protected]" },
 ]

src/vitessce/file_def_utils.py

Lines changed: 11 additions & 0 deletions
@@ -101,6 +101,17 @@ def gen_obs_labels_schema(options: dict, paths: Optional[list[str]] = None, name
     return options
 
 
+def gen_obs_feature_columns_schema(options: dict, obs_feature_column_paths: Optional[list[str]] = None):
+    if obs_feature_column_paths is not None:
+        options["obsFeatureColumns"] = [
+            {
+                "path": col_path
+            }
+            for col_path in obs_feature_column_paths
+        ]
+    return options
+
+
 def gen_path_schema(key: str, path: Optional[str], options: dict):
     if path is not None:
         options[key] = {
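
Review note: to illustrate what the new helper contributes to a file definition, the sketch below repeats `gen_obs_feature_columns_schema` as it appears in the hunk above and calls it standalone. The sample path `obs/percent_mito` is the one used in this commit's notebook. Running the helper outside of `AnnDataWrapper` is for illustration only.

```python
from typing import Optional


def gen_obs_feature_columns_schema(options: dict, obs_feature_column_paths: Optional[list[str]] = None):
    # Copied from the diff: add one {"path": ...} entry per column path.
    if obs_feature_column_paths is not None:
        options["obsFeatureColumns"] = [
            {
                "path": col_path
            }
            for col_path in obs_feature_column_paths
        ]
    return options


# With a list of paths, the options dict gains an "obsFeatureColumns" key.
options = gen_obs_feature_columns_schema({}, ["obs/percent_mito"])
print(options)  # {'obsFeatureColumns': [{'path': 'obs/percent_mito'}]}

# With the default of None, the options dict is returned unchanged.
unchanged = gen_obs_feature_columns_schema({"existing": 1})
print(unchanged)  # {'existing': 1}
```

Note that the helper mutates and returns the same `options` dict, which matches how the other `gen_*_schema` helpers are chained in `wrappers.py`.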

src/vitessce/widget.py

Lines changed: 3 additions & 3 deletions
@@ -601,7 +601,7 @@ class VitessceWidget(anywidget.AnyWidget):
 
     next_port = DEFAULT_PORT
 
-    js_package_version = Unicode('3.6.3').tag(sync=True)
+    js_package_version = Unicode('3.6.4').tag(sync=True)
     js_dev_mode = Bool(False).tag(sync=True)
     custom_js_url = Unicode('').tag(sync=True)
     plugin_esm = List(trait=Unicode(''), default_value=[]).tag(sync=True)
@@ -614,7 +614,7 @@ class VitessceWidget(anywidget.AnyWidget):
 
     store_urls = List(trait=Unicode(''), default_value=[]).tag(sync=True)
 
-    def __init__(self, config, height=600, theme='auto', uid=None, port=None, proxy=False, js_package_version='3.6.3', js_dev_mode=False, custom_js_url='', plugins=None, remount_on_uid_change=True, prefer_local=True, invoke_timeout=300000, invoke_batched=True, page_mode=False, page_esm=None, prevent_scroll=True):
+    def __init__(self, config, height=600, theme='auto', uid=None, port=None, proxy=False, js_package_version='3.6.4', js_dev_mode=False, custom_js_url='', plugins=None, remount_on_uid_change=True, prefer_local=True, invoke_timeout=300000, invoke_batched=True, page_mode=False, page_esm=None, prevent_scroll=True):
         """
         Construct a new Vitessce widget.
 
@@ -750,7 +750,7 @@ def _plugin_command(self, params, buffers):
 # Launch Vitessce using plain HTML representation (no ipywidgets)
 
 
-def ipython_display(config, height=600, theme='auto', base_url=None, host_name=None, uid=None, port=None, proxy=False, js_package_version='3.6.3', js_dev_mode=False, custom_js_url='', plugins=None, remount_on_uid_change=True, page_mode=False, page_esm=None):
+def ipython_display(config, height=600, theme='auto', base_url=None, host_name=None, uid=None, port=None, proxy=False, js_package_version='3.6.4', js_dev_mode=False, custom_js_url='', plugins=None, remount_on_uid_change=True, page_mode=False, page_esm=None):
     from IPython.display import display, HTML
     uid_str = "vitessce" + get_uid_str(uid)
 

src/vitessce/wrappers.py

Lines changed: 5 additions & 1 deletion
@@ -33,6 +33,7 @@
     gen_sdata_obs_spots_schema,
     gen_sdata_obs_sets_schema,
     gen_sdata_obs_feature_matrix_schema,
+    gen_obs_feature_columns_schema,
 )
 
 from .constants import (
@@ -1192,7 +1193,7 @@ def raise_error_if_more_than_one(inputs):
 
 
 class AnnDataWrapper(AbstractWrapper):
-    def __init__(self, adata_path=None, adata_url=None, adata_store=None, adata_artifact=None, ref_path=None, ref_url=None, ref_artifact=None, obs_feature_matrix_path=None, feature_filter_path=None, initial_feature_filter_path=None, obs_set_paths=None, obs_set_names=None, obs_locations_path=None, obs_segmentations_path=None, obs_embedding_paths=None, obs_embedding_names=None, obs_embedding_dims=None, obs_spots_path=None, obs_points_path=None, feature_labels_path=None, obs_labels_path=None, convert_to_dense=True, coordination_values=None, obs_labels_paths=None, obs_labels_names=None, is_zip=None, **kwargs):
+    def __init__(self, adata_path=None, adata_url=None, adata_store=None, adata_artifact=None, ref_path=None, ref_url=None, ref_artifact=None, obs_feature_matrix_path=None, obs_feature_column_paths=None, feature_filter_path=None, initial_feature_filter_path=None, obs_set_paths=None, obs_set_names=None, obs_locations_path=None, obs_segmentations_path=None, obs_embedding_paths=None, obs_embedding_names=None, obs_embedding_dims=None, obs_spots_path=None, obs_points_path=None, feature_labels_path=None, obs_labels_path=None, convert_to_dense=True, coordination_values=None, obs_labels_paths=None, obs_labels_names=None, is_zip=None, **kwargs):
         """
         Wrap an AnnData object by creating an instance of the ``AnnDataWrapper`` class.
 
@@ -1218,6 +1219,7 @@ def __init__(self, adata_path=None, adata_url=None, adata_store=None, adata_arti
         :param str obs_labels_path: (DEPRECATED) The name of a column containing observation labels (e.g., alternate cell IDs), instead of the default index in `obs` of the AnnData store. Use `obs_labels_paths` and `obs_labels_names` instead. This arg will be removed in a future release.
         :param list[str] obs_labels_paths: The names of columns containing observation labels (e.g., alternate cell IDs), instead of the default index in `obs` of the AnnData store.
         :param list[str] obs_labels_names: The optional display names of columns containing observation labels (e.g., alternate cell IDs), instead of the default index in `obs` of the AnnData store.
+        :param list[str] obs_feature_column_paths: The paths to columns (typically in `obs`) that contain numerical values per observation (e.g., cell size, quality control metrics, etc.) which are not part of the main expression matrix.
         :param bool convert_to_dense: Whether or not to convert `X` to dense the zarr store (dense is faster but takes more disk space).
         :param coordination_values: Coordination values for the file definition.
         :param is_zip: Boolean indicating whether the Zarr store is in a zipped format.
@@ -1289,6 +1291,7 @@ def __init__(self, adata_path=None, adata_url=None, adata_store=None, adata_arti
         self._spatial_spots_obsm = obs_spots_path
         self._spatial_points_obsm = obs_points_path
         self._feature_labels = feature_labels_path
+        self._obs_feature_column_paths = obs_feature_column_paths
         # Support legacy provision of single obs labels path
         if (obs_labels_path is not None):
             warnings.warn("`obs_labels_path` will be deprecated in a future release.", DeprecationWarning)
@@ -1357,6 +1360,7 @@ def get_anndata_zarr(base_url):
             options = gen_obs_feature_matrix_schema(options, self._expression_matrix, self._gene_var_filter, self._matrix_gene_var_filter)
             options = gen_feature_labels_schema(self._feature_labels, options)
             options = gen_obs_labels_schema(options, self._obs_labels_elems, self._obs_labels_names)
+            options = gen_obs_feature_columns_schema(options, self._obs_feature_column_paths)
 
             if len(options.keys()) > 0:
                 if self.is_h5ad:
