Commit 23e8039

Merge pull request #23 from quadbio/feat/external
Add niche identification tutorial
2 parents 8a708f6 + a8303fc commit 23e8039

7 files changed: +1552 -124 lines changed

CHANGELOG.md

Lines changed: 13 additions & 3 deletions
@@ -10,14 +10,24 @@ and this project adheres to [Semantic Versioning][].

 ## [Unreleased]

+## [v0.1.3]
+
+### Added
+- Added a tutorial on spatial contextualization and niche identification {pr}`23`.
+- Implemented a self-mapping mode with only a query dataset {pr}`21`.
+- Allow importing a pre-computed dataset of transferred expression values {pr}`21`.
+- Allow importing pre-computed neighborhood matrices {pr}`21`.
+- Add a tutorial on spatial contextualization and niche identification {pr}`21`.
+- Add an equal-weight kernel {pr}`22`.
+
 ## [v0.1.2]

 ### Added
 - Included tests for the `check` module, and more tests for the main classes {pr}`15`.
 - Implemented the computation of presence scores, following HNOCA-tools {pr}`16`.
-- Add a `groupby` parameter to expression transfer evaluation {pr}`16`.
-- Add a `test_var_key` parameter to expression transfer evaluation {pr}`19`.
-- Add a tutorial on spatial mapping {pr}`19`.
+- Added a `groupby` parameter to expression transfer evaluation {pr}`16`.
+- Added a `test_var_key` parameter to expression transfer evaluation {pr}`19`.
+- Added a tutorial on spatial mapping {pr}`19`.

 ## [v0.1.1]

README.md

Lines changed: 18 additions & 4 deletions
@@ -16,6 +16,16 @@ k-NN-based mapping of cells across representations to transfer labels, embeddings

 Inspired by scanpy's [ingest][] and the [HNOCA-tools][] packages. Check out the [docs][] to learn more, in particular our [tutorials][].

+## Key use cases
+
+- Transfer cell type labels and expression values from dissociated to spatial datasets.
+- Transfer embeddings between arbitrary query and reference datasets.
+- Compute presence scores for query datasets in large reference atlases.
+- Identify niches in spatial datasets by contextualizing latent spaces in spatial coordinates.
+- Evaluate the results of transferring labels, embeddings and feature spaces using a variety of metrics.
+
+The core idea of `CellMapper` is to separate the method (k-NN graph with some kernel applied to get a mapping matrix) from the application (mapping across arbitrary representations), to be flexible and fast. The tool currently supports [pynndescent][], [sklearn][], [faiss][] and [rapids][] for neighborhood search, implements a variety of graph kernels, and is closely integrated with `AnnData` objects.
+
 ## Installation

 You need to have Python 3.10 or newer installed on your system.
@@ -37,20 +47,20 @@ There are two alternative options to install ``cellmapper``:

 ## Getting started

-This package assumes that you have ``ref`` and ``query`` AnnData objects, with a joint embedding computed and stored in ``.obsm``. We explicitly do not compute this joint embedding, but there are plenty of methods you can use to get such joint embeddings, e.g. [GimVI][] or [ENVI][] for spatial mapping, [GLUE][], [MIDAS][] and [MOFA+][] for modality translation, and [scVI][], [scANVI][] and [scArches][] for query-to-reference mapping - this is just a small selection!
+This package assumes that you have ``query`` and ``reference`` AnnData objects, with a joint embedding computed and stored in ``.obsm``. We explicitly do not compute this joint embedding, but there are plenty of methods you can use to get such joint embeddings, e.g. [GimVI][] or [ENVI][] for spatial mapping, [GLUE][], [MIDAS][] and [MOFA+][] for modality translation, and [scVI][], [scANVI][] and [scArches][] for query-to-reference mapping - this is just a small selection!

 With a joint embedding in ``.obsm["X_joint"]`` at hand, the simplest way to use ``CellMapper`` is as follows:
 ```Python
 from cellmapper import CellMapper

-cmap = CellMapper(ref, query).fit(
+cmap = CellMapper(query, reference).fit(
     use_rep="X_joint", obs_keys="celltype", obsm_keys="X_umap", layer_key="X"
 )
 ```

-This will transfer data from the reference to the query dataset, including celltype labels stored in ``ref.obs``, a UMAP embedding stored in ``ref.obsm``, and expression values stored in ``ref.X``.
+This will transfer data from the reference to the query dataset, including celltype labels stored in ``reference.obs``, a UMAP embedding stored in ``reference.obsm``, and expression values stored in ``reference.X``.

-There are many ways to customize this, e.g. use different ways to compute k-NN graphs and to turn them into mapping matrices, and we implement a few methods to evaluate whether your k-NN transfer was successful. Check out the [docs][] to learn more.
+There are many ways to customize this, e.g. use different ways to compute k-NN graphs and to turn them into mapping matrices, and we implement a few methods to evaluate whether your k-NN transfer was successful. The tool also implements a `self-mapping` mode (only a query object, no reference), which is useful for spatial contextualization. Check out the [docs][] to learn more.

 ## Release notes

@@ -74,7 +84,11 @@ Please cite this GitHub repo if you find CellMapper useful for your research.
 [coverage]: https://codecov.io/gh/quadbio/cellmapper
 [pre-commit]: https://results.pre-commit.ci/latest/github/quadbio/cellmapper/main
 [pypi]: https://pypi.org/project/cellmapper/
+
 [faiss]: https://github.com/facebookresearch/faiss
+[pynndescent]: https://github.com/lmcinnes/pynndescent
+[sklearn]: https://scikit-learn.org/stable/modules/neighbors.html
+[rapids]: https://docs.rapids.ai/api/cuml/stable/api/#nearest-neighbors

 [ingest]: https://scanpy.readthedocs.io/en/stable/generated/scanpy.tl.ingest.html
 [HNOCA-tools]: https://devsystemslab.github.io/HNOCA-tools/
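
As a rough illustration of the README's core idea (a k-NN graph turned into a mapping matrix by a kernel), the following sketch builds a row-normalized mapping matrix with scikit-learn and uses it to transfer labels. It does not use `cellmapper` itself: the helper names `knn_mapping_matrix` and `transfer_labels`, the toy data, and the reading of an equal-weight kernel as a weight of 1/k per neighbor are assumptions made for this example, and the package's own implementation may differ.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.neighbors import NearestNeighbors


def knn_mapping_matrix(ref_emb, query_emb, n_neighbors=3):
    """Build an (n_query x n_ref) mapping matrix with equal neighbor weights.

    Hypothetical helper: each query cell spreads a total weight of 1
    evenly over its n_neighbors nearest reference cells.
    """
    nn = NearestNeighbors(n_neighbors=n_neighbors).fit(ref_emb)
    _, idx = nn.kneighbors(query_emb)  # idx: (n_query, n_neighbors)
    n_query, n_ref = query_emb.shape[0], ref_emb.shape[0]
    rows = np.repeat(np.arange(n_query), n_neighbors)
    weights = np.full(rows.shape[0], 1.0 / n_neighbors)
    return csr_matrix((weights, (rows, idx.ravel())), shape=(n_query, n_ref))


def transfer_labels(mapping, ref_labels):
    """Return the label with the largest mapped weight per query cell, and that weight."""
    categories, codes = np.unique(ref_labels, return_inverse=True)
    one_hot = np.eye(len(categories))[codes]  # (n_ref, n_categories)
    probs = mapping @ one_hot  # (n_query, n_categories)
    return categories[probs.argmax(axis=1)], probs.max(axis=1)


# Toy joint embedding: two well-separated reference clusters, labels "A" and "B".
rng = np.random.default_rng(0)
ref_emb = np.vstack([rng.normal(0, 0.1, (20, 2)), rng.normal(5, 0.1, (20, 2))])
ref_labels = np.array(["A"] * 20 + ["B"] * 20)
query_emb = np.array([[0.0, 0.0], [5.0, 5.0]])

mapping = knn_mapping_matrix(ref_emb, query_emb)
pred, conf = transfer_labels(mapping, ref_labels)
print(pred, conf)
```

The same matrix could be multiplied with an embedding or expression matrix to transfer continuous values, which is one way to read the README's separation of the method from the application.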

docs/notebooks/tutorials/spatial_mapping.ipynb

Lines changed: 137 additions & 114 deletions
Large diffs are not rendered by default.

docs/notebooks/tutorials/spatial_smoothing.ipynb

Lines changed: 1212 additions & 0 deletions
Large diffs are not rendered by default.

docs/references.bib

Lines changed: 104 additions & 0 deletions
@@ -128,3 +128,107 @@ @article{pijuan2019single
   publisher={Nature Publishing Group UK London},
   url={https://www.nature.com/articles/s41586-019-0933-9},
 }
+
+@article{varrone2024cellcharter,
+  title={CellCharter reveals spatial cell niches associated with tissue remodeling and cell plasticity},
+  author={Varrone, Marco and Tavernari, Daniele and Santamaria-Mart{\'\i}nez, Albert and Walsh, Logan A and Ciriello, Giovanni},
+  journal={Nature genetics},
+  volume={56},
+  number={1},
+  pages={74--84},
+  year={2024},
+  publisher={Nature Publishing Group US New York},
+  url={https://www.nature.com/articles/s41588-023-01588-4},
+}
+
+@article{kim2022unsupervised,
+  title={Unsupervised discovery of tissue architecture in multiplexed imaging},
+  author={Kim, Junbum and Rustam, Samir and Mosquera, Juan Miguel and Randell, Scott H and Shaykhiev, Renat and Rendeiro, Andr{\'e} F and Elemento, Olivier},
+  journal={Nature methods},
+  volume={19},
+  number={12},
+  pages={1653--1661},
+  year={2022},
+  publisher={Nature Publishing Group US New York},
+  url={https://www.nature.com/articles/s41592-022-01657-2},
+}
+
+@article{blampey2024sopa,
+  title={Sopa: a technology-invariant pipeline for analyses of image-based spatial omics},
+  author={Blampey, Quentin and Mulder, Kevin and Gardet, Margaux and Christodoulidis, Stergios and Dutertre, Charles-Antoine and Andr{\'e}, Fabrice and Ginhoux, Florent and Courn{\`e}de, Paul-Henry},
+  journal={Nature Communications},
+  volume={15},
+  number={1},
+  pages={4981},
+  year={2024},
+  publisher={Nature Publishing Group UK London},
+  url={https://www.nature.com/articles/s41467-024-48981-z},
+}
+
+@article{birk2025quantitative,
+  title={Quantitative characterization of cell niches in spatially resolved omics data},
+  author={Birk, Sebastian and Bonafonte-Pard{\`a}s, Irene and Feriz, Adib Miraki and Boxall, Adam and Agirre, Eneritz and Memi, Fani and Maguza, Anna and Yadav, Anamika and Armingol, Erick and Fan, Rong and others},
+  journal={Nature Genetics},
+  pages={1--13},
+  year={2025},
+  publisher={Nature Publishing Group US New York},
+  url={https://www.nature.com/articles/s41588-025-02120-6},
+}
+
+@article{xu2024unsupervised,
+  title={Unsupervised spatially embedded deep representation of spatial transcriptomics},
+  author={Xu, Hang and Fu, Huazhu and Long, Yahui and Ang, Kok Siong and Sethi, Raman and Chong, Kelvin and Li, Mengwei and Uddamvathanak, Rom and Lee, Hong Kai and Ling, Jingjing and others},
+  journal={Genome Medicine},
+  volume={16},
+  number={1},
+  pages={12},
+  year={2024},
+  publisher={Springer},
+  url={https://link.springer.com/article/10.1186/s13073-024-01283-x},
+}
+
+@article{zhao2021spatial,
+  title={Spatial transcriptomics at subspot resolution with BayesSpace},
+  author={Zhao, Edward and Stone, Matthew R and Ren, Xing and Guenthoer, Jamie and Smythe, Kimberly S and Pulliam, Thomas and Williams, Stephen R and Uytingco, Cedric R and Taylor, Sarah EB and Nghiem, Paul and others},
+  journal={Nature biotechnology},
+  volume={39},
+  number={11},
+  pages={1375--1384},
+  year={2021},
+  publisher={Nature Publishing Group US New York},
+  url={https://www.nature.com/articles/s41587-021-00935-2},
+}
+
+@inproceedings{li2024stargate,
+  title={STARGATE: Spatial Transcriptomic Analysis with Recurrent and Graph Attention Techniques using Ensemble Learning},
+  author={Li, Ning and Badai, Jiayidaer and Chen, Dengjie and Xiao, Ming and Zhang, Le},
+  booktitle={2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)},
+  pages={5630--5637},
+  year={2024},
+  organization={IEEE},
+  url={https://ieeexplore.ieee.org/abstract/document/10822280},
+}
+
+@article{palla2022squidpy,
+  title={Squidpy: a scalable framework for spatial omics analysis},
+  author={Palla, Giovanni and Spitzer, Hannah and Klein, Michal and Fischer, David and Schaar, Anna Christina and Kuemmerle, Louis Benedikt and Rybakov, Sergei and Ibarra, Ignacio L and Holmberg, Olle and Virshup, Isaac and others},
+  journal={Nature methods},
+  volume={19},
+  number={2},
+  pages={171--178},
+  year={2022},
+  publisher={Nature Publishing Group US New York},
+  url={https://www.nature.com/articles/s41592-021-01358-2},
+}
+
+@article{lopez2018deep,
+  title={Deep generative modeling for single-cell transcriptomics},
+  author={Lopez, Romain and Regier, Jeffrey and Cole, Michael B and Jordan, Michael I and Yosef, Nir},
+  journal={Nature methods},
+  volume={15},
+  number={12},
+  pages={1053--1058},
+  year={2018},
+  publisher={Nature Publishing Group US New York},
+  url={https://www.nature.com/articles/s41592-018-0229-2},
+}

pyproject.toml

Lines changed: 5 additions & 0 deletions
@@ -61,8 +61,13 @@ optional-dependencies.test = [
   "squidpy",
 ]
 optional-dependencies.tutorials = [
+  "cellmapper",
   "harmony-pytorch",
+  "netgraph",
+  "python-louvain",
+  "scvi-tools",
   "seaborn",
+  "sopa",
   "squidpy",
 ]

src/cellmapper/evaluate.py

Lines changed: 63 additions & 3 deletions
@@ -69,19 +69,72 @@ def zscore(x):
 class CellMapperEvaluationMixin:
     """Mixin class for evaluation-related methods for CellMapper."""

+    def register_external_predictions(
+        self, label_key: str, prediction_postfix: str = "pred", confidence_postfix: str = "conf"
+    ) -> None:
+        """
+        Register externally computed predictions for evaluation.
+
+        Parameters
+        ----------
+        label_key
+            Base key in .obs for the label (e.g., 'cell_type').
+        prediction_postfix
+            Postfix for prediction column in .obs (e.g., 'pred').
+            The full column name should be f"{label_key}_{prediction_postfix}".
+        confidence_postfix
+            Postfix for confidence column in .obs (e.g., 'conf').
+            The full column name should be f"{label_key}_{confidence_postfix}".
+
+        Returns
+        -------
+        None
+
+        Notes
+        -----
+        Updates the following attributes:
+
+        - ``prediction_postfix``: Postfix for prediction column.
+        - ``confidence_postfix``: Postfix for confidence column.
+        """
+        # Verify that the expected columns exist
+        pred_col = f"{label_key}_{prediction_postfix}"
+        conf_col = f"{label_key}_{confidence_postfix}"
+
+        if pred_col not in self.query.obs.columns:
+            raise ValueError(f"Prediction column '{pred_col}' not found in query.obs")
+        if conf_col not in self.query.obs.columns:
+            raise ValueError(f"Confidence column '{conf_col}' not found in query.obs")
+
+        # Register the postfixes
+        self.prediction_postfix = prediction_postfix
+        self.confidence_postfix = confidence_postfix
+
+        logger.info(
+            "External predictions registered with prediction_postfix='%s' and confidence_postfix='%s'",
+            prediction_postfix,
+            confidence_postfix,
+        )
+
     def evaluate_label_transfer(
         self,
         label_key: str,
+        prediction_postfix: str | None = None,
+        confidence_postfix: str | None = None,
         confidence_cutoff: float = 0.0,
         zero_division: int | Literal["warn"] = 0,
     ) -> None:
         """
-        Evaluate label transfer using a k-NN classifier.
+        Evaluate label transfer using a k-NN classifier or externally computed predictions.

         Parameters
         ----------
         label_key
             Key in .obs storing ground-truth cell type annotations.
+        prediction_postfix
+            Postfix for prediction column in .obs. If None, uses self.prediction_postfix.
+        confidence_postfix
+            Postfix for confidence column in .obs. If None, uses self.confidence_postfix.
         confidence_cutoff
             Minimum confidence score required to include a cell in the evaluation.
         zero_division
@@ -97,8 +150,15 @@ def evaluate_label_transfer(

         - ``label_transfer_metrics``: Dictionary containing accuracy, precision, recall, F1 scores, and excluded fraction.
         """
-        if self.prediction_postfix is None or self.confidence_postfix is None:
-            raise ValueError("Label transfer has not been performed. Call transfer_labels() first.")
+        # Use provided postfixes if given, otherwise fall back to instance attributes
+        pred_postfix = prediction_postfix or self.prediction_postfix
+        conf_postfix = confidence_postfix or self.confidence_postfix
+
+        if pred_postfix is None or conf_postfix is None:
+            raise ValueError(
+                "Label transfer has not been performed. Either call transfer_labels() first "
+                "or provide prediction_postfix and confidence_postfix parameters."
+            )

         # Extract ground-truth and predicted labels
         y_true = self.query.obs[label_key].dropna()
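
The postfix handling in this diff can be illustrated without the package. The sketch below mirrors the fallback (`prediction_postfix or self.prediction_postfix`) and the column checks on a plain pandas DataFrame standing in for `query.obs`. The helpers `resolve_prediction_columns` and `accuracy_above_cutoff` and the toy data are hypothetical, and treating the cutoff as inclusive (`>=`) is an assumption, since the comparison itself is not shown in the hunk.

```python
import pandas as pd


def resolve_prediction_columns(
    obs, label_key, prediction_postfix=None, confidence_postfix=None, registered=(None, None)
):
    """Hypothetical stand-in for the postfix fallback and column checks in the diff.

    `registered` plays the role of the instance attributes
    (prediction_postfix, confidence_postfix).
    """
    pred_postfix = prediction_postfix or registered[0]
    conf_postfix = confidence_postfix or registered[1]
    if pred_postfix is None or conf_postfix is None:
        raise ValueError("No prediction/confidence postfix provided or registered.")
    pred_col = f"{label_key}_{pred_postfix}"
    conf_col = f"{label_key}_{conf_postfix}"
    for col in (pred_col, conf_col):
        if col not in obs.columns:
            raise ValueError(f"Column '{col}' not found in obs")
    return pred_col, conf_col


def accuracy_above_cutoff(obs, label_key, pred_col, conf_col, confidence_cutoff=0.0):
    """Accuracy over labelled cells whose confidence is >= the cutoff (assumed inclusive)."""
    labelled = obs.dropna(subset=[label_key])
    kept = labelled[labelled[conf_col] >= confidence_cutoff]
    excluded_fraction = 1 - len(kept) / len(labelled)
    accuracy = float((kept[label_key] == kept[pred_col]).mean())
    return accuracy, excluded_fraction


# Toy stand-in for query.obs with externally computed predictions.
obs = pd.DataFrame(
    {
        "cell_type": ["A", "B", "A", None],
        "cell_type_pred": ["A", "B", "B", "A"],
        "cell_type_conf": [0.9, 0.8, 0.4, 0.7],
    }
)

cols = resolve_prediction_columns(obs, "cell_type", registered=("pred", "conf"))
acc, excluded = accuracy_above_cutoff(obs, "cell_type", *cols, confidence_cutoff=0.5)
print(cols, acc, excluded)
```

Because the fallback uses `or`, an empty-string postfix is treated like `None` and falls through to the registered value.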
