Commit 6c461b6

Merge pull request #15 from quadbio/tests/extend
Extend tests & update the README
2 parents 422d688 + fb3f9e9 commit 6c461b6

8 files changed: +146 -22 lines changed

CHANGELOG.md

Lines changed: 3 additions & 0 deletions
@@ -10,6 +10,9 @@ and this project adheres to [Semantic Versioning][].

 ## [Unreleased]

+### Added
+- Included tests for the `check` module, and more tests for the main classes.
+
 ## [v0.1.1]

 ### Changed

LICENSE

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 MIT License

-Copyright (c) 2025, Marius Lange
+Copyright (c) 2025, QuaDBioLab

 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal

README.md

Lines changed: 40 additions & 15 deletions
@@ -10,34 +10,46 @@
 [badge-pre-commit]: https://results.pre-commit.ci/badge/github/quadbio/cellmapper/main.svg
 [badge-pypi]: https://img.shields.io/pypi/v/cellmapper.svg

-k-NN-based mapping of cells across representations to transfer labels, embeddings and expression values. Works for millions of cells, on CPU and GPU, across molecular modalities, between spatial and non-spatial data, for arbitrary query and reference datasets. Using `faiss` to compute k-NN graphs, CellMapper takes about 30 seconds to transfer cell type labels from 1.5M cells to 1.5M cells on a single RTX 4090 with 60 GB CPU memory.
+k-NN-based mapping of cells across representations to transfer labels, embeddings and expression values. Works for millions of cells, on CPU and GPU, across molecular modalities, between spatial and non-spatial data, for arbitrary query and reference datasets. Using [faiss][] to compute k-NN graphs, CellMapper takes about 30 seconds to transfer cell type labels from 1.5M cells to 1.5M cells on a single RTX 4090 with 60 GB CPU memory.

-## Getting started
-
-Please refer to the [documentation][],
-in particular, the [API documentation][].
+Inspired by scanpy's [ingest][] and the [HNOCA-tools][] packages.

 ## Installation

 You need to have Python 3.10 or newer installed on your system.
 If you don't have Python installed, we recommend installing [uv][].

-There are several alternative options to install cellmapper:
+There are two alternative options to install ``cellmapper``:

-<!--
-1) Install the latest release of `cellmapper` from [PyPI][]:
+- **Install the latest release from [PyPI][]**:

-```bash
-pip install cellmapper
-```
--->
+```bash
+pip install cellmapper
+```
+
+- **Install the latest development version**:
+
+```bash
+pip install git+https://github.com/quadbio/cellmapper.git@main
+```

-1. Install the latest development version:
+## Getting started
+
+This package assumes that you have ``ref`` and ``query`` AnnData objects, with a joint embedding computed and stored in ``.obsm``. We explicitly do not compute this joint embedding, but there are plenty of methods you can use to get such joint embeddings, e.g. [GimVI][] or [ENVI][] for spatial mapping, [GLUE][], [MIDAS][] and [MOFA+][] for modality translation, and [scVI][], [scANVI][] and [scArches][] for query-to-reference mapping - this is just a small selection!
+
+With a joint embedding in ``.obsm["X_joint"]`` at hand, the simplest way to use ``CellMapper`` is as follows:
+```Python
+from cellmapper import CellMapper

-```bash
-pip install git+https://github.com/quadbio/cellmapper.git@main
+cmap = CellMapper(ref, query).fit(
+    use_rep="X_joint", obs_keys="celltype", obsm_keys="X_umap", layer_key="X"
+)
 ```

+This will transfer data from the reference to the query dataset, including celltype labels stored in ``ref.obs``, a UMAP embedding stored in ``ref.obsm``, and expression values stored in ``ref.X``.
+
+There are many ways to customize this, e.g. use different ways to compute k-NN graphs and to turn them into mapping matrices, and we implement a few methods to evaluate whether your k-NN transfer was successful.
+
 ## Release notes

 See the [changelog][].
@@ -59,3 +71,16 @@ Please cite this GitHub repo if you find CellMapper useful for your research.
 [coverage]: https://codecov.io/gh/quadbio/cellmapper
 [pre-commit]: https://results.pre-commit.ci/latest/github/quadbio/cellmapper/main
 [pypi]: https://pypi.org/project/cellmapper/
+[faiss]: https://github.com/facebookresearch/faiss
+
+[ingest]: https://scanpy.readthedocs.io/en/stable/generated/scanpy.tl.ingest.html
+[HNOCA-tools]: https://devsystemslab.github.io/HNOCA-tools/
+
+[GimVI]: https://docs.scvi-tools.org/en/stable/api/reference/scvi.external.GIMVI.html#
+[ENVI]: https://scenvi.readthedocs.io/en/latest/#
+[GLUE]: https://scglue.readthedocs.io/en/latest/
+[MIDAS]: https://scmidas.readthedocs.io/en/latest/
+[MOFA+]: https://muon.readthedocs.io/en/latest/omics/multi.html
+[scVI]: https://docs.scvi-tools.org/en/stable/api/reference/scvi.model.SCVI.html
+[scANVI]: https://docs.scvi-tools.org/en/stable/api/reference/scvi.model.SCANVI.html
+[scArches]: https://docs.scarches.org/en/latest/

src/cellmapper/check.py

Lines changed: 2 additions & 0 deletions
@@ -79,4 +79,6 @@ def check_deps(*args) -> None:
         A list of dependencies to check
     """
     for item in args:
+        if item not in CHECKERS:
+            raise RuntimeError(f"Dependency '{item}' is not registered in CHECKERS.")
         CHECKERS[item].check()
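
Note on the guard added above: without it, an unregistered name would surface as a bare `KeyError` from the dictionary lookup, whereas the new branch reports it as a `RuntimeError` with a descriptive message. The following is a minimal, self-contained sketch of that registry pattern. The `Checker` class here is a hypothetical stand-in built on `importlib.util.find_spec`, and the `json` entry is illustrative only; neither is taken from the cellmapper source.

```python
import importlib.util


class Checker:
    """Hypothetical stand-in: verifies that a module can be found for import."""

    def __init__(self, name: str) -> None:
        self.name = name

    def check(self) -> None:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(self.name) is None:
            raise RuntimeError(f"Module '{self.name}' is not installed.")


# Registry mapping dependency names to checkers (illustrative entry only).
CHECKERS = {"json": Checker("json")}


def check_deps(*args: str) -> None:
    """Run the registered checker for each requested dependency."""
    for item in args:
        # Fail with a descriptive error instead of a bare KeyError.
        if item not in CHECKERS:
            raise RuntimeError(f"Dependency '{item}' is not registered in CHECKERS.")
        CHECKERS[item].check()
```

With this sketch, `check_deps("json")` returns silently, while `check_deps("not_a_real_module")` raises `RuntimeError` because the name was never registered.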

src/cellmapper/knn.py

Lines changed: 3 additions & 1 deletion
@@ -118,7 +118,9 @@ def knn_graph_connectivities(
             epsilon = kwargs.get("epsilon", 1e-8)
             connectivities = 1.0 / (self.distances + epsilon)
         else:
-            raise ValueError(f"Unknown kernel: {kernel}. Supported kernels are 'gaussian' and 'scarches'.")
+            raise ValueError(
+                f"Unknown kernel: {kernel}. Supported kernels are: 'gaussian', 'scarches', 'random', 'inverse_distance'."
+            )
         rowptr = np.arange(0, self.n_samples * self.n_neighbors + 1, self.n_neighbors)
         return csr_matrix((connectivities.ravel().astype(dtype), self.indices.ravel(), rowptr), shape=self.shape)

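
Note on the context lines of this hunk: because every sample has exactly `n_neighbors` neighbors, the CSR row pointer is an evenly spaced range and the dense `(n_samples, n_neighbors)` arrays can be flattened directly into the sparse matrix. The sketch below reproduces that packing step with the inverse-distance weighting shown in the hunk. The function name `knn_to_csr` and its signature are assumptions made for illustration, not part of `cellmapper.knn`.

```python
import numpy as np
from scipy.sparse import csr_matrix


def knn_to_csr(indices, distances, n_cols, epsilon=1e-8, dtype=np.float32):
    """Pack fixed-size k-NN arrays into a CSR matrix (hypothetical helper).

    indices, distances: arrays of shape (n_samples, n_neighbors).
    n_cols: number of columns, i.e. the size of the set the neighbors index into.
    """
    n_samples, n_neighbors = indices.shape
    # Inverse-distance weights; epsilon avoids division by zero.
    connectivities = 1.0 / (distances + epsilon)
    # Each row holds exactly n_neighbors entries, so row i spans
    # [i * n_neighbors, (i + 1) * n_neighbors) in the flattened arrays.
    rowptr = np.arange(0, n_samples * n_neighbors + 1, n_neighbors)
    return csr_matrix(
        (connectivities.ravel().astype(dtype), indices.ravel(), rowptr),
        shape=(n_samples, n_cols),
    )


indices = np.array([[0, 1], [1, 2], [2, 0]])
distances = np.ones((3, 2))
graph = knn_to_csr(indices, distances, n_cols=3)
```

Building the matrix from `(data, indices, indptr)` avoids constructing an intermediate COO representation, which matters when the graph covers millions of cells.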

tests/test_cellmapper.py

Lines changed: 47 additions & 0 deletions
@@ -1,6 +1,8 @@
 import numpy as np
 import pytest

+from cellmapper.cellmapper import CellMapper
+

 def assert_metrics_close(actual: dict, expected: dict, atol=1e-3):
     for key, exp in expected.items():
@@ -66,3 +68,48 @@ def test_compute_neighbors_joint_pca(self, cmap, joint_pca_key, n_pca_components
         assert joint_pca_key in cmap.query.obsm
         assert cmap.ref.obsm[joint_pca_key].shape[1] == n_pca_components
         assert cmap.query.obsm[joint_pca_key].shape[1] == n_pca_components
+
+    @pytest.mark.parametrize(
+        "obs_keys,obsm_keys,layer_key",
+        [
+            ("leiden", None, None),
+            (None, "X_pca", None),
+            (None, None, "X"),
+            ("leiden", "X_pca", None),
+            ("leiden", None, "X"),
+            (None, "X_pca", "X"),
+            ("leiden", "X_pca", "X"),
+        ],
+    )
+    def test_fit_various_combinations(self, cmap, obs_keys, obsm_keys, layer_key):
+        cmap.fit(obs_keys=obs_keys, obsm_keys=obsm_keys, layer_key=layer_key)
+        if obs_keys is not None:
+            keys = [obs_keys] if isinstance(obs_keys, str) else obs_keys
+            for key in keys:
+                assert f"{key}_pred" in cmap.query.obs
+        if obsm_keys is not None:
+            keys = [obsm_keys] if isinstance(obsm_keys, str) else obsm_keys
+            for key in keys:
+                assert f"{key}_pred" in cmap.query.obsm
+        if layer_key is not None:
+            assert cmap.query_imputed is not None
+            assert cmap.query_imputed.X.shape[0] == cmap.query.n_obs
+
+    def test_transfer_labels_self_mapping(self, query_ref_adata):
+        """Check mapping to self."""
+        _, ref = query_ref_adata
+        cm = CellMapper(ref, ref)
+        cm.fit(
+            knn_method="sklearn",
+            mapping_method="jaccard",
+            obs_keys="leiden",
+            use_rep="X_pca",
+            n_neighbors=1,
+            prediction_postfix="transfer",
+        )
+        assert "leiden_transfer" in ref.obs
+        assert len(ref.obs["leiden_transfer"]) == len(ref.obs["leiden"])
+        # Check that all predicted labels are valid categories
+        assert set(ref.obs["leiden_transfer"].cat.categories) <= set(ref.obs["leiden"].cat.categories)
+        # If mapping to self, labels should match
+        assert ref.obs["leiden_transfer"].equals(ref.obs["leiden"])
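
Note on the self-mapping test: when a dataset is mapped onto itself with a single neighbor, each cell's nearest neighbor is the cell itself (distance zero), so the transferred label should equal the original one. The sketch below illustrates that reasoning with a brute-force 1-NN lookup in NumPy. It is a simplified stand-in, not the `sklearn` plus `jaccard` path the test exercises, and `transfer_labels_1nn` is a hypothetical helper name.

```python
import numpy as np


def transfer_labels_1nn(ref_X, ref_labels, query_X):
    """Assign each query point the label of its nearest reference point."""
    # Pairwise squared Euclidean distances, shape (n_query, n_ref).
    diff = query_X[:, None, :] - ref_X[None, :, :]
    sq_dist = np.einsum("ijk,ijk->ij", diff, diff)
    # Index of the closest reference point for every query point.
    nearest = sq_dist.argmin(axis=1)
    return ref_labels[nearest]


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))          # synthetic embedding, distinct points
labels = rng.integers(0, 3, size=50)  # synthetic cluster labels

# Mapping the data onto itself: every point is its own nearest neighbor.
pred = transfer_labels_1nn(X, labels, X)
```

The identity only holds when points are distinct; exact duplicates with different labels could break it, which is worth keeping in mind if such a test is run on data with repeated embeddings.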

tests/test_check.py

Lines changed: 34 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,34 @@
1+
import packaging
2+
import pytest
3+
4+
from cellmapper import check
5+
from cellmapper.check import Checker, check_deps
6+
7+
8+
class TestCheck:
9+
def test_checker_available_module(self):
10+
# Should not raise for a real installable package
11+
Checker("packaging").check()
12+
13+
def test_checker_missing_module(self):
14+
# Should raise RuntimeError for a missing module
15+
with pytest.raises(RuntimeError):
16+
Checker("not_a_real_module").check()
17+
18+
def test_checker_version_requirement(self):
19+
# Should raise if vmin is higher than installed version
20+
installed_version = packaging.version.parse(packaging.__version__)
21+
higher_version = str(installed_version.major + 1) + ".0.0"
22+
with pytest.raises(RuntimeError):
23+
Checker("packaging", vmin=higher_version).check()
24+
25+
def test_check_deps_missing(self):
26+
# Should raise for a missing dependency (not registered in CHECKERS)
27+
with pytest.raises(RuntimeError):
28+
check_deps("not_a_real_module")
29+
30+
def test_check_deps_available(self):
31+
# Should not raise for a real installable package
32+
check.CHECKERS["packaging"] = Checker("packaging")
33+
check_deps("packaging")
34+
del check.CHECKERS["packaging"]
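
Note on `test_check_deps_available`: the test registers an entry in the shared `CHECKERS` dictionary and deletes it afterwards, so the entry would remain if `check_deps` raised before the `del`. One possible alternative, shown here only as a sketch, is to patch the dictionary for the duration of the test with the standard library's `unittest.mock.patch.dict`, which restores the original contents on exit even after an exception. The `CHECKERS` dictionary below is a local stand-in, not the real registry.

```python
from unittest import mock

# Local stand-in for the shared registry (assumption for illustration).
CHECKERS: dict[str, object] = {}


def registry_entry_is_temporary() -> bool:
    """Add an entry only for the duration of the with-block."""
    with mock.patch.dict(CHECKERS, {"packaging": object()}):
        present_inside = "packaging" in CHECKERS
    # patch.dict has restored the original (empty) mapping by now.
    return present_inside and "packaging" not in CHECKERS


def registry_restored_after_error() -> bool:
    """The entry is removed even if the body raises."""
    try:
        with mock.patch.dict(CHECKERS, {"packaging": object()}):
            raise ValueError("simulated failure inside the test body")
    except ValueError:
        pass
    return "packaging" not in CHECKERS
```

In a pytest suite, the `monkeypatch.setitem` fixture method offers the same cleanup guarantee.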

tests/test_neighbors.py

Lines changed: 16 additions & 5 deletions
@@ -1,4 +1,5 @@
 import numpy as np
+import pytest

 from cellmapper.knn import Neighbors

@@ -11,17 +12,27 @@ def assert_adjacency_equal(neigh1, neigh2, attrs=("xx", "yy", "xy", "yx")):


 class TestNeighbors:
-    def test_neighbors_sklearn_vs_pynndescent(self, small_data):
+    @pytest.mark.parametrize("only_yx", [False, True])
+    def test_neighbors_sklearn_vs_pynndescent(self, small_data, only_yx):
         x, y = small_data
         n_neighbors = 3
         # sklearn
         neigh_skl = Neighbors(x, y)
-        neigh_skl.compute_neighbors(n_neighbors=n_neighbors, method="sklearn")
+        neigh_skl.compute_neighbors(n_neighbors=n_neighbors, method="sklearn", only_yx=only_yx)
         # pynndescent
         neigh_pynn = Neighbors(x, y)
-        neigh_pynn.compute_neighbors(n_neighbors=n_neighbors, method="pynndescent")
-        # Compare adjacency matrices
-        assert_adjacency_equal(neigh_skl, neigh_pynn)
+        neigh_pynn.compute_neighbors(n_neighbors=n_neighbors, method="pynndescent", only_yx=only_yx)
+        if only_yx:
+            with pytest.raises(ValueError):
+                neigh_skl.get_adjacency_matrices()
+            with pytest.raises(ValueError):
+                neigh_pynn.get_adjacency_matrices()
+        else:
+            assert_adjacency_equal(neigh_skl, neigh_pynn)
+        # Always compare connectivities (yx)
+        conn_skl = neigh_skl.yx.knn_graph_connectivities()
+        conn_pynn = neigh_pynn.yx.knn_graph_connectivities()
+        assert np.allclose(conn_skl.toarray(), conn_pynn.toarray(), atol=1e-6)

     def test_neighbors_repr(self, small_data):
         x, y = small_data
