Commit 2ebd2e3

Merge pull request #53 from HiDiHlabs/restructure
Restructure codebase
2 parents 780ecf5 + 33cd947 commit 2ebd2e3

18 files changed: +1940 -2439 lines changed

.pre-commit-config.yaml

Lines changed: 13 additions & 0 deletions
@@ -21,3 +21,16 @@ repos:
       - id: ruff
       # Formatter
       - id: ruff-format
+  - repo: https://github.com/pre-commit/mirrors-mypy
+    rev: v1.15.0
+    hooks:
+      - id: mypy
+        additional_dependencies:
+          - "numpy"
+  - repo: https://github.com/codespell-project/codespell
+    rev: v2.4.1
+    hooks:
+      - id: codespell
+        exclude_types: ["jupyter", "svg"]
+        additional_dependencies:
+          - tomli
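
The `tomli` entry under `additional_dependencies` is presumably there so that codespell can read its settings from `pyproject.toml` on Python versions older than 3.11, which have no TOML parser in the standard library. A minimal sketch of that fallback pattern follows. The `[tool.codespell]` table and its values are illustrative assumptions, not taken from this repository, and `read_codespell_settings` is a hypothetical helper.

```python
# Sketch: parse a pyproject-style TOML snippet with whichever parser is available.
try:
    import tomllib  # standard library from Python 3.11 onwards
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport exposing the same API

# Hypothetical configuration; the project's real settings are not shown in this diff.
EXAMPLE_PYPROJECT = """
[tool.codespell]
skip = "*.svg,*.ipynb"
ignore-words-list = "coo,nd"
"""


def read_codespell_settings(text: str) -> dict:
    """Return the [tool.codespell] table, or an empty dict if it is absent."""
    return tomllib.loads(text).get("tool", {}).get("codespell", {})


settings = read_codespell_settings(EXAMPLE_PYPROJECT)
print(settings["skip"])
```

On Python 3.11 or newer the first import succeeds and `tomli` is never touched, so the extra dependency only matters for older interpreters.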

README.md

Lines changed: 17 additions & 24 deletions
@@ -37,8 +37,8 @@ In a first step, we define a number of parameters for the analysis:
 import pandas as pd
 import ovrlpy
 
-# define ovrlpy analysis parameters:
-n_expected_celltypes = 20
+# define ovrlpy analysis parameters
+n_components = 20
 
 # load the data
 coordinate_df = pd.read_csv('path/to/coordinate_file.csv')
@@ -51,53 +51,46 @@ you can then fit an ovrlpy model to the data and create a signal integrity map:
 
 ```python
 # fit the ovrlpy model to the data
-signal_integrity, signal_strength, visualizer = ovrlpy.run(
-    coordinate_df, n_expected_celltypes=n_expected_celltypes
+dataset = ovrlpy.Ovrlp(
+    coordinate_df,
+    n_components=n_components,
+    n_workers=4,  # number of threads to use for processing
 )
+
+dataset.analyse()
 ```
 
-returns a signal integrity map, a signal map and a visualizer object that can be used to visualize the data:
+after fitting we can visualize the data ...
 
 ```python
-visualizer.plot_fit()
+fig = ovrlpy.plot_pseudocells(dataset)
 ```
 ![plot_fit output](docs/resources/plot_fit.png)
 
 
-and visualize the signal integrity map:
+... and the signal integrity map
 
 ```python
-fig, ax = ovrlpy.plot_signal_integrity(signal_integrity, signal_strength, signal_threshold=4)
+fig = ovrlpy.plot_signal_integrity(dataset, signal_threshold=4)
 ```
 
 ![plot_signal_integrity output](docs/resources/xenium_integrity_with_highlights.svg)
 
 Ovrlpy can also identify individual overlap events in the data:
 
 ```python
-doublet_df = ovrlpy.detect_doublets(
-    signal_integrity, signal_strength, minimum_signal_strength=3, integrity_sigma=2
-)
-
-doublet_df.head()
+doublets = dataset.detect_doublets(min_signal=4, integrity_sigma=1)
 ```
 
-And use the visualizer to show a 3D visualization of the overlaps in the tissue:
+And plot a multi-view visualization of the overlaps in the tissue:
 
 ```python
 # Which doublet do you want to visualize?
-n_doublet_case = 0
+doublet_to_show = 0
 
-x, y = doublet_df.loc[doublet_case, ["x", "y"]]
+x, y = doublets["x", "y"].row(doublet_to_show)
 
-ovrlpy.plot_region_of_interest(
-    x,
-    y,
-    coordinate_df,
-    visualizer,
-    signal_integrity,
-    signal_strength,
-)
+fig = ovrlpy.plot_region_of_interest(dataset, x, y, window_size=window_size)
 ```
 
 ![plot_region_of_interest output](docs/resources/plot_roi.png)

docs/source/conf.py

Lines changed: 6 additions & 1 deletion
@@ -52,15 +52,20 @@
 
 nitpicky = True
 nitpick_ignore = [
+    ("py:class", "numpy.typing.DTypeLike"),
+    ("py:class", "polars.DataFrame"),
+    ("py:class", "polars.DataType"),
+    ("py:class", "umap.UMAP"),
     ("py:class", "optional"),
 ]
 
 intersphinx_mapping = dict(
+    anndata=("https://anndata.readthedocs.io/en/stable/", None),
     matplotlib=("https://matplotlib.org/stable/", None),
     numpy=("https://numpy.org/doc/stable/", None),
     pandas=("https://pandas.pydata.org/pandas-docs/stable/", None),
+    polars=("https://docs.pola.rs/api/python/stable/", None),
     python=("https://docs.python.org/3", None),
-    scanpy=("https://scanpy.readthedocs.io/en/stable/", None),
     scipy=("https://docs.scipy.org/doc/scipy/", None),
     sklearn=("https://scikit-learn.org/stable/", None),
     umap=("https://umap-learn.readthedocs.io/page/", None),

docs/source/tutorials/vizgen_liver.ipynb

Lines changed: 78 additions & 221 deletions
Large diffs are not rendered by default.

docs/source/tutorials/xenium_brain.ipynb

Lines changed: 131 additions & 200 deletions
Large diffs are not rendered by default.

docs/source/usage.rst

Lines changed: 21 additions & 37 deletions
@@ -17,16 +17,16 @@ Functions to read the data in the correct format are available for common file f
     import ovrlpy
 
     # Define analysis parameters for ovrlpy
-    kde_bandwidth = 2 # The smoothness of the kernel density estimation (KDE)
-    n_expected_celltypes = 20 # Number of expected cell types in the data
+    kde_bandwidth = 2.5 # smoothness of the kernel density estimation (KDE)
+    n_components = 20 # number of principal components, depends on the data complexity
 
     # Load your spatial transcriptomics data from a CSV file
     coordinate_df = pd.read_csv('path/to/coordinate_file.csv')
 
 
 In this step, we load the dataset and configure the model parameters, such as
 `kde_bandwidth` (to control smoothness) and
-`n_expected_celltypes` (to set the expected number of cell types).
+`n_components` (to set the number of principal components that will be used).
 
 2. Fit the ovrlpy Model
 _______________________
@@ -36,17 +36,13 @@ Fit the **ovrlpy** model to generate the signal integrity map.
 .. code-block:: python
 
     # Fit the ovrlpy model to the spatial data
-    integrity, signal, visualizer = ovrlpy.run(
-        df=coordinate_df,
+    dataset = ovrlpy.Ovrlp(
+        coordinate_df,
         KDE_bandwidth=kde_bandwidth,
-        n_expected_celltypes=n_expected_celltypes
+        n_components=n_components,
+        n_workers=4, # number of threads to use for processing
     )
-
-This function generates:
-
-- **integrity**: The signal integrity map.
-- **signal**: The signal map representing the strength of spatial expression signals.
-- **visualizer**: A visualizer object that helps to plot and explore the results.
+    dataset.analyse()
 
 3. Visualize the Model Fit
 __________________________
@@ -55,8 +51,7 @@ Once the model is fitted, you can visualize how well it matches your spatial dat
 
 .. code-block:: python
 
-    # Use the visualizer object to plot the fitted signal map
-    visualizer.plot_fit()
+    fig = ovrlpy.plot_pseudocells(dataset)
 
 This plot gives you a visual representation of the model's fit to the spatial transcriptomics data.
 
@@ -67,8 +62,7 @@ Now, plot the signal integrity map using a threshold to highlight areas with str
 
 .. code-block:: python
 
-    # Plot the signal integrity map with a signal threshold
-    fig, ax = ovrlpy.plot_signal_integrity(integrity, signal, signal_threshold=4.0)
+    fig = ovrlpy.plot_signal_integrity(dataset, signal_threshold=4)
 
 
 5. Detect and Visualize Overlaps (Doublets)
@@ -79,39 +73,29 @@ Identify overlapping signals (doublets) in the tissue and visualize them.
 .. code-block:: python
 
     # Detect doublet events (overlapping signals) in the dataset
-    doublet_df = ovrlpy.detect_doublets(
-        integrity,
-        signal,
-        signal_cutoff=4, # Threshold for signal strength
-        integrity_sigma=1 # Controls the coherence of the signals
+    doublets = dataset.detect_doublets(
+        min_signal=4, # threshold for signal strength
+        integrity_sigma=1, # controls the coherence of the signals
     )
 
-    # Display the detected doublets
-    doublet_df.head()
+    doublets.head()
 
 6. 3D Visualization of a Doublet Event
 ______________________________________
 
-Visualize a specific overlap event (doublet) in 3D to see how it looks in the tissue.
+Visualize a specific overlap event (doublet) to see how it looks in the tissue.
 
 .. code-block:: python
 
-    # Parameters for 3D visualization
+    # Parameters for the visualization
     window_size = 60 # Size of the visualization window around the doublet
     doublet_to_show = 0 # Index of the doublet to visualize
 
-    # Get the coordinates of the doublet event
-    x, y = doublet_df.loc[doublet_to_show, ["x", "y"]]
+    # Coordinates of the doublet event
+    x, y = doublets["x", "y"].row(doublet_to_show)
 
     # Plot the doublet event with 3D visualization
-    _ = ovrlpy.plot_region_of_interest(
-        x, y,
-        coordinate_df,
-        visualizer,
-        signal_integrity,
-        signal_strength,
-        window_size=window_size,
-    )
+    fig = ovrlpy.plot_region_of_interest(dataset, x, y, window_size=window_size)
 
-This visualization shows a 3D representation of the spatial overlap event, giving more
-insight into the structure and coherence of the signals.
+This visualization shows a top/bottom/side representation of the spatial overlap event,
+giving more insight into the structure and coherence of the signals.
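
The removed and added examples in this file use different keyword names for what appear to be the same settings (`n_expected_celltypes` vs. `n_components`, `signal_cutoff` vs. `min_signal`). The sketch below shows one way to translate old call sites. `translate_kwargs` and `RENAMED_KWARGS` are hypothetical names, not part of ovrlpy, and the pairings are inferred from this diff alone.

```python
# Hypothetical helper, not part of ovrlpy: map keyword names from the removed
# examples onto the names used in the added ones.
RENAMED_KWARGS = {
    "n_expected_celltypes": "n_components",  # inferred pairing
    "signal_cutoff": "min_signal",  # inferred pairing
}


def translate_kwargs(old_kwargs: dict) -> dict:
    """Return a copy of old_kwargs with renamed keys replaced."""
    new_kwargs = {}
    for name, value in old_kwargs.items():
        new_name = RENAMED_KWARGS.get(name, name)
        if new_name in new_kwargs:
            # Both the old and the new spelling were passed.
            raise TypeError(f"got multiple values for {new_name!r}")
        new_kwargs[new_name] = value
    return new_kwargs


print(translate_kwargs({"signal_cutoff": 4, "integrity_sigma": 1}))
```

The docs now describe `n_components` as a number of principal components rather than expected cell types, so a value carried over mechanically may deserve a second look.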

ovrlpy/__init__.py

Lines changed: 13 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -1,18 +1,17 @@
11
from importlib.metadata import PackageNotFoundError, version
22

33
from . import io
4-
from ._ovrlp import (
5-
Visualizer,
6-
compute_VSI,
7-
detect_doublets,
8-
get_pseudocell_locations,
4+
from ._ovrlp import Ovrlp
5+
from ._plotting import (
6+
SCALEBAR_PARAMS,
7+
plot_pseudocells,
98
plot_region_of_interest,
109
plot_signal_integrity,
11-
pre_process_coordinates,
12-
run,
13-
sample_expression_at_xy,
10+
plot_tissue,
11+
plot_umap,
1412
)
15-
from ._utils import SCALEBAR_PARAMS, UMAP_2D_PARAMS, UMAP_RGB_PARAMS
13+
from ._subslicing import process_coordinates
14+
from ._utils import UMAP_2D_PARAMS, UMAP_RGB_PARAMS
1615

1716
try:
1817
__version__ = version("ovrlpy")
@@ -24,15 +23,13 @@
 
 __all__ = [
     "io",
-    "compute_VSI",
-    "detect_doublets",
-    "sample_expression_at_xy",
-    "get_pseudocell_locations",
+    "Ovrlp",
+    "plot_pseudocells",
     "plot_region_of_interest",
     "plot_signal_integrity",
-    "pre_process_coordinates",
-    "Visualizer",
-    "run",
+    "plot_tissue",
+    "plot_umap",
+    "process_coordinates",
     "SCALEBAR_PARAMS",
     "UMAP_2D_PARAMS",
     "UMAP_RGB_PARAMS",
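
The `__all__` changes above drop several top-level names. A small lookup like the following could help when updating downstream code. It is a sketch only: the pairings are read off the README and usage changes in this commit, `pre_process_coordinates` to `process_coordinates` is an assumed rename, and `migration_hint` is a hypothetical helper, not an ovrlpy function.

```python
from typing import Optional

# Editor's reading of this diff; not an official deprecation map.
REPLACEMENTS = {
    "run": "ovrlpy.Ovrlp(...) followed by .analyse()",
    "detect_doublets": "the Ovrlp.detect_doublets() method",
    "Visualizer": "ovrlpy.plot_pseudocells(dataset)",  # replaces visualizer.plot_fit()
    "pre_process_coordinates": "ovrlpy.process_coordinates",  # assumed rename
}

# Names removed from __all__ for which this diff shows no replacement.
NO_VISIBLE_REPLACEMENT = frozenset(
    {"compute_VSI", "get_pseudocell_locations", "sample_expression_at_xy"}
)


def migration_hint(name: str) -> Optional[str]:
    """Return a hint for a removed top-level name, or None if it was not removed."""
    if name in REPLACEMENTS:
        return f"ovrlpy.{name} was removed; consider {REPLACEMENTS[name]}"
    if name in NO_VISIBLE_REPLACEMENT:
        return f"ovrlpy.{name} was removed; no replacement is visible in this diff"
    return None


print(migration_hint("run"))
```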
