Commit 142e0f5

dariarom94 and LouisK92 authored
add singler python implementation (#118)

* add singler python implementation
* script to run the benchmark
* workflow file
* workflow configuration
* improvements
* Adjust docs and repo in singler

---------

Co-authored-by: LouisK92 <[email protected]>
1 parent 34b7908 commit 142e0f5

8 files changed: 99 additions and 2 deletions

scripts/run_benchmark/run_full_local.sh

Lines changed: 1 addition & 0 deletions
@@ -59,6 +59,7 @@ celltype_annotation_methods:
   # - moscot
   # - mapmycells
   # - tangram
+  # - singler
 expression_correction_methods:
   - no_correction
   # - gene_efficiency_correction

scripts/run_benchmark/run_full_seqeracloud.sh

Lines changed: 1 addition & 0 deletions
@@ -51,6 +51,7 @@ celltype_annotation_methods:
   - moscot
   - mapmycells
   - tangram
+  - singler
 expression_correction_methods:
   - no_correction
   - gene_efficiency_correction

scripts/run_benchmark/run_test_local.sh

Lines changed: 1 addition & 0 deletions
@@ -54,6 +54,7 @@ celltype_annotation_methods:
   # - moscot
   # - mapmycells
   # - tangram
+  # - singler
 expression_correction_methods:
   - no_correction
   # - gene_efficiency_correction

scripts/run_benchmark/run_test_seqeracloud.sh

Lines changed: 1 addition & 0 deletions
@@ -50,6 +50,7 @@ celltype_annotation_methods:
   - moscot
   - mapmycells
   - tangram
+  - singler
 expression_correction_methods:
   - no_correction
   - gene_efficiency_correction

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+__merge__: /src/api/comp_method_cell_type_annotation.yaml
+
+name: singler
+label: "singler"
+summary: "Cell type annotations using single-cell reference with SingleR"
+description: "Cell type annotations using single-cell reference with SingleR"
+
+links:
+  documentation: "https://github.com/SingleR-inc/singler-py"
+  repository: "https://github.com/SingleR-inc/singler-py"
+references:
+  doi: "10.1038/s41590-018-0276-y"
+
+arguments:
+  - name: --labels_key
+    type: string
+    description: The key of the cell labels in the input data.
+    default: cell_labels
+
+resources:
+  - type: python_script
+    path: script.py
+
+engines:
+  - type: docker
+    image: openproblems/base_python:1
+    setup:
+      - type: python
+        pypi: [singler]
+    __merge__:
+      - /src/base/setup_spatialdata_partial.yaml
+  - type: native
+
+runners:
+  - type: executable
+  - type: nextflow
+    directives:
+      label: [ midtime, midcpu, midmem ]

Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
+import anndata as ad
+import os
+import shutil
+
+import singlecellexperiment as sce
+import singler
+
+## VIASH START
+# The following code has been auto-generated by Viash.
+par = {
+  'input_spatial_normalized_counts': r'resources_test/task_ist_preprocessing/mouse_brain_combined/spatial_normalized_counts.h5ad',
+  'input_transcript_assignments': r'resources_test/task_ist_preprocessing/mouse_brain_combined/transcript_assignments.zarr',
+  'input_scrnaseq_reference': r'resources_test/task_ist_preprocessing/mouse_brain_combined/scrnaseq_reference.h5ad',
+  'celltype_key': r'cell_type',
+  'output': r'resources_test/task_ist_preprocessing/mouse_brain_combined/spatial_with_cell_types.h5ad',
+  'labels_key': r'cell_labels'
+}
+meta = {
+  'name': r'singleR',
+  'functionality_name': r'singleR'
+}
+dep = {
+
+}
+
+## VIASH END
+sce_h5ad = sce.read_h5ad(par['input_spatial_normalized_counts'])
+adata_sp = ad.read_h5ad(par['input_spatial_normalized_counts'])
+
+sce_ref = sce.read_h5ad(par['input_scrnaseq_reference'])
+
+features = [str(x) for x in sce_h5ad.row_data.row_names]
+
+mat = sce_h5ad.assay("counts") ##example has raw, not sure
+mat = mat.sorted_indices() ## magic line to make sure the matrix is in the right format for SingleR
+
+mat_ref = sce_ref.assay("normalized")
+mat_ref = mat_ref.sorted_indices() ## magic line to make sure the matrix is in the right format for SingleR
+
+## create the reference from our sc data
+built = singler.train_single(ref_data = mat_ref,
+                             ref_labels = sce_ref.get_column_data().column("cell_type"),
+                             ref_features = sce_ref.get_row_names(),
+                             test_features = features,)
+
+## annotate the dataset
+output = singler.classify_single(mat, ref_prebuilt=built)
+
+adata_sp.obs["cell_type"] = output['best']
+
+# Write output
+print('Writing output', flush=True)
+adata_sp.write(par['output'])
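
For readers unfamiliar with what the train/classify pair in the script is doing conceptually, here is a rough, dependency-free sketch of correlation-based label assignment in the spirit of SingleR: each test cell is compared to reference cells by Spearman correlation, each label is scored by a high quantile of those correlations, and the top-scoring label wins. This is not the `singler` package's code or API. The function names (`ranks`, `spearman`, `quantile`, `annotate`), the 0.8 quantile default, and the toy data are all illustrative assumptions, and marker-gene selection and fine-tuning are left out entirely.

```python
from collections import defaultdict


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant profile: correlation undefined, treated as 0
    return cov / (var_x * var_y) ** 0.5


def quantile(sorted_vals, q):
    """Linear-interpolation quantile of an ascending list."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def annotate(test_cell, ref_cells, ref_labels, q=0.8):
    """Score each label by a quantile of per-cell correlations; pick the best."""
    per_label = defaultdict(list)
    for profile, label in zip(ref_cells, ref_labels):
        per_label[label].append(spearman(test_cell, profile))
    scores = {lab: quantile(sorted(v), q) for lab, v in per_label.items()}
    return max(scores, key=scores.get), scores


# Toy data (hypothetical): four genes, two reference cells per label.
ref_cells = [[9, 8, 1, 0], [8, 9, 0, 1], [0, 1, 9, 8], [1, 0, 8, 9]]
ref_labels = ["neuron", "neuron", "astrocyte", "astrocyte"]
best, scores = annotate([7, 6, 2, 1], ref_cells, ref_labels)
print(best, scores)
```

Because the score depends only on ranks, the sketch is insensitive to monotone rescaling of either matrix, which may be relevant to the script's use of a `counts` assay for the query and a `normalized` assay for the reference.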

src/workflows/run_benchmark/config.vsh.yaml

Lines changed: 2 additions & 1 deletion
@@ -98,7 +98,7 @@ argument_groups:
           A list of cell type annotation methods to run.
         type: string
         multiple: true
-        default: "ssam:tacco:moscot:mapmycells:tangram"
+        default: "ssam:tacco:moscot:mapmycells:tangram:singler"
       - name: "--expression_correction_methods"
         description: |
           A list of expression correction methods to run.
@@ -170,6 +170,7 @@ dependencies:
   - name: methods_cell_type_annotation/moscot
   - name: methods_cell_type_annotation/mapmycells
   - name: methods_cell_type_annotation/tangram
+  - name: methods_cell_type_annotation/singler
   - name: methods_expression_correction/no_correction
   - name: methods_expression_correction/gene_efficiency_correction
   - name: methods_expression_correction/resolvi_correction
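
The changed default packs several method names into one string for an argument declared with `multiple: true`. As a minimal sketch of how such a value could be expanded into a list, assuming `:` is the separator in use here (which the default string suggests but this diff does not confirm), the helper `split_multiple` below is hypothetical and is not part of Viash:

```python
def split_multiple(value, sep=":"):
    """Split a separator-joined string into a list, dropping empty entries."""
    return [item.strip() for item in value.split(sep) if item.strip()]


# Default value taken from the updated config; the separator is an assumption.
default = "ssam:tacco:moscot:mapmycells:tangram:singler"
methods = split_multiple(default)
print(methods)
```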

src/workflows/run_benchmark/main.nf

Lines changed: 2 additions & 1 deletion
@@ -376,7 +376,8 @@ workflow run_wf {
     tacco,
     moscot,
     mapmycells,
-    tangram
+    tangram,
+    singler
   ]
 
   cta_ch = normalization_ch
