Add module for running SCimilarity by allyhawkins · Pull Request #175 · AlexsLemonade/OpenScPCA-nf

allyhawkins · 2025-08-27T18:53:08Z

Closes #170

Here I'm adding the module to run SCimilarity on all of the samples. There's only one step here, which is to run SCimilarity and output the annotations as a TSV file. I copied the script that does this from OpenScPCA-analysis without any modifications, so the main code to review here is the Nextflow addition.

There are two new parameters, one for the model itself and one for the ontology map file that we created in OpenScPCA-analysis. Since the model file is quite big, I added an empty folder for stub testing and added that path to the stub profile.
This module runs on all processed h5ad files for RNA only, so there is a step that should filter out any ADT files.
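
As a rough illustration of the RNA-only filter described above, here is a minimal Python sketch. It is not the Nextflow code from this PR: the `_rna.h5ad` suffix comes from the module's script block, while the helper name `select_rna_h5ad`, the example file names, and the `_adt.h5ad` naming are assumptions made for the example.

```python
from pathlib import PurePosixPath


def select_rna_h5ad(paths):
    """Keep only files whose name ends in _rna.h5ad, dropping ADT and other files."""
    return [p for p in paths if PurePosixPath(p).name.endswith("_rna.h5ad")]


# Hypothetical file names, used only to show the behavior
files = [
    "SCPCL000001_processed_rna.h5ad",
    "SCPCL000001_processed_adt.h5ad",
    "SCPCL000002_processed_rna.h5ad",
]
rna_files = select_rna_h5ad(files)
print(rna_files)
```

Matching on the file name suffix keeps the filter independent of the directory layout, which may differ between stub and real runs.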

I am filing this as a draft because I'm having issues getting Nextflow to run the script inside the conda environment in the container. Because of the way Nextflow launches and runs the image, the installed conda environment isn't used by default, so the script can't find the packages it needs. I think I found a solution: when we build the environment, set the default python to the one in the conda environment so that it works with Nextflow. I'll file that as a PR in OpenScPCA-analysis.
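
One way to confirm which interpreter a process actually picks up is to compare the running Python's path with the environment prefix. The sketch below is only a diagnostic aid under assumptions, not the fix described above: the helper name `runs_inside_env` and the example paths are hypothetical, and `CONDA_PREFIX` is only set when an environment has been activated.

```python
import os
import sys


def runs_inside_env(executable, env_prefix):
    """Return True if the interpreter path lives under the given environment prefix."""
    exe = os.path.abspath(executable)
    prefix = os.path.abspath(env_prefix)
    return os.path.commonpath([exe, prefix]) == prefix


# Report the interpreter in use; fall back to sys.prefix if no conda env is active
env_prefix = os.environ.get("CONDA_PREFIX", sys.prefix)
print(sys.executable, runs_inside_env(sys.executable, env_prefix))
```

Running something like this as the first command of a process script would show whether the container's default `python3` resolves to the environment or to the system interpreter.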

Once I'm able to confirm this runs, then I'll request formal review.

allyhawkins · 2025-08-28T14:35:11Z

I ran through the whole workflow with the simulated data successfully, so this is now ready for review.

sjspielman

This looks good to me! Not much to say about the nextflow code for a pretty standard "one process to do the thing" situation, and you got this memo ✅ #178 (comment)

FYI, I didn't carefully review the Python code since I'm assuming it was well-reviewed next door in the analysis repo. Let me know if you want me to have a closer look anywhere.

So in the end, my main comment is to restore the workflow bits you commented out during testing.

sjspielman · 2025-08-28T15:15:46Z

config/module_params.config


+  // cell type scimilarity
+  cell_type_scimilarity_model = 's3://scpca-references/celltype/scimilarity_references/model_v1.1'
+  cell_type_scimilarity_ontology_ref_file = 'https://raw.githubusercontent.com/AlexsLemonade/OpenScPCA-analysis/refs/heads/main/analyses/cell-type-scimilarity/references/scimilarity-mapped-ontologies.tsv'


noting we'll want to update this one too with a tagged link, same as my NB urls above
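
For illustration, pinning the ontology URL to a tag could look like the sketch below. It assumes that raw.githubusercontent.com accepts a `refs/tags/<tag>` ref in the same position as `refs/heads/main`; the helper name `pin_raw_url` and the tag `v0.0.0-example` are hypothetical, and no real release tag is implied.

```python
def pin_raw_url(url, tag):
    """Swap the refs/heads/main segment of a raw GitHub URL for a tag ref."""
    branch_ref = "/refs/heads/main/"
    if branch_ref not in url:
        raise ValueError("URL does not reference refs/heads/main")
    return url.replace(branch_ref, f"/refs/tags/{tag}/", 1)


main_url = (
    "https://raw.githubusercontent.com/AlexsLemonade/OpenScPCA-analysis/"
    "refs/heads/main/analyses/cell-type-scimilarity/references/"
    "scimilarity-mapped-ontologies.tsv"
)
# "v0.0.0-example" is a placeholder tag, not an actual release
pinned_url = pin_raw_url(main_url, "v0.0.0-example")
print(pinned_url)
```

A tagged link keeps the workflow reproducible, since the file behind `main` can change between runs.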

sjspielman · 2025-08-28T15:16:00Z

main.nf


  // Run the merge workflow
-  merge_sce(sample_ch)
+  //merge_sce(sample_ch)


Ah, workflow testing 😬

modules/cell-type-scimilarity/README.md

sjspielman · 2025-08-28T15:17:43Z

modules/cell-type-scimilarity/main.nf

+        --processed_h5ad_file \$file \
+        --ontology_map_file ${ontology_map_file} \
+        --predictions_tsv \$(basename \${file%_rna.h5ad}_scimilarity-celltype-assignments.tsv.gz) \
+        --seed 2025


This is assigned in the file so you probably don't need it, but doesn't
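
The `--predictions_tsv` argument in the diff above builds the output name with shell parameter expansion (`${file%_rna.h5ad}` strips the suffix, `basename` strips the directory). A Python equivalent, shown only to make the resulting name explicit, might look like this; the function name and the input path are hypothetical.

```python
import os


def predictions_filename(h5ad_path):
    """Mirror: basename "${file%_rna.h5ad}_scimilarity-celltype-assignments.tsv.gz"."""
    base = os.path.basename(h5ad_path)
    suffix = "_rna.h5ad"
    # Like ${file%_rna.h5ad}, only strip the suffix when it is actually present
    if base.endswith(suffix):
        base = base[: -len(suffix)]
    return f"{base}_scimilarity-celltype-assignments.tsv.gz"


# Hypothetical input path
print(predictions_filename("data/SCPCL000001_processed_rna.h5ad"))
# prints SCPCL000001_processed_scimilarity-celltype-assignments.tsv.gz
```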

sjspielman · 2025-08-28T15:19:42Z

modules/cell-type-scimilarity/resources/usr/bin/run-scimilarity.py

+        {
+            "barcode": processed_anndata.obs_names.to_list(),
+            "scimilarity_celltype_annotation": predictions.values,
+            "min_dist": nn_stats["min_dist"],


Do we need this column in this workflow?

Yes! This is a stat we are going to use to measure confidence, as recommended by SCimilarity docs, so we want to output it so we can use it for exploratory analysis.
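
To make the shape of the output concrete, here is a small sketch that writes and reads back a gzipped TSV with the three columns visible in the diff. It uses only the standard library rather than whatever the real script uses, the helper names are hypothetical, and the barcodes, labels, and distances are made-up values.

```python
import csv
import gzip
import os
import tempfile

COLUMNS = ["barcode", "scimilarity_celltype_annotation", "min_dist"]


def write_predictions(path, barcodes, annotations, min_dists):
    """Write one row per cell to a gzip-compressed, tab-separated file."""
    with gzip.open(path, "wt", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(COLUMNS)
        writer.writerows(zip(barcodes, annotations, min_dists))


def read_predictions(path):
    """Read the table back as a list of dicts keyed by column name."""
    with gzip.open(path, "rt", newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


# Made-up example values, for illustration only
with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, "example_scimilarity-celltype-assignments.tsv.gz")
    write_predictions(out_path, ["AAAC", "AAAG"], ["T cell", "B cell"], [0.012, 0.034])
    rows = read_predictions(out_path)

print(rows[0]["barcode"], rows[0]["min_dist"])
```

Keeping `min_dist` next to each annotation lets downstream exploratory analysis examine annotation confidence without rerunning the model.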

allyhawkins · 2025-08-28T18:26:51Z

@sjspielman I ran this through on the real data and all samples completed successfully. I also checked and the results files are now present in the staging bucket.

I restored running the other modules and added a TODO about updating to use the tagged link. This should be ready for another look.

Edit: I meant to say that you do not need to review the Python script since it is copied exactly from the script that was reviewed in the analysis repo.

sjspielman

LGTM!

allyhawkins added 18 commits August 26, 2025 15:52

set up parameters

111bad8

add scimilarity workflow

79bf6e3

add stub model

3615602

document module

683e00f

set seed

b015f65

temporarily remove other modules

243b748

specify python3

a78b2cf

move scimilarity stub

a32c9ef

use executable

833411d

Merge remote-tracking branch 'origin/main' into allyhawkins/170-scimilarity

14f1eb7

reorganize stub model

c13bbe1

more reorg

c7910ee

try to use /bin/bash

5a7c1dc

comment out some more modules

aca2b1b

remove ./

30d1808

Revert "try to use /bin/bash"

1457c5d

This reverts commit 5a7c1dc.

test out activating the environment

c3829e8

Revert "test out activating the environment"

fd36ca0

This reverts commit c3829e8.

allyhawkins mentioned this pull request Aug 27, 2025

Make sure conda environment is used by default in SCimilarity image AlexsLemonade/OpenScPCA-analysis#1311

Merged

allyhawkins marked this pull request as ready for review August 28, 2025 14:33

Merge branch 'main' into allyhawkins/170-scimilarity

15e52d6

simplify output definition

fd0e690

allyhawkins requested a review from sjspielman August 28, 2025 14:37

sjspielman reviewed Aug 28, 2025

allyhawkins and others added 3 commits August 28, 2025 13:20

remove extra space

7c83d6f

Co-authored-by: Stephanie Spielman <stephanie.spielman@gmail.com>

add back other modules

3b3f889

add a todo for replacing with tagged links

f77faa4

allyhawkins requested a review from sjspielman August 28, 2025 18:27

sjspielman approved these changes Aug 28, 2025

Merge branch 'main' into allyhawkins/170-scimilarity

8294b49

allyhawkins merged commit 24ee52b into main Aug 28, 2025
3 checks passed

allyhawkins deleted the allyhawkins/170-scimilarity branch August 28, 2025 18:40

allyhawkins merged 24 commits into main from allyhawkins/170-scimilarity
