Significant variation in GRNBoost2 results with minor cell subsampling (removing one cell) in pySCENIC

When running the GRN step in pySCENIC, I observed substantial differences in the output adjacencies.csv after removing just one cell from the expression matrix. Specifically:

Comparing the run on the full expression matrix (e.g., thousands of cells) with the run on a matrix missing one cell yields only 56.21% overlap in TF-target pairs.
This level of variability seems unexpectedly high for a dataset of this scale.
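
For reference, here is a minimal sketch of how an overlap like this could be measured. It assumes both runs wrote a CSV with `TF`, `target`, and `importance` columns, which is the layout the GRN step normally produces. The helper names `load_pairs` and `pair_overlap` and the file names in the comments are hypothetical, and this is not necessarily how the 56.21% figure above was computed. It reports the shared fraction relative to each run as well as the Jaccard index, since the two runs may emit different numbers of pairs.

```python
import csv


def load_pairs(path):
    """Read the set of (TF, target) pairs from an adjacencies CSV.

    Assumes a header row containing 'TF' and 'target' columns.
    """
    with open(path, newline="") as handle:
        return {(row["TF"], row["target"]) for row in csv.DictReader(handle)}


def pair_overlap(pairs_a, pairs_b):
    """Summarise how many TF-target pairs two runs have in common."""
    shared = pairs_a & pairs_b
    union = pairs_a | pairs_b
    return {
        "shared": len(shared),
        # Fraction of each run's pairs that also appear in the other run.
        "fraction_of_a": len(shared) / len(pairs_a) if pairs_a else 0.0,
        "fraction_of_b": len(shared) / len(pairs_b) if pairs_b else 0.0,
        # Jaccard index: shared pairs over all distinct pairs in either run.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Example usage (hypothetical file names):
# full = load_pairs("adjacencies_full.csv")
# minus_one = load_pairs("adjacencies_minus_one_cell.csv")
# print(pair_overlap(full, minus_one))
```

Which denominator is used matters when reading the percentage: the same number of shared pairs gives a lower Jaccard index than a fraction taken relative to a single run.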

I wonder whether something is wrong with my code.

Code:
```bash
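# Run GRN inference once; grn.SUCCESS marks a completed run.
if [ ! -f grn.SUCCESS ]; then
    # Quote the paths so file names containing spaces are passed intact.
    arboreto_with_multiprocessing.py \
      "$count_loom" \
      "$tf_list" \
      --num_workers 16 \
      --output adjacencies.csv \
      --method grnboost2 \
      --sparse \
      --seed 1 \
    && touch grn.SUCCESS
fi

# Abort if the marker is still missing, i.e. the GRN step failed.
if [ ! -f grn.SUCCESS ]; then echo "grn error" >&2; exit 1; fi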
```



Significant variation in GRNBoost2 results with minor cell subsampling (removing one cell) in pySCENIC #623
