Description
If you train a model with the with_mean kwarg set to False and then try to annotate an h5ad file whose expression matrix is stored as a sparse matrix, annotation fails with the following error:
Traceback (most recent call last):
  File "/Users/scott.daniel/KnowledgeEngineering/garage/celltypist_error/show_error.py", line 72, in <module>
    celltypist.annotate(
  File "/Users/scott.daniel/miniconda3/envs/celltypist/lib/python3.12/site-packages/celltypist/annotate.py", line 85, in annotate
    predictions = clf.celltype(mode = mode, p_thres = p_thres)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/scott.daniel/miniconda3/envs/celltypist/lib/python3.12/site-packages/celltypist/classifier.py", line 374, in celltype
    self.indata[self.indata > 10] = 10
    ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
TypeError: 'coo_matrix' object does not support item assignment
The failure does not occur if you try to annotate an h5ad file saved as a dense array.
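For context, here is a minimal sketch, independent of celltypist, of why the line in the traceback can fail on sparse input and of one sparse-safe way to apply the same cap. The helper name clip_upper and the toy matrix are hypothetical and are not part of celltypist. The sketch assumes the cap is positive, as it is (10) in the traceback.

```python
import numpy as np
from scipy import sparse


def clip_upper(matrix, cap=10):
    """Cap values above `cap` (assumed positive) for dense or scipy sparse input."""
    if sparse.issparse(matrix):
        # Work on a CSR copy so the caller's matrix is left untouched.
        clipped = matrix.tocsr(copy=True)
        # Implicit zeros can never exceed a positive cap,
        # so only the explicitly stored values need clipping.
        clipped.data = np.minimum(clipped.data, cap)
        return clipped
    return np.minimum(np.asarray(matrix), cap)


# Toy matrix in COO format, the format named in the TypeError above.
coo = sparse.coo_matrix(np.array([[0.0, 5.0, 20.0], [30.0, 0.0, 1.0]]))

# Whether COO supports `matrix[mask] = value` depends on the installed scipy;
# the TypeError in the report suggests it did not in the reporter's environment.
print("COO supports item assignment:", hasattr(coo, "__setitem__"))

# Clipping the stored values works regardless of the sparse format.
print(clip_upper(coo).toarray())
```

Clipping only the stored values avoids densifying the matrix, which matters for large single-cell datasets. Whether this is the right fix inside celltypist itself is for the maintainers to decide.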
The code below should recreate the bug. I am running version 1.7.1 of celltypist and version 0.12.2 of anndata.
import celltypist
import subprocess
import pathlib
# download a model (otherwise, celltypist will try to download
# ALL available models, even if you are using your own model)
celltypist.models.download_models(
    model='Immune_All_Low.pkl'
)
training_data = "demo_2000_cells.h5ad"
test_data = "demo_400_cells.h5ad"
# download example data
if not pathlib.Path(training_data).exists():
    p = subprocess.Popen(
        ["wget",
         "https://celltypist.cog.sanger.ac.uk/Notebook_demo_data/demo_2000_cells.h5ad"]
    )
    p.wait()
if not pathlib.Path(test_data).exists():
    p = subprocess.Popen(
        ["wget",
         "https://celltypist.cog.sanger.ac.uk/Notebook_demo_data/demo_400_cells.h5ad"]
    )
    p.wait()
assert pathlib.Path(training_data).is_file()
assert pathlib.Path(test_data).is_file()
print("=======TRAINING with with_mean=True; SHOULD WORK")
model_path = "with_mean_true.pkl"
model = celltypist.train(
    training_data,
    labels='cell_type',
    n_jobs=4,
    feature_selection=True,
    use_SGD=True,
    mini_batch=True,
    with_mean=True
)
model.write(model_path)
print("=======ANNOTATING")
celltypist.annotate(
    test_data,
    model=model_path
)
print("=======SUCCESS")
print("=======TRAINING with with_mean=False; WILL FAIL")
model_path = "with_mean_false.pkl"
model = celltypist.train(
    training_data,
    labels='cell_type',
    n_jobs=4,
    feature_selection=True,
    use_SGD=True,
    mini_batch=True,
    with_mean=False
)
model.write(model_path)
print("=======ANNOTATING")
celltypist.annotate(
    test_data,
    model=model_path
)
print("SUCCESS")
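Since the failure does not occur with dense input, one possible stopgap is to densify the expression matrix before annotating. The sketch below is untested against this bug. The helpers as_dense and annotate_dense are hypothetical names, and it assumes that celltypist.annotate accepts an in-memory AnnData object in place of a file path and that the dense matrix fits in memory.

```python
import numpy as np
from scipy import sparse


def as_dense(matrix):
    """Return a dense ndarray for a scipy sparse matrix or array-like input."""
    if sparse.issparse(matrix):
        return matrix.toarray()
    return np.asarray(matrix)


def annotate_dense(h5ad_path, model_path):
    """Hypothetical helper: load an h5ad file, densify X, then annotate it."""
    # Imported lazily so as_dense can be used without these packages installed.
    import anndata
    import celltypist

    adata = anndata.read_h5ad(h5ad_path)
    adata.X = as_dense(adata.X)
    # Assumption: annotate takes an AnnData object as well as a file path.
    return celltypist.annotate(adata, model=model_path)
```

With the files from the script above, this would be called as annotate_dense("demo_400_cells.h5ad", "with_mean_false.pkl"). Densifying trades memory for compatibility, so it is only practical for datasets of modest size.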