Size calculation in sc.pl.embedding should ignore NaNs #3770

@MLubetzki

Description

What kind of feature would you like to request?

Additional function parameters / changed functionality / changed defaults?

Please describe your wishes

Hi everyone!

When I use sc.pl.embedding with a basis that contains NaNs for some cells, those cells are simply omitted from the plot. This is because the basis gets passed to matplotlib.pyplot.scatter, which drops NaN coordinates by default.
I like that a lot, because it allows me to store embeddings of only a subset of cells (padded with NaNs) in an anndata object.
However, when the size of the dots in the scatterplot is calculated, the number of cells in the anndata object is used instead of the number of dots that are actually plotted, which leads to very small dots.

You can reproduce this as follows:

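import numpy as np
import scanpy as sc

# 10 cells in total: the first 5 have coordinates, the last 5 are padded with NaNs
m1 = np.random.randn(5, 10)
m2 = np.full(shape=(5, 10), fill_value=np.nan)
m = np.concatenate([m1, m2], axis=0)

adata = sc.AnnData(np.random.randn(10, 10))
adata.obsm['X_umap'] = m

# Subset to the cells with valid coordinates: dot size is based on 5 cells
sc.pl.umap(adata[~np.isnan(adata.obsm['X_umap'].sum(axis=1))])
# Full object: the same 5 cells are drawn, but dot size is based on all 10 cells
sc.pl.umap(adata)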

In both cases, 5 cells are plotted, but the dots are smaller in the second UMAP.
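
To make the idea concrete, here is a rough sketch of a NaN-aware size calculation. It is not the actual scanpy code: the helper name `nan_aware_dot_size` is made up for illustration, and the constant 120000 is my assumption about the default scaling (I believe the default is roughly 120000 divided by the number of cells). It only counts the cells whose plotted coordinates are all finite.

```python
import numpy as np


def nan_aware_dot_size(coords, scale=120_000.0, components=(0, 1)):
    """Hypothetical helper, not part of scanpy.

    Returns a dot size based on the number of cells that will actually be
    drawn, i.e. cells whose plotted coordinates are all finite.
    `scale` is an assumed default constant.
    """
    coords = np.asarray(coords, dtype=float)
    plotted = coords[:, list(components)]
    n_plotted = int(np.isfinite(plotted).all(axis=1).sum())
    # Guard against division by zero if no cell has valid coordinates
    return scale / max(n_plotted, 1)


# Same layout as in the reproducer: 5 valid cells followed by 5 NaN-padded cells
m = np.concatenate([np.random.randn(5, 10), np.full((5, 10), np.nan)], axis=0)

size_all_cells = 120_000.0 / m.shape[0]   # size based on all 10 cells
size_nan_aware = nan_aware_dot_size(m)    # size based on the 5 plotted cells
```

With this layout the NaN-aware size is twice the size based on all cells. As a workaround in the meantime, a value like this can be passed explicitly through the existing `size` argument, e.g. `sc.pl.umap(adata, size=size_nan_aware)`.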

I'll open a PR with a suggestion to improve this in a second.
