PyViscel

Interactive visualization and analysis of single-cell transcriptomics data — Python port of the VisCello R/Bioconductor package.

No R runtime required. Native AnnData/h5ad format throughout.

Installation

Option A — pip from GitHub (recommended)

pip install git+https://github.com/Gartner-Lab/pyviscel.git

To upgrade to the latest version later:

pip install --upgrade git+https://github.com/Gartner-Lab/pyviscel.git

Option B — conda environment (recommended for new machines)

Some dependencies (leidenalg, igraph, umap-learn) can be tricky to build from source. Using conda avoids compiler issues:

conda create -n pyviscel python=3.12
conda activate pyviscel
conda install -c conda-forge leidenalg python-igraph umap-learn
pip install git+https://github.com/Gartner-Lab/pyviscel.git

Option C — development install (for contributors)

git clone https://github.com/Gartner-Lab/pyviscel.git
cd pyviscel
pip install -e ".[dev]"

Runtime dependencies (auto-installed): anndata, dash, dash-bootstrap-components, plotly, pandas, numpy, scipy, scikit-learn, umap-learn, openTSNE, leidenalg, igraph, statsmodels, gseapy, matplotlib, seaborn.

Quick Start

Step 1 — Get your data into h5ad

Option A — Convert from an existing VisCello R object:

# In R — requires the original VisCello R package
library(VisCello)
cc <- readRDS("my_cello.rds")
viscello_to_h5ad(cc, "my_data.h5ad")

Option B — Build an AnnData directly in Python:

from pyviscel import validate_adata, save_adata

# `adata` is an anndata.AnnData object you have already built
validate_adata(adata)          # checks required slots
save_adata(adata, "my_data.h5ad")

Step 2 — Explore programmatically

from pyviscel import load_adata, list_cellos, list_projections

adata = load_adata("my_data.h5ad")
print(list_cellos(adata))                       # ['All Cells', 'T cells', ...]
print(list_projections(adata, "All Cells"))     # ['UMAP_2D', 'UMAP_3D', ...]

Step 3 — Load data (optional: backed/memory-mapped mode)

For very large datasets (100k+ cells), open the file in backed mode to keep expression matrices on disk and reduce RAM usage:

from pyviscel import load_adata
adata = load_adata("my_data.h5ad", backed="r")   # read-only memory map

The file must already contain a norm_exprs layer — automatic layer aliasing is skipped in backed mode.

Step 4 — Launch the interactive web app

from pyviscel import run_app
run_app("my_data.h5ad", host="127.0.0.1", port=8050)

Or from the terminal:

pyviscel my_data.h5ad
pyviscel my_data.h5ad --host 0.0.0.0 --port 8050   # accessible on local network
pyviscel my_data.h5ad --no-validate                 # skip schema check for external h5ad files

Then open http://127.0.0.1:8050 in your browser.

Web App Features

Explorer Tab

Control	Description
Cello dropdown	Select a named cell subset
Projection dropdown	Select a 2-D or 3-D embedding (PCA, t-SNE, UMAP)
Color By dropdown	Color cells by metadata column, `Manual_Selection`, or gene expression
Point size / Alpha	Adjust marker size and transparency
Legend	Toggle full / abbreviated / no legend
Download view	Save the current scatter as a PNG image

Cellos with many cells are spatially downsampled before rendering, using a grid-based algorithm that preserves cluster structure. Downsampling affects only what is drawn; all cells are retained in the data.
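
As an illustration of grid-based downsampling in general, here is a minimal sketch. It is not PyViscel's actual implementation: the function name `grid_downsample` and the parameters `n_bins` and `max_per_bin` are hypothetical. The plot area is split into a regular grid and at most a fixed number of points is kept per grid cell, so dense regions are thinned while sparse clusters survive intact.

```python
import random
from collections import defaultdict


def grid_downsample(points, n_bins=50, max_per_bin=5, seed=0):
    """Return sorted indices of the points kept after grid-based thinning.

    points: list of (x, y) tuples. The bounding box is divided into an
    n_bins x n_bins grid and at most max_per_bin points are kept per cell.
    """
    if not points:
        return []
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, y_min = min(xs), min(ys)
    # Fall back to a span of 1.0 when all points share a coordinate.
    x_span = (max(xs) - x_min) or 1.0
    y_span = (max(ys) - y_min) or 1.0

    bins = defaultdict(list)
    for i, (x, y) in enumerate(points):
        col = min(int((x - x_min) / x_span * n_bins), n_bins - 1)
        row = min(int((y - y_min) / y_span * n_bins), n_bins - 1)
        bins[(row, col)].append(i)

    rng = random.Random(seed)
    keep = []
    for members in bins.values():
        if len(members) > max_per_bin:
            members = rng.sample(members, max_per_bin)
        keep.extend(members)
    return sorted(keep)


# Toy data: one dense cluster (1000 points) and one sparse cluster (3 points).
_rng = random.Random(1)
dense = [(_rng.gauss(0, 0.01), _rng.gauss(0, 0.01)) for _ in range(1000)]
sparse = [(10.0, 10.0), (10.1, 10.0), (10.0, 10.1)]
points = dense + sparse
kept = grid_downsample(points, n_bins=20, max_per_bin=5)
```

Only the returned indices are used for drawing; the full point list is left untouched, which mirrors the statement that all cells are retained in the data.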

3-D camera controls: When a 3-D projection is selected, elevation, azimuth, and zoom sliders appear alongside the scatter. The current angle is shown in the readout beneath the sliders. Dragging the scatter directly also updates the sliders.
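
The relationship between the sliders and the camera can be sketched with spherical coordinates. This is an assumption about how such controls are typically wired to Plotly's `layout.scene.camera.eye`, not a description of PyViscel's code; `camera_eye` and `eye_to_angles` are hypothetical helper names, and mapping zoom to the eye distance is also an assumption.

```python
import math


def camera_eye(elevation_deg, azimuth_deg, distance=2.0):
    """Convert slider angles to a Plotly-style camera eye dict."""
    el = math.radians(elevation_deg)
    az = math.radians(azimuth_deg)
    return {
        "x": distance * math.cos(el) * math.cos(az),
        "y": distance * math.cos(el) * math.sin(az),
        "z": distance * math.sin(el),
    }


def eye_to_angles(eye):
    """Inverse mapping, e.g. for updating sliders after the user drags the plot."""
    r = math.sqrt(eye["x"] ** 2 + eye["y"] ** 2 + eye["z"] ** 2)
    elevation = math.degrees(math.asin(eye["z"] / r))
    azimuth = math.degrees(math.atan2(eye["y"], eye["x"]))
    return elevation, azimuth, r


eye = camera_eye(30, 45, distance=2.0)
elevation, azimuth, distance = eye_to_angles(eye)
```

The round trip recovers the original angles, which is the property needed for sliders and drag interactions to stay in sync.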

Cell Annotation (manual selection)

2-D projections:

1. Switch the main scatter to the lasso or box tool (toolbar icon).
2. Draw a selection on the plot. The status bar shows the cell count.
3. Click Confirm. The selection is saved as Group 1, Group 2, etc. in a new Manual_Selection column.
4. Repeat for more groups.
5. Select Manual_Selection in Color By to see all groups.
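
Conceptually, a lasso selection reduces to a point-in-polygon test followed by a label assignment. The sketch below uses the standard ray-casting test; it is not taken from PyViscel, and the `"Unselected"` placeholder label and the helper name `point_in_polygon` are assumptions made for illustration.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    polygon: list of (x, y) vertices in drawing order.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Toy 2-D embedding coordinates for four cells.
coords = [(0.5, 0.5), (2.0, 2.0), (0.2, 0.8), (5.0, 5.0)]
manual_selection = ["Unselected"] * len(coords)  # placeholder label is an assumption

# A square lasso drawn around the unit square.
lasso = [(0, 0), (1, 0), (1, 1), (0, 1)]
selected = [i for i, (x, y) in enumerate(coords) if point_in_polygon(x, y, lasso)]

# Confirming the selection stores it as the next group.
for i in selected:
    manual_selection[i] = "Group 1"
```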

3-D projections:

1. Rotate the 3-D scatter to any viewing angle (or use the elevation/azimuth/zoom sliders).
2. Click Snapshot Current View. A 2-D projection of that camera angle appears below.
3. Draw a lasso on the 2-D projection. The cell count updates.
4. Click Confirm. The Group 1/2/3 workflow is the same as for 2-D.
5. Click Clear to reset the projection panel.
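
One way to picture the snapshot step is as an orthographic projection of the 3-D coordinates onto the plane facing the camera. The sketch below works under that assumption; PyViscel may compute the snapshot differently, and perspective (Plotly's default 3-D projection type) is ignored here. The function names are hypothetical.

```python
import math


def view_basis(elevation_deg, azimuth_deg):
    """Return the screen 'right' and 'up' unit vectors for a camera angle.

    The camera looks at the origin from the direction given by the angles,
    with the z axis treated as world-up.
    """
    el = math.radians(elevation_deg)
    az = math.radians(azimuth_deg)
    d = (math.cos(el) * math.cos(az), math.cos(el) * math.sin(az), math.sin(el))
    norm = math.hypot(d[0], d[1])
    if norm < 1e-12:
        raise ValueError("camera looks straight along the z axis; 'right' is undefined")
    right = (-d[1] / norm, d[0] / norm, 0.0)                    # world-up x d
    up = (-d[2] * d[0] / norm, -d[2] * d[1] / norm, norm)       # d x right
    return right, up


def project_points(points3d, elevation_deg, azimuth_deg):
    """Orthographically project (x, y, z) points to 2-D screen coordinates."""
    right, up = view_basis(elevation_deg, azimuth_deg)
    return [
        (sum(p[i] * right[i] for i in range(3)), sum(p[i] * up[i] for i in range(3)))
        for p in points3d
    ]


# Camera on the +x axis, level with the data: +y appears to the right, +z on top.
snapshot = project_points([(0, 1, 0), (0, 0, 1), (1, 0, 0)], 0, 0)
```

The resulting 2-D coordinates can then be lassoed exactly like an ordinary 2-D embedding.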

Cell Composition Tab

A sub-tab within the Explorer for cross-tabulating two metadata columns.

Select a Row variable and Column variable from adata.obs
Numeric columns are binned automatically (configurable bin count)
Results appear as an annotated heatmap (raw counts or row/column/total-normalised proportions) and a sortable data table
Download CSV exports the cross-tabulation matrix
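
The cross-tabulation described above can be sketched in plain Python. This is an illustration rather than PyViscel's implementation: equal-width binning, the function names, and the `normalize` option names are assumptions. In practice `pandas.cut` and `pandas.crosstab` cover the same ground.

```python
from collections import Counter


def bin_numeric(values, n_bins):
    """Assign each value to one of n_bins equal-width bins (0-based index)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant columns
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def crosstab(rows, cols, normalize=None):
    """Count co-occurrences of row and column labels.

    normalize: None for raw counts, or "row", "column", "total" for proportions.
    """
    counts = Counter(zip(rows, cols))
    row_keys, col_keys = sorted(set(rows)), sorted(set(cols))
    table = {r: {c: counts.get((r, c), 0) for c in col_keys} for r in row_keys}
    if normalize is None:
        return table

    row_tot = {r: sum(table[r].values()) for r in row_keys}
    col_tot = {c: sum(table[r][c] for r in row_keys) for c in col_keys}
    total = len(rows)

    def denominator(r, c):
        return {"row": row_tot[r], "column": col_tot[c], "total": total}[normalize]

    return {r: {c: table[r][c] / denominator(r, c) for c in col_keys} for r in row_keys}


# Toy metadata: a categorical column and a numeric column binned into 2 bins.
cluster = ["A", "A", "B", "B", "B"]
n_genes = [100, 200, 300, 900, 1000]
gene_bin = bin_numeric(n_genes, n_bins=2)

raw = crosstab(cluster, gene_bin)
by_row = crosstab(cluster, gene_bin, normalize="row")
```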

Differential Expression Tab

Compares a selected group of cells against a background using:

Chi-square — fast, good for detecting marker genes
Mann-Whitney U — non-parametric, robust
sSeq — negative-binomial model (closest to edgeR/DESeq2 behaviour)
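
To give a feel for the simplest of these tests, here is a sketch of a chi-square test on a 2×2 table for a single gene. How PyViscel builds its contingency table is not stated here; counting cells that do or do not express the gene in each group is an assumption, and no continuity correction is applied.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Assumed layout: a, b = expressing / non-expressing cells in the selected
    group; c, d = the same counts in the background. Returns (chi2, p_value).
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0  # a row or column is empty; the test is uninformative
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Toy counts: 40 of 50 selected cells express the gene vs 10 of 50 background cells.
chi2, p_value = chi_square_2x2(40, 10, 10, 40)
```

Because the statistic needs only four counts per gene, it is cheap to evaluate across every gene, which is consistent with the "fast" description above.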

The DE panel shows:

Group 1 DEGs — genes upregulated in the selected group (log2FC > 0)
Group 2 DEGs — genes upregulated in the background group (log2FC < 0, displayed as positive fold-change)
A scatter plot coloured by group membership on the selected projection
A gene expression scatter for any gene you search in the DE results
A heatmap of the top significant genes

Results are sortable and downloadable as CSV.
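
The sign convention for the two DEG lists can be made concrete with a small sketch. The gene names, values, and the `log2FC` key are illustrative assumptions, not PyViscel's actual output schema.

```python
# Toy DE table; positive log2FC means higher expression in the selected group.
de_results = [
    {"gene": "GeneA", "log2FC": 2.0},
    {"gene": "GeneB", "log2FC": -1.5},
    {"gene": "GeneC", "log2FC": 0.5},
    {"gene": "GeneD", "log2FC": -3.0},
]


def split_by_direction(rows):
    """Split DE rows into Group 1 and Group 2 lists, strongest change first.

    Group 2 fold-changes are sign-flipped so both lists show positive values.
    """
    group1 = [r for r in rows if r["log2FC"] > 0]
    group2 = [{**r, "log2FC": -r["log2FC"]} for r in rows if r["log2FC"] < 0]

    def fold_change(row):
        return row["log2FC"]

    return (
        sorted(group1, key=fold_change, reverse=True),
        sorted(group2, key=fold_change, reverse=True),
    )


group1_degs, group2_degs = split_by_direction(de_results)
```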

Enrichment Tab

A full ORA and GSEA Prerank suite, run through gseapy with Enrichr gene set libraries.

Mode

ORA (Over-Representation Analysis): tests which gene sets are enriched in the Group 1 and Group 2 DE gene lists in a single run; results are shown side by side as dotplots and sortable tables.
GSEA Prerank: ranks all genes by signed log₂FC (positive for Group 1, negative for Group 2) and runs GSEA Prerank on the full ranked list; mountain plots are shown for the top enriched terms.
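
The two modes differ mainly in their input: ORA takes a gene list, GSEA Prerank takes a ranking of all genes. The sketch below shows the idea behind each input. It is not the gseapy or Enrichr code path: the over-representation p-value is computed as a hypergeometric tail (equivalent to a one-sided Fisher exact test) with an explicit background size, and the enrichment score that GSEA derives from the ranking is not computed. Function names are hypothetical.

```python
import math


def hypergeom_tail(k, n, K, N):
    """P(overlap >= k) when n genes are drawn from N, of which K are in the set."""
    total = math.comb(N, n)
    upper = min(K, n)
    return sum(math.comb(K, i) * math.comb(N - K, n - i) for i in range(k, upper + 1)) / total


def prerank(log2fc_by_gene):
    """Order genes from most Group 1-biased to most Group 2-biased."""
    return sorted(log2fc_by_gene.items(), key=lambda item: item[1], reverse=True)


# ORA toy example: 3 of 4 DE genes fall in a 5-gene set, with a 20-gene background.
p_ora = hypergeom_tail(k=3, n=4, K=5, N=20)

# Prerank toy example: signed log2FC for every gene, not only significant ones.
ranked = prerank({"GeneA": 2.0, "GeneB": -1.5, "GeneC": 0.5})
```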

Organisms supported: Human (hsa), Mouse (mmu), Fly (dme), Zebrafish (dre), Yeast (sce), Worm (cel).

Gene set types

Type	Description	Availability
BP	GO Biological Process	All organisms
MF	GO Molecular Function	All organisms
CC	GO Cellular Component	All organisms
All GO	BP + MF + CC combined	All organisms
KEGG	KEGG Pathways	All organisms
WikiPathways	WikiPathways	All organisms
MSigDB Hallmark	MSigDB Hallmark gene sets	Human & Mouse only
Reactome	Reactome Pathways	Human & Mouse only
All	All of the above	Human & Mouse only

Mouse MSigDB/Reactome results are obtained by first converting mouse gene symbols to human orthologs via the mygene.info API, then querying Enrichr — no local file required.

Controls

Fast mode checkbox — runs GSEA with 100 permutations instead of 1000 for quick exploration
Run Enrichment / Run GSEA — results and any error messages appear immediately below
Download CSV — exports ORA results (both groups) or GSEA results as .csv

Modules

Module	Description
`pyviscel.io`	Load/save `.h5ad`, validate schema, list cellos and projections
`pyviscel.cello_class`	`Cello` and `CelloCollection` — named cell subsets with projections
`pyviscel.dim_reduction`	PCA, t-SNE, UMAP (stored in `adata.obsm`)
`pyviscel.clustering`	k-NN graph construction, Leiden / Louvain / density clustering
`pyviscel.differential_expression`	Chi-square, Mann-Whitney U, sSeq NB DE tests
`pyviscel.enrichment`	ORA and GSEA Prerank via gseapy (Enrichr); mouse→human ortholog conversion via mygene.info
`pyviscel.plotting`	Plotly scatter plots, expression plots, enrichment dotplots, GSEA mountain plots, crosstab heatmaps
`pyviscel.heatmap`	Annotated gene expression heatmap (log → z-score → cluster)
`pyviscel.ui_components`	Reusable Dash/DBC layout components
`pyviscel.app`	Full Dash application with Explorer (including annotation and cell composition), Differential Expression, and Enrichment tabs
`pyviscel.convert`	Utilities for converting R VisCello objects to AnnData
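
The "log → z-score" part of the heatmap pipeline listed above can be sketched as follows. This is an illustration only: the use of `log1p`, the population standard deviation, and the function name are assumptions, and the final clustering step is left out.

```python
import math
import statistics


def log_zscore(expression_row):
    """Log-transform one gene's expression values, then z-score them across cells."""
    logged = [math.log1p(v) for v in expression_row]
    mean = statistics.fmean(logged)
    sd = statistics.pstdev(logged)
    if sd == 0:
        return [0.0] * len(logged)  # constant genes carry no contrast
    return [(v - mean) / sd for v in logged]


# One gene across five cells (toy counts).
z = log_zscore([0, 1, 3, 7, 15])
```

Scaling each gene separately puts genes with very different expression levels on a common colour scale before rows and columns are clustered.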

Development

pip install -e ".[dev]"
pytest tests/ -q          # 412 tests

Tests cover all analysis modules and the Dash app layout/callbacks.

Known Limitations / Upcoming Work

3-D camera-angle selection: snapshot projection works; minor visual edge cases remain (being fixed)
dash_table.DataTable deprecation warning from Dash — no functional impact; migration to dash-ag-grid planned
Enrichr ORA uses the built-in Enrichr background, not a custom gene universe (Enrichr API limitation); use compute_go_offline() directly for custom-background ORA
GSEA Prerank with 1000 permutations can take several minutes; use Fast mode (100 permutations) for exploratory work
Mouse MSigDB/Reactome requires internet access (mygene.info ortholog lookup); offline runs will fail for these two types
