Steamboat

Steamboat is an interpretable machine learning framework leveraging a self-supervised, multi-head attention model that uniquely decomposes the gene expression of a cell into multiple key factors:

intrinsic cell programs,
neighboring cell communication, and
long-range interactions.

These factors are used to generate cell embeddings, cell networks, and reconstructed gene expression.

System requirements

Hardware

Steamboat can run on a laptop, desktop, or server. The experiments were done on a desktop computer with a 6-core Ryzen 5 3600 CPU and an RTX 3080 GPU. A GPU can significantly reduce the time needed to train the models.

Operating system

Steamboat is Python-based and runs on all mainstream operating systems. It has been tested on Windows 10 and Springdale Linux.

Software dependencies

Package	Tested with
Python	3.11.5
PyTorch	2.1.2 (with CUDA 12.1)
Scanpy	1.9.6
Squidpy	1.5.0
Scipy	1.11.4
Numpy	1.26.2
Networkx	3.1
Matplotlib	3.8.0
Seaborn	0.13.2
Scikit-learn	1.2.2
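
To compare an existing environment against the versions listed above, a small check along these lines may help. This is only a sketch and not part of Steamboat: `TESTED` and `report_versions` are hypothetical names, and the keys are the PyPI distribution names assumed for the packages in the table (for example `torch` and `scikit-learn`).

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the table above; keys are the assumed PyPI distribution names.
TESTED = {
    "torch": "2.1.2",
    "scanpy": "1.9.6",
    "squidpy": "1.5.0",
    "scipy": "1.11.4",
    "numpy": "1.26.2",
    "networkx": "3.1",
    "matplotlib": "3.8.0",
    "seaborn": "0.13.2",
    "scikit-learn": "1.2.2",
}


def report_versions(tested=TESTED):
    """Return {package: (installed_version_or_None, tested_version)}."""
    report = {}
    for name, tested_version in tested.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (installed, tested_version)
    return report


if __name__ == "__main__":
    for name, (installed, tested_version) in report_versions().items():
        status = "missing" if installed is None else installed
        print(f"{name:<14} installed: {status:<12} tested with: {tested_version}")
```

A mismatch does not necessarily mean Steamboat will fail; the table only records the versions that were tested.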

Installation

We recommend using Miniconda to create a virtual environment.

conda create -n steamboat python=3.11
conda activate steamboat

Please follow the official guide to install the appropriate PyTorch version for your system and hardware. Then install the required packages with pip install -r requirements.txt.

Steamboat can be imported directly after adding its directory to the path.

git clone https://github.com/ma-compbio/Steamboat

import sys
sys.path.append("/path/of/the/cloned/repository")
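
A slightly more defensive variant of the snippet above checks that the path really points at a clone before adding it. This is a sketch: `add_repo_to_path` is a hypothetical helper, and it assumes the clone contains a `steamboat` package folder at its top level.

```python
import sys
from pathlib import Path


def add_repo_to_path(repo_dir, package="steamboat"):
    """Append repo_dir to sys.path if it contains the expected package folder."""
    repo = Path(repo_dir).expanduser().resolve()
    if not (repo / package).is_dir():
        raise FileNotFoundError(f"No '{package}' folder found in {repo}")
    if str(repo) not in sys.path:
        sys.path.append(str(repo))
    return repo


# Example usage (replace the placeholder with the location of your clone):
# add_repo_to_path("/path/of/the/cloned/repository")
# import steamboat as sf
```

Failing early with a clear error tends to be easier to debug than an ImportError raised later by import steamboat.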

Depending on your network, installation may take 10 to 30 minutes.

Basic workflow

import steamboat as sf # "sf" = "Steamboat Factorization"
import steamboat.tools

First, make a list (adatas) of one or more AnnData objects, and preprocess them.

adatas = sf.prep_adatas(adatas, log_norm=True)
dataset = sf.make_dataset(adatas)

Create a Steamboat model and fit it to the data.

model = sf.Steamboat(short_features, n_heads=10, n_scales=3)
model = model.to("cuda") # if GPU acceleration is supported
model.fit(dataset)
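
If the same script should run on machines with and without a GPU, the hard-coded "cuda" above can be replaced by a device chosen at runtime. The sketch below uses only `torch.cuda.is_available()`; `pick_device` is a hypothetical helper and not part of Steamboat.

```python
def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed, so no GPU check is possible
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Using device: {device}")

# Then, before calling fit (here `model` is the Steamboat model created above):
# model = model.to(device)
```

Training on the CPU should still work with this fallback, although it is likely to be slower.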

After training, you can check the trained metagenes.

sf.tools.plot_all_transforms(model, top=1)

For clustering and segmentation, run the following lines. Change the resolution to your liking.

sf.tools.neighbors(adata)
sf.tools.leiden(adata, resolution=0.1)
sf.tools.segment(adata, resolution=0.5)

Demos

A few example Jupyter notebooks are included in the examples folder.

The simulation demo takes about five minutes to run. Training on the mouse brain data takes one hour. Other demos take about ten minutes each.

The data used in these examples are available on Google Drive.

Documentation

For the full API and real data examples, please visit our documentation.
