This guide shows how to use Chorus with the new modular environment system that isolates each oracle's dependencies.
# Clone the repository
git clone https://github.com/pinellolab/chorus.git
cd chorus
# Create and activate the minimal base environment
mamba env create -f environment.yml
mamba activate chorus
# Install Chorus
pip install -e .

# List all available oracle environments
chorus list
# Output:
# Available oracle environments:
# - chorus-enformer: TensorFlow-based environment for Enformer oracle
# - chorus-borzoi: PyTorch-based environment for Borzoi oracle
# - chorus-chrombpnet: TensorFlow-based environment for ChromBPNet oracle
# - chorus-sei: PyTorch-based environment for Sei oracle

# Set up all environments (this may take a while)
chorus setup --all
# Or set up specific oracles only
chorus setup --oracle enformer
chorus setup --oracle sei
# Check setup status
chorus validate

Chorus includes utilities for automatically downloading and managing reference genomes:
# List available genomes
chorus genome list
# Download a reference genome
chorus genome download hg38
# Get genome information
chorus genome info hg38
# Remove a genome
chorus genome remove hg38

In Python, genomes are downloaded automatically when needed:
import chorus
from chorus.utils import get_genome
# Get genome path (downloads automatically if not present)
genome_path = get_genome('hg38')  # Default genome
# Use with oracles
oracle = chorus.create_oracle('enformer',
                              use_environment=True,
                              reference_fasta=str(genome_path))

import chorus
# Method 1: Automatic environment management
oracle = chorus.create_oracle('enformer', use_environment=True)
oracle.load_pretrained_model() # Runs in isolated environment
# Get reference genome (auto-downloads if needed)
from chorus.utils import get_genome
genome_path = get_genome('hg38')
# Make predictions
results = oracle.predict_region_replacement(
    genomic_region="chr1:1000000-1200000",
    seq="",
    assay_ids=["DNase:K562"],
    genome=str(genome_path)
)
# Method 2: Manual environment specification
from chorus.oracles.enformer_env import EnformerOracleEnv
oracle = EnformerOracleEnv(use_environment=True)

# Check environment health
chorus health
# Remove an environment
chorus remove --oracle enformer
# Run tests in isolated environments
chorus test --oracle enformer

- No Dependency Conflicts: TensorFlow and PyTorch models can coexist
- Smaller Base Install: Only install what you need
- Easy Updates: Update individual oracle environments without affecting others
- Clean Uninstall: Remove oracle environments individually
# If you get "Environment chorus-enformer not found"
chorus setup --oracle enformer

The first time you use an oracle with use_environment=True, it may be slower because the conda environment has to be activated.
Each oracle runs in a separate Python process when using isolated environments, which may use more memory.
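
To make the separate-process model concrete, here is a minimal, generic sketch of running work in a child Python interpreter and exchanging data as JSON. It is not Chorus's actual implementation: `run_isolated` and `WORKER` are hypothetical names, and the worker computes a trivial GC fraction rather than a model prediction. The sketch assumes that, for a conda environment, `python_executable` would point at that environment's interpreter.

```python
import json
import subprocess
import sys


def run_isolated(code: str, payload: dict, python_executable: str = sys.executable) -> dict:
    """Run `code` in a separate Python process and return its JSON reply.

    Hypothetical helper for illustration only. The payload is sent on stdin
    and the child is expected to write a single JSON document to stdout.
    """
    proc = subprocess.run(
        [python_executable, "-c", code],
        input=json.dumps(payload),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the child exits non-zero
    )
    return json.loads(proc.stdout)


# Stand-in for the oracle-side code: it only sees what arrives on stdin,
# so its imports never touch the parent process.
WORKER = """
import json, sys
request = json.load(sys.stdin)
seq = request["seq"]
gc = sum(seq.count(base) for base in "GC") / len(seq)
json.dump({"length": len(seq), "gc": gc}, sys.stdout)
"""

result = run_isolated(WORKER, {"seq": "ACGTGGCC"})
print(result)
```

Because the child is a full interpreter with its own copy of every imported library, each active oracle adds its own memory footprint on top of the parent process.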
If you want to run without isolation (requires manual dependency management):
oracle = chorus.create_oracle('enformer', use_environment=False)

- Base (chorus): ~500MB, core functionality only
- Enformer (chorus-enformer): ~2GB, includes TensorFlow
- Borzoi (chorus-borzoi): ~2.5GB, includes PyTorch
- ChromBPNet (chorus-chrombpnet): ~2GB, includes TensorFlow
- Sei (chorus-sei): ~2.5GB, includes PyTorch
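
As a quick planning aid, the approximate sizes listed above can be summed for the oracles you intend to install. This is a small sketch: `ENV_SIZES_GB` and `estimate_disk_gb` are hypothetical names, the figures are the rough values from the list (with 500MB treated as 0.5GB), and actual sizes will vary by platform and package versions.

```python
# Approximate environment sizes in GB, copied from the list above.
ENV_SIZES_GB = {
    "chorus": 0.5,
    "chorus-enformer": 2.0,
    "chorus-borzoi": 2.5,
    "chorus-chrombpnet": 2.0,
    "chorus-sei": 2.5,
}


def estimate_disk_gb(oracles):
    """Return the rough disk usage of the base environment plus the given oracles."""
    total = ENV_SIZES_GB["chorus"]  # the base environment is always required
    for name in oracles:
        total += ENV_SIZES_GB[f"chorus-{name}"]
    return total


print(estimate_disk_gb(["enformer", "sei"]))  # 0.5 + 2.0 + 2.5 = 5.0
print(estimate_disk_gb(["enformer", "borzoi", "chrombpnet", "sei"]))  # 9.5
```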
- Try the example notebooks with the modular system
- Set up only the oracles you need
- Use chorus health to monitor environment status
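
Alongside chorus health, you can check directly which oracle environments conda knows about. The sketch below parses the output of `conda env list --json`, which reports environment prefixes under an `envs` key. The helper names are hypothetical, the expected environment names are taken from the `chorus list` output shown earlier, and the live call assumes a `conda` executable on your PATH.

```python
import json
import shutil
import subprocess

# Environment names as shown in the `chorus list` output in this guide.
EXPECTED_ENVS = [
    "chorus",
    "chorus-enformer",
    "chorus-borzoi",
    "chorus-chrombpnet",
    "chorus-sei",
]


def env_names_from_json(conda_json: str) -> set:
    """Extract environment names from `conda env list --json` output."""
    prefixes = json.loads(conda_json).get("envs", [])
    # The environment name is the last path component of its prefix.
    return {p.replace("\\", "/").rstrip("/").rsplit("/", 1)[-1] for p in prefixes}


def missing_envs(conda_json: str, expected=EXPECTED_ENVS) -> list:
    """Return the expected environments that conda does not list."""
    present = env_names_from_json(conda_json)
    return [name for name in expected if name not in present]


def list_envs_json():
    """Return live `conda env list --json` output, or None if conda is not found."""
    exe = shutil.which("conda")
    if exe is None:
        return None
    done = subprocess.run(
        [exe, "env", "list", "--json"],
        capture_output=True, text=True, check=True,
    )
    return done.stdout


# Demonstration on a sample payload, so the example runs without conda.
sample = json.dumps({"envs": [
    "/opt/miniforge3",
    "/opt/miniforge3/envs/chorus",
    "/opt/miniforge3/envs/chorus-enformer",
]})
print(missing_envs(sample))  # ['chorus-borzoi', 'chorus-chrombpnet', 'chorus-sei']
```

For any oracle environment reported as missing, running chorus setup --oracle with the corresponding oracle name creates it.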