SeuratDisk Should Be Published on CRAN or Bioconductor #194

@nick-youngblut

Summary

SeuratDisk has been an essential tool for the single-cell community for years, but its absence from CRAN/Bioconductor creates significant deployment and reproducibility challenges. The requirement to install from GitHub using remotes::install_github() leads to fragile builds that break unpredictably across different environments.

The Problem with remotes::install_github()

When installing SeuratDisk from GitHub, remotes::install_github():

  1. Compiles from source - Even when binary packages are available via package managers like r2u
  2. Recompiles dependencies - Upgrades and recompiles dependencies (RANN, sp, etc.) that are already installed as system binaries
  3. Creates library version conflicts - Compiled packages go into /usr/local/lib/R/site-library/, which comes earlier on the library search path and therefore shadows the system packages in /usr/lib/R/site-library/
  4. Requires newer system libraries - Freshly compiled packages may require GLIBC_2.38 or GLIBCXX_3.4.32 that aren't available in stable Linux distributions (e.g., Ubuntu 22.04 LTS ships with GLIBC 2.35)
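
One way to check for the library conflict described in point 3 is to list packages that are installed in more than one library. This is a minimal base-R sketch, not part of SeuratDisk or remotes; the helper name find_shadowed_packages is hypothetical, and it assumes the default behaviour where R loads the copy from the earliest entry in .libPaths().

```r
# Hypothetical helper: report packages that exist in more than one library
# on the search path. Assumption: the copy in the earliest library listed in
# .libPaths() is the one R loads, so later copies are shadowed.
find_shadowed_packages <- function(lib_paths = .libPaths()) {
  ip <- installed.packages(lib.loc = lib_paths)
  ip <- ip[, c("Package", "LibPath", "Version"), drop = FALSE]
  rownames(ip) <- NULL  # avoid duplicate row names when converting
  pkgs <- as.data.frame(ip, stringsAsFactors = FALSE)

  # Names that appear in two or more libraries
  dup_names <- unique(pkgs$Package[duplicated(pkgs$Package)])
  shadowed <- pkgs[pkgs$Package %in% dup_names, , drop = FALSE]

  # Sort by package, then by position of the library on the search path
  ord <- order(shadowed$Package, match(shadowed$LibPath, lib_paths))
  shadowed[ord, , drop = FALSE]
}

print(find_shadowed_packages(), row.names = FALSE)
```

If RANN or sp shows up under both /usr/local/lib/R/site-library and /usr/lib/R/site-library, the source-compiled copy is likely the one being loaded.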

Real-World Impact

We encountered this building a Docker image based on rocker/r2u:jammy:

Error: package or namespace load failed for 'SeuratDisk' in dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object '/usr/local/lib/R/site-library/RANN/libs/RANN.so':
  /lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.32' not found
  
Error: package or namespace load failed for 'sp' in dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object '/usr/local/lib/R/site-library/sp/libs/sp.so':
  /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.38' not found

Even though RANN, sp, Seurat, and all dependencies were installed as pre-compiled binaries from r2u, remotes::install_github() silently recompiled them during SeuratDisk installation, creating binaries incompatible with the host system.

Why This Matters

  1. Reproducibility Crisis - Scientific workflows that worked yesterday can break today when dependencies update
  2. Docker Build Fragility - Container builds fail unpredictably based on the compile environment
  3. Maintenance Burden - Every downstream project must implement workarounds (dependencies=FALSE, pinning versions, etc.)
  4. Onboarding Friction - New users struggle with installation issues that have nothing to do with SeuratDisk itself

CRAN/Bioconductor Benefits

Publishing on CRAN or Bioconductor would provide:

  • Binary packages for Windows, macOS, and Linux (via r2u, RSPM, etc.)
  • Stable releases with version tracking
  • Dependency resolution that respects already-installed packages
  • Automated testing across R versions and platforms
  • Trust and discoverability in the R community

Current Workarounds

Projects currently resort to:

  • remotes::install_github(..., dependencies=FALSE) - fragile, requires all deps pre-installed
  • Pinning to specific commits - breaks when dependencies evolve
  • Building from source with exact system library versions
  • Avoiding SeuratDisk entirely and using alternative formats
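
For reference, the first two workarounds can be combined roughly as follows. This is a sketch under assumptions, not a recommended fix: it assumes the GitHub repository is mojaveazure/seurat-disk, that remotes and every SeuratDisk dependency are already installed as binaries, and that ref is replaced with a real commit SHA when pinning (the value below just tracks master).

```r
# Sketch of the dependencies = FALSE workaround. Assumptions: the repository
# slug is "mojaveazure/seurat-disk" and all dependencies are already installed.
if (!requireNamespace("remotes", quietly = TRUE)) {
  stop("The 'remotes' package must be installed first.")
}

remotes::install_github(
  "mojaveazure/seurat-disk",
  ref = "master",        # replace with a specific commit SHA to pin a version
  dependencies = FALSE,  # do not install or recompile any dependencies
  upgrade = "never"      # do not upgrade already-installed packages
)
```

This still compiles SeuratDisk itself from source, and the install fails at load time if any dependency is missing, which is why it remains fragile.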

Request

Please publish SeuratDisk on CRAN or Bioconductor.

According to the SeuratDisk documentation, the package has been stable for years and is widely used. Given that Seurat and SeuratObject are both on CRAN, SeuratDisk should join them.

If there are blockers (licensing, maintainer bandwidth, dependency issues), please document them so the community can help resolve them.

The single-cell community would greatly benefit from having SeuratDisk available through standard R package distribution channels.

Environment:

  • R version: 4.5.1
  • Platform: Ubuntu 22.04 (Jammy) in Docker
  • SeuratDisk version: 0.0.0.9021 (from master)
