1 | | -# Seurat-Integrate |
2 | | -R package gathering a set of wrappers to apply various integration methods to Seurat objects and rate integration obtained with such methods |
| 1 | +# SeuratIntegrate <!-- omit in toc --> |
| 2 | +R package expanding integrative analysis capabilities of Seurat by providing seamless access to popular integration methods. It also implements an integration benchmarking toolkit that gathers well-established performance metrics to help select the most appropriate integration. |
| 3 | + |
| 4 | +Examples, documentation, memos, etc. are available on SeuratIntegrate's [website](https://cbib.github.io/Seurat-Integrate/). |
| 5 | + |
| 6 | +SeuratIntegrate supports both R- and Python-based integration methods. The table below lists the methods compatible with SeuratIntegrate: |
| 7 | + |
| 8 | +<table id="table1" style="margin: 0px auto;"><thead> |
| 9 | + <caption>Table 1: Supported integration methods</caption> |
| 10 | + <tr> |
| 11 | + <th></th> |
| 12 | + <th>Package</th> |
| 13 | + <th>Method</th> |
| 14 | + <th>Function</th> |
| 15 | + </tr></thead> |
| 16 | +<tbody> |
| 17 | + <tr> |
| 18 | + <td rowspan="6">R</td> |
| 19 | + <td rowspan="3">SeuratIntegrate</td> |
| 20 | + <td><a href="https://www.bioconductor.org/packages/release/bioc/html/sva.html" target="_blank" rel="noopener noreferrer">ComBat</a></td> |
| 21 | + <td><code>CombatIntegration()</code></td> |
| 22 | + </tr> |
| 23 | + <tr> |
| 24 | + <td><a href="https://cran.r-project.org/web/packages/harmony/index.html" target="_blank" rel="noopener noreferrer">Harmony</a></td> |
| 25 | + <td><code>HarmonyIntegration()</code></td> |
| 26 | + </tr> |
| 27 | + <tr> |
| 28 | + <td><a href="https://www.bioconductor.org/packages/release/bioc/html/batchelor.html" target="_blank" rel="noopener noreferrer">MNN</a></td> |
| 29 | + <td><code>MNNIntegration()</code></td> |
| 30 | + </tr> |
| 31 | + <tr> |
| 32 | + <td rowspan="2">Seurat</td> |
| 33 | + <td><a href="https://cran.r-project.org/web/packages/Seurat/index.html" target="_blank" rel="noopener noreferrer">CCA</a></td> |
| 34 | + <td><code>CCAIntegration()</code></td> |
| 35 | + </tr> |
| 36 | + <tr> |
| 37 | + <td><a href="https://cran.r-project.org/web/packages/Seurat/index.html" target="_blank" rel="noopener noreferrer">RPCA</a></td> |
| 38 | + <td><code>RPCAIntegration()</code></td> |
| 39 | + </tr> |
| 40 | + <tr> |
| 41 | + <td>SeuratWrappers</td> |
| 42 | + <td><a href="https://github.com/satijalab/seurat-wrappers" target="_blank" rel="noopener noreferrer">FastMNN</a><br>(<a href="https://bioconductor.org/packages/release/bioc/html/batchelor.html" target="_blank" rel="noopener noreferrer">batchelor</a>)</td> |
| 43 | + <td><code>FastMNNIntegration()</code></td> |
| 44 | + </tr> |
| 45 | + <tr> |
| 46 | + <td rowspan="5">Python</td> |
| 47 | + <td rowspan="5">SeuratIntegrate</td> |
| 48 | + <td><a href="https://github.com/Teichlab/bbknn" target="_blank" rel="noopener noreferrer">BBKNN</a></td> |
| 49 | + <td><code>bbknnIntegration()</code></td> |
| 50 | + </tr> |
| 51 | + <tr> |
| 52 | + <td><a href="https://github.com/scverse/scvi-tools" target="_blank" rel="noopener noreferrer">scVI</a></td> |
| 53 | + <td><code>scVIIntegration()</code></td> |
| 54 | + </tr> |
| 55 | + <tr> |
| 56 | + <td><a href="https://github.com/scverse/scvi-tools" target="_blank" rel="noopener noreferrer">scANVI</a></td> |
| 57 | + <td><code>scANVIIntegration()</code></td> |
| 58 | + </tr> |
| 59 | + <tr> |
| 60 | + <td><a href="https://github.com/brianhie/scanorama" target="_blank" rel="noopener noreferrer">Scanorama</a></td> |
| 61 | + <td><code>ScanoramaIntegration()</code></td> |
| 62 | + </tr> |
| 63 | + <tr> |
| 64 | + <td><a href="https://github.com/theislab/scarches" target="_blank" rel="noopener noreferrer">trVAE</a></td> |
| 65 | + <td><code>trVAEIntegration()</code></td> |
| 66 | + </tr> |
| 67 | +</tbody></table> |
| 68 | + |
| 69 | + |
| 70 | +- [Installation](#installation) |
| 71 | +- [Preparations](#preparations) |
| 72 | + - [Setup Python environments](#setup-python-environments) |
| 73 | + - [Setup a `SeuratObject`](#setup-a-seuratobject) |
| 74 | + - [Facultative dependencies](#facultative-dependencies) |
| 75 | +- [SeuratIntegrate usage](#seuratintegrate-usage) |
| 76 | + - [Integrate datasets](#integrate-datasets) |
| 77 | + - [Post-process integration outputs](#post-process-integration-outputs) |
| 78 | + - [Compare integrations](#compare-integrations) |
| 79 | +- [Getting help and advice](#getting-help-and-advice) |
| 80 | +- [Citing](#citing) |
| 81 | + |
3 | 82 |
4 | 83 | ## Installation |
5 | | -### 1) Remotely from github with R |
| 84 | + |
6 | 85 | Install SeuratIntegrate from github directly: |
7 | 86 | ```R |
8 | | -install.packages(c("remotes", "BiocManager")) |
9 | | -remotes::install_github("cbib/Seurat-Integrate", dependancies = NA, repos = BiocManager::repositories()) |
| 87 | +if (!require("BiocManager", quietly = TRUE)) |
| 88 | + install.packages("BiocManager") |
| 89 | +if (!require("remotes", quietly = TRUE)) |
| 90 | + install.packages("remotes") |
| 91 | +remotes::install_github("cbib/Seurat-Integrate", dependencies = NA, repos = BiocManager::repositories()) |
| 92 | +``` |
| 93 | + |
| 94 | +## Preparations |
| 95 | + |
| 96 | +### Setup Python environments |
| 97 | + |
| 98 | +To use Python methods, run the following commands (once) to set up the necessary conda environments: |
| 99 | + |
| 100 | +```R |
| 101 | +library(SeuratIntegrate) |
| 102 | + |
| 103 | +# Create environments |
| 104 | +UpdateEnvCache("bbknn") |
| 105 | +UpdateEnvCache("scvi") # also scANVI |
| 106 | +UpdateEnvCache("scanorama") |
| 107 | +UpdateEnvCache("trvae") |
| 108 | + |
| 109 | +# Show cached environments |
| 110 | +getCache() |
| 111 | +``` |
| 112 | + |
| 113 | +Environments are persistently stored in the cache and the **`UpdateEnvCache()` commands should not need to be executed again**. |
| 114 | + |
| 115 | +While these environments should work well in most cases, conda occasionally runs into dependency conflicts, in which case manual adjustment might be needed. You may find helpful information in [this vignette](https://cbib.github.io/Seurat-Integrate/articles/setup_and_tips.html#troubleshouting-with-conda). |
| 116 | + |
| 117 | +### Setup a `SeuratObject` |
| 118 | + |
| 119 | +To integrate data with SeuratIntegrate, you need to preprocess your `SeuratObject` until you obtain at least a PCA. Importantly, the `SeuratObject` must have its **layers split by batches**. |
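The preparation described above can be sketched as follows. This is a minimal sketch using standard Seurat v5 calls, not code taken from SeuratIntegrate: the helper name `prepare_for_integration()`, the metadata column name `"batch"` and the `"RNA"` assay name are assumptions to adapt to your data.

```R
# Hypothetical helper (not part of SeuratIntegrate or Seurat).
# Assumes a Seurat v5 object with an "RNA" assay and a metadata column
# (here "batch", an assumed name) identifying the batch of each cell.
prepare_for_integration <- function(seu, batch_col = "batch", npcs = 30) {
  # Loading the Seurat namespace also registers the split() method for assays
  stopifnot(requireNamespace("Seurat", quietly = TRUE))

  # Split the assay into one layer per batch
  seu[["RNA"]] <- split(seu[["RNA"]], f = seu@meta.data[[batch_col]])

  # Standard preprocessing up to the PCA
  seu <- Seurat::NormalizeData(seu)
  seu <- Seurat::FindVariableFeatures(seu)
  seu <- Seurat::ScaleData(seu)
  seu <- Seurat::RunPCA(seu, npcs = npcs)
  seu
}

# Usage, assuming `seu` exists and has a "batch" column:
# seu <- prepare_for_integration(seu, batch_col = "batch")
```

Other normalisation workflows (e.g. SCTransform) follow the same logic: split the layers by batch first, then preprocess until a PCA is available.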
| 120 | + |
| 121 | +<details> |
| 122 | + <summary>Not familiar with Seurat?</summary> |
| 123 | + |
| 124 | + Have a look at Seurat's [website](https://satijalab.org/seurat/articles/get_started_v5_new), especially the tutorials covering [SCTransform](https://satijalab.org/seurat/articles/sctransform_vignette) and [integrative analyses](https://satijalab.org/seurat/articles/seurat5_integration). |
| 125 | +</details> |
| 126 | +<br> |
| 127 | + |
| 128 | +To fully benefit from the benchmarking toolkit, you'll need **cell-type annotations of sufficient quality** to be considered suitable as ground truth. |
| 129 | + |
| 130 | +### Facultative dependencies |
| 131 | + |
| 132 | +The benchmarking toolkit can benefit from additional dependencies: |
| 133 | +```R |
| 134 | +# required to test for k-nearest neighbour batch effects |
| 135 | +remotes::install_github('theislab/kBET') |
| 136 | + |
| 137 | +# fast distance computation |
| 138 | +install.packages('distances') |
| 139 | + |
| 140 | +# faster Local Inverse Simpson’s Index computation |
| 141 | +remotes::install_github('immunogenomics/lisi') |
| 142 | +``` |
| 143 | + |
| 144 | +## SeuratIntegrate usage |
| 145 | +### Integrate datasets |
| 146 | + |
| 147 | +When your `SeuratObject` is ready, you can launch multiple integrations (from [Table 1](#table1)) with a single command. `DoIntegrate()` provides a flexible interface to customise integration-specific parameters and to control the associated data and features. |
| 148 | + |
| 149 | +```R |
| 150 | +seu <- DoIntegrate(seu, |
| 151 | + # ... integrations |
| 152 | + CombatIntegration(layers = "data"), |
| 153 | + HarmonyIntegration(orig = "pca", dims = 1:30), |
| 154 | + ScanoramaIntegration(ncores = 4L, layers = "data"), |
| 155 | + scVIIntegration(layers = "counts", features = Features(seu)), |
| 156 | + # ... |
| 157 | + use.hvg = TRUE, # `VariableFeatures()` |
| 158 | + use.future = c(FALSE, FALSE, TRUE, TRUE) |
| 159 | +) |
10 | 160 | ``` |
11 | | -If you prefer, or if it does not work, you can use the alternative way described below. |
12 | | -### 2) From a local copy using the remotes R package |
13 | | -The installation process encompasses two steps, namely: |
14 | | - - Clone or download the repository or the [latest release](https://github.com/cbib/Seurat-Integrate/releases/tag/0.3.1) |
15 | | - - Install the package |
16 | | - |
17 | | -First off, go to `some/path/to/download/the/repository`. Then, clone or download the repository or a release. |
18 | | -```shell |
19 | | -# Example: clone |
20 | | -git clone [email protected]:cbib/Seurat-Integrate.git |
| 161 | + |
| 162 | +In this example, all integration methods will use the variable features as input, with the exception of `scVIIntegration()` which is set to use all features (`features = Features(seu)`). `CombatIntegration()` will correct the normalised counts (`layers = "data"`), while `scVIIntegration()` will train on raw counts (`layers = "counts"`). |
| 163 | + |
| 164 | +`use.future` must be **`TRUE` for Python methods**, and `FALSE` for R methods (see [Table 1](#table1)). |
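Rather than typing the `use.future` vector by hand, it can be derived from the method names. The base-R sketch below assumes the Python-based functions are exactly those listed in Table 1; `methods`, `python_methods` and `use_future` are illustrative variable names, not part of SeuratIntegrate.

```R
# Integration functions in the order they are passed to DoIntegrate()
methods <- c("CombatIntegration", "HarmonyIntegration",
             "ScanoramaIntegration", "scVIIntegration")

# Python-based methods, as listed in Table 1
python_methods <- c("bbknnIntegration", "scVIIntegration", "scANVIIntegration",
                    "ScanoramaIntegration", "trVAEIntegration")

# TRUE for Python methods, FALSE for R methods
use_future <- methods %in% python_methods
print(use_future)
# [1] FALSE FALSE  TRUE  TRUE
```

The resulting vector matches the one in the example above and could be passed as `use.future = use_future`.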
| 165 | + |
| 166 | +### Post-process integration outputs |
| 167 | + |
| 168 | +Integration methods produce one or several outputs. Because they can be of different types, the following table indicates the post-processing steps to generate a UMAP. |
| 169 | + |
| 170 | +<a id="table2"></a> |
| 171 | +<caption>Table 2: Output types and processing</caption> |
| 172 | + |
| 173 | +| **Output type** | **Object name** | **Processing** | |
| 174 | +|-----------------------|:---------------:|-------------------------------------------:| |
| 175 | +| Corrected counts | `Assay` | `ScaleData()` ➔ `RunPCA()` ➔ `RunUMAP()` | |
| 176 | +| Dimensional reduction | `DimReduc` | `RunUMAP()` | |
| 177 | +| KNN graph | `Graph` | `RunUMAP(umap.method = "umap-learn")` | |
| 178 | + |
| 179 | +Output types are summarized for each method in the [Memo vignette](https://cbib.github.io/Seurat-Integrate/articles/memo_integration.html) about integration methods. |
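The processing steps of Table 2 could be wrapped as shown below. This is a sketch built on standard Seurat functions only. The helper names are hypothetical, and the names of the objects produced by each integration (assay, reduction or graph) are not specified here, so they are left as arguments to fill in after inspecting your object.

```R
# Hypothetical helpers (not part of SeuratIntegrate). The `assay`, `reduction`
# and `graph` arguments are the names of integration outputs stored in `seu`;
# check your object to find the actual names.

# Corrected counts (Assay): ScaleData() -> RunPCA() -> RunUMAP()
# If the assay has no variable features set, supply `features` explicitly.
umap_from_assay <- function(seu, assay, npcs = 30, features = NULL) {
  pca_name <- paste0("pca.", assay)
  seu <- Seurat::ScaleData(seu, assay = assay, features = features)
  seu <- Seurat::RunPCA(seu, assay = assay, features = features,
                        npcs = npcs, reduction.name = pca_name)
  Seurat::RunUMAP(seu, reduction = pca_name, dims = seq_len(npcs),
                  reduction.name = paste0("umap.", assay))
}

# Dimensional reduction (DimReduc): RunUMAP()
umap_from_reduction <- function(seu, reduction, dims = 1:30) {
  Seurat::RunUMAP(seu, reduction = reduction, dims = dims,
                  reduction.name = paste0("umap.", reduction))
}

# KNN graph (Graph): RunUMAP(umap.method = "umap-learn")
# Requires the Python package umap-learn to be reachable from R.
umap_from_graph <- function(seu, graph) {
  Seurat::RunUMAP(seu, graph = graph, umap.method = "umap-learn",
                  reduction.name = paste0("umap.", graph))
}
```

Giving each UMAP its own `reduction.name` keeps the embeddings of different integrations side by side in the same object.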
| 180 | + |
| 181 | + |
| 182 | +### Compare integrations |
| 183 | + |
| 184 | +SeuratIntegrate incorporates 11 scoring metrics: 6 quantify the degree of batch mixing (*batch correction*), while 5 assess the preservation of biological differences (*bio-conservation*) based on ground truth cell type labels. |
| 185 | + |
| 186 | +To score your integrations, you must process their outputs as in the **Processing** column of [Table 2](#table2). You'll also need to get a graph by running `FindNeighbors(return.neighbor = TRUE)` (this [vignette](https://cbib.github.io/Seurat-Integrate/articles/SeuratIntegrate.html#post-processing) provides further guidance). |
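As an illustration of the neighbour search mentioned above, here is a minimal sketch for an integration that produced a dimensional reduction. The helper name and the `"knn."` naming scheme are assumptions. Only `FindNeighbors()` and its `reduction`, `dims`, `return.neighbor` and `graph.name` arguments come from Seurat.

```R
# Hypothetical helper (not part of SeuratIntegrate).
# `reduction` is the name of an integrated reduction stored in `seu`.
knn_from_reduction <- function(seu, reduction, dims = 1:30) {
  Seurat::FindNeighbors(
    seu,
    reduction = reduction,
    dims = dims,
    return.neighbor = TRUE,                  # store a Neighbor object
    graph.name = paste0("knn.", reduction)   # assumed naming scheme
  )
}

# Usage, assuming a reduction named "pca" exists (e.g. for the unintegrated data):
# seu <- knn_from_reduction(seu, reduction = "pca")
```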
| 187 | + |
| 188 | +Then, scores can be obtained using the `Score[score_name]()` functions, or saved directly in the Seurat object using the `AddScore[score_name]()` functions, as follows: |
| 189 | + |
| 190 | +```R |
| 191 | +# save the score in a variable |
| 192 | +rpca_score <- ScoreRegressPC(seu, reduction = "[dimension_reduction]") #e.g. "pca" |
| 193 | + |
| 194 | +# or save the score in the Seurat object |
| 195 | +seu <- AddScoreRegressPC(seu, integration = "[name_of_integration]", reduction = "[dimension_reduction]") |
21 | 196 | ``` |
22 | | -Depending on what you chose to download, you can either obtain the source code in a `Seurat-Integrate/` folder or a compressed version ending with `.targ.gz`. Regardless, you can use the command below: |
| 197 | +It is worth noting that the **unintegrated version must also be scored** to perform a complete comparative analysis. When scores have been computed, they can be used to compare the integration outputs. See this [vignette](https://cbib.github.io/Seurat-Integrate/articles/memo_score.html) for a complete overview of available scores. |
| 198 | + |
| 199 | +The advantage of the `AddScore` functions over the `Score` functions is that they store the scores in the Seurat object, which facilitates score scaling and plotting: |
| 200 | + |
23 | 201 | ```R |
24 | | -install.packages(c("remotes", "BiocManager")) |
25 | | -path <- 'Seurat-Integrate/' # or SeuratIntegrate_X.X.X.tar.gz (X.X.X being the version) |
26 | | -remotes::install_local(path = path, dependencies = NA, repos = BiocManager::repositories()) |
| 202 | +# scale |
| 203 | +seu <- ScaleScores(seu) |
| 204 | + |
| 205 | +# plot |
| 206 | +PlotScores(seu) |
27 | 207 | ``` |
28 | | -Done ! Open a R session and try it out ! |
| 208 | + |
| 209 | +## Getting help and advice |
| 210 | + |
| 211 | +Examples, documentation, memos, etc. are available on SeuratIntegrate's [website](https://cbib.github.io/Seurat-Integrate/). |
| 212 | + |
| 213 | +If you encounter a bug, or have a specific comment or question not covered on the website, please create an [issue on GitHub](https://github.com/cbib/Seurat-Integrate/issues). |
| 214 | + |
| 215 | +## Citing |
| 216 | + |
| 217 | +If you find SeuratIntegrate useful, please consider citing: |
| 218 | + |
| 219 | +> Specque, F., Barré, A., Nikolski, M., & Chalopin, D. (2025). SeuratIntegrate: an R package to facilitate the use of integration methods with Seurat. *Bioinformatics*. doi: [10.1093/bioinformatics/btaf358](https://doi.org/10.1093/bioinformatics/btaf358) |