| 1 | +--- |
| 2 | +title: "Single-cell deconvolution" |
| 3 | +output: rmarkdown::html_vignette |
| 4 | +vignette: > |
| 5 | + %\VignetteIndexEntry{Single-cell deconvolution} |
| 6 | + %\VignetteEngine{knitr::rmarkdown} |
| 7 | + %\VignetteEncoding{UTF-8} |
| 8 | +bibliography: references.bib |
| 9 | +link-citations: yes |
| 10 | +colorlinks: yes |
| 11 | +biblio-style: apalike |
| 12 | +--- |
| 13 | + |
| 14 | +```{r, include = FALSE} |
| 15 | +knitr::opts_chunk$set( |
| 16 | + collapse = TRUE, |
| 17 | + comment = "#>" |
| 18 | +) |
| 19 | +``` |
| 20 | + |
| 21 | +```{r setup} |
| 22 | +library(multideconv) |
| 23 | +``` |
| 24 | + |
| 25 | +## **Single-cell metacell construction** |
| 26 | + |
| 27 | +If single-cell data is available, we recommend generating metacells to reduce computation time and prevent session crashes. Deconvolution methods that rely on single-cell data can be computationally intensive, especially with large matrices. We suggest using at most 20,000 cells; if your object exceeds this size, we strongly advise creating metacells. If your computational resources can handle the full single-cell dataset, you may skip this step. |
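The size check can be done before any heavy computation. The sketch below assumes the single-cell data is a matrix with genes as rows and cells as columns; the helper `needs_metacells()` and its `max_cells` argument are hypothetical names used for illustration and are not part of multideconv.

```r
# Hypothetical helper (not part of multideconv): flag objects that exceed
# the suggested 20,000-cell limit. Assumes cells are stored as columns.
needs_metacells <- function(sc_object, max_cells = 20000) {
  ncol(sc_object) > max_cells
}

# Toy genes-by-cells matrix: 3 genes, 5 cells
toy <- matrix(rpois(15, lambda = 5), nrow = 3,
              dimnames = list(paste0("gene", 1:3), paste0("cell", 1:5)))

below_limit <- needs_metacells(toy)                 # 5 cells vs. default limit
above_limit <- needs_metacells(toy, max_cells = 4)  # 5 cells vs. a limit of 4
```

The 20,000-cell threshold is only the suggestion given above; adjust `max_cells` to the memory available in your session.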
| 28 | + |
| 29 | +We adapted functions from the R package hdWGCNA [@morabito2023hdwgcna; @langfelder2008wgcna] to construct metacells with the k-nearest neighbors (KNN) algorithm. The `create_metacells()` function takes the following arguments: |
| 30 | + |
| 31 | +- **sc_object**: Normalized gene expression matrix with genes as rows and cells as columns |
| 32 | + |
| 33 | +- **labels_column**: Vector of cell annotations |
| 34 | + |
| 35 | +- **samples_column**: Vector of sample IDs for each cell |
| 36 | + |
| 37 | +- **exclude_cells**: Vector specifying which cell types to ignore during metacell construction (default is NULL) |
| 38 | + |
| 39 | +- **min_cells**: Minimum number of cells required to construct metacells in a group |
| 40 | + |
| 41 | +- **k**: Number of nearest neighbors used for the KNN algorithm |
| 42 | + |
| 43 | +- **max_shared**: Maximum number of cells shared between two metacells |
| 44 | + |
| 45 | +- **n_workers**: Number of cores to use for parallelizing metacell construction |
| 46 | + |
| 47 | +- **min_meta**: Minimum number of metacells required for a cell type to be retained |
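To make the roles of `k` and `max_shared` more concrete, here is a simplified base R illustration of KNN-based aggregation. It is not the code used by multideconv or hdWGCNA: the toy matrix, the Euclidean distance on raw expression values, the choice of seed cells, and averaging as the aggregation step are all assumptions made for this sketch.

```r
# Simplified illustration only; NOT the multideconv/hdWGCNA implementation.
set.seed(1)
expr <- matrix(rnorm(6 * 20), nrow = 6,
               dimnames = list(paste0("gene", 1:6), paste0("cell", 1:20)))
k <- 5

# Euclidean distances between cells (dist() compares rows, so transpose)
d <- as.matrix(dist(t(expr)))

# For a seed cell, take the k closest cells. The seed is included because
# its distance to itself is 0.
seed <- "cell1"
neighbours <- names(sort(d[seed, ]))[seq_len(k)]

# Aggregate the neighbourhood into one metacell profile (here: the mean)
metacell <- rowMeans(expr[, neighbours, drop = FALSE])

# Number of cells shared with a second metacell; a max_shared-style limit
# would bound this overlap
neighbours2 <- names(sort(d["cell2", ]))[seq_len(k)]
shared <- length(intersect(neighbours, neighbours2))
```

A larger `k` gives smoother but fewer distinct metacells, while a lower overlap limit keeps metacells more independent of one another.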
| 48 | + |
| 49 | +Because of space limitations, this tutorial does not include a complete single-cell object. To run the call below, supply your own single-cell data through the `sc_object` parameter. |
| 50 | +```{r, eval=FALSE} |
| 51 | +metacells = create_metacells(sc_object, |
| 52 | + labels_column = cell_labels, |
| 53 | + samples_column = sample_labels, |
| 54 | + exclude_cells = NULL, |
| 55 | + min_cells = 50, |
| 56 | + k = 15, |
| 57 | + max_shared = 15, |
| 58 | + n_workers = 4, |
| 59 | + min_meta = 10) |
| 60 | +``` |
| 61 | + |
| 62 | +## **Second-generation deconvolution methods** |
| 63 | + |
| 64 | +Once the single-cell data is prepared, users can supplement the default deconvolution methods with second-generation approaches such as `AutogeneS`, `BayesPrism`, `Bisque`, `CPM`, `MuSic`, and `SCDC`. These methods learn cell-type signatures directly from annotated single-cell RNA-seq data and use them to deconvolve bulk RNA-seq profiles, rather than relying on predefined static signatures [@Dietrich2024.06.10.598226]. |
| 65 | + |
| 66 | +- **sc_deconv**: Boolean indicating whether to run second-generation methods |
| 67 | + |
| 68 | +- **sc_matrix**: Normalized single-cell gene expression matrix |
| 69 | + |
| 70 | +- **sc_metadata**: Dataframe containing single-cell metadata |
| 71 | + |
| 72 | +- **cell_annotations**: Vector of cell type labels |
| 73 | + |
| 74 | +- **cell_samples**: Vector of sample IDs |
| 75 | + |
| 76 | +- **name_sc_signature**: Name to assign to the resulting signature |
| 77 | + |
| 78 | +```{r} |
| 79 | +metacell_obj = multideconv::metacells_data |
| 80 | +metacell_metadata = multideconv::metacells_metadata |
| 81 | +head(metacell_obj[1:5,1:5]) |
| 82 | +head(metacell_metadata) |
| 83 | +``` |
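Before running the single-cell methods, it can help to confirm that the expression matrix and the metadata describe the same cells. The helper below is a hypothetical sketch and is not part of multideconv. It assumes the metadata has one row per cell, that its row names match the column names of the matrix, and that the label columns are named as in the calls in this vignette (`"annotated_ct"` and `"sample"`); the toy data is made up for illustration.

```r
# Hypothetical helper (not part of multideconv). Assumes one metadata row
# per cell, with row names equal to the column names of the matrix.
check_sc_inputs <- function(sc_matrix, sc_metadata, cell_label, sample_label) {
  stopifnot(
    all(c(cell_label, sample_label) %in% colnames(sc_metadata)),
    ncol(sc_matrix) == nrow(sc_metadata),
    identical(colnames(sc_matrix), rownames(sc_metadata)),
    !anyNA(sc_metadata[[cell_label]]),
    !anyNA(sc_metadata[[sample_label]])
  )
  # Number of cells per cell type and sample
  table(sc_metadata[[cell_label]], sc_metadata[[sample_label]])
}

# Toy inputs: 3 genes, 4 metacells
toy_matrix <- matrix(1:12, nrow = 3,
                     dimnames = list(paste0("gene", 1:3), paste0("mc", 1:4)))
toy_metadata <- data.frame(annotated_ct = c("B", "B", "T", "T"),
                           sample = c("s1", "s2", "s1", "s2"),
                           row.names = colnames(toy_matrix))

counts <- check_sc_inputs(toy_matrix, toy_metadata, "annotated_ct", "sample")
```

The returned table also shows whether any cell type is missing from a sample, which is worth knowing before interpreting the deconvolution results.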
| 84 | + |
| 85 | +The `compute.deconvolution()` function computes cell type deconvolution using the default methods (`quanTIseq`, `DeconRNASeq`, `EpiDISH`, `DWLS`, `MOMF`, plus `CIBERSORTx` if credentials are provided), along with the second-generation deconvolution approaches. The output includes all combinations of methods and signatures. |
| 86 | + |
| 87 | +```{r, eval=FALSE} |
| 88 | +deconv = compute.deconvolution(raw.counts = bulk, |
| 89 | + normalized = TRUE, |
| 90 | + return = TRUE, |
| 91 | + methods = c("Quantiseq", "Epidish", "DeconRNASeq"), |
| 92 | + file_name = "Tutorial", |
| 93 | + sc_deconv = TRUE, |
| 94 | + sc_matrix = metacell_obj, |
| 95 | + sc_metadata = metacell_metadata, |
| 96 | + methods_sc = c("Autogenes", "BayesPrism", |
| 97 | + "Bisque", "CPM", "MuSic", "SCDC"), |
| 98 | + cell_label = "annotated_ct", |
| 99 | + sample_label = "sample", |
| 100 | + name_sc_signature = "Test") |
| 101 | +``` |
| 102 | + |
| 103 | +To run only the second-generation deconvolution methods based on single-cell data, without using any static cell-type signatures, use the following: |
| 104 | + |
| 105 | +```{r, eval=FALSE} |
| 106 | +deconv_sc = compute_sc_deconvolution_methods(raw_counts = bulk, |
| 107 | + normalized = TRUE, |
| 108 | + methods_sc = c("Autogenes", "BayesPrism", |
| 109 | + "Bisque", "CPM", "MuSic", "SCDC"), |
| 110 | + sc_object = metacell_obj, |
| 111 | + sc_metadata = metacell_metadata, |
| 112 | + cell_annotations = "annotated_ct", |
| 113 | + samples_ids = "sample", |
| 114 | + name_object = "Test", |
| 115 | + n_cores = 2, |
| 116 | + return = TRUE, |
| 117 | + file_name = "Tutorial") |
| 118 | +``` |