VeraPancaldiLab
diff --git a/_pkgdown.yml b/_pkgdown.yml
Lines changed: 28 additions & 0 deletions
diff --git a/vignettes/a1_deconvolution.Rmd b/vignettes/a1_deconvolution.Rmd
Lines changed: 78 additions & 0 deletions
diff --git a/vignettes/a2_single_cell.Rmd b/vignettes/a2_single_cell.Rmd
Lines changed: 118 additions & 0 deletions
diff --git a/vignettes/a3_benchmark.Rmd b/vignettes/a3_benchmark.Rmd
Lines changed: 77 additions & 0 deletions
@@ -2,6 +2,34 @@ url: https://verapancaldilab.github.io/multideconv/
 template:
   bootstrap: 5
 
+navbar:
+  structure:
+    left:  [intro, articles, reference, news]
+    right: [search, github]
+  components:
+    articles:
+      text: Articles
+      menu:
+      - text: "Deconvolution with default methods"
+        href: articles/a1_deconvolution.html
+      - text: "Single-cell deconvolution"
+        href: articles/a2_single_cell.html
+      - text: "Pseudo-bulk profiles and benchmarking"
+        href: articles/a3_benchmark.html
+      - text: "Cell type subgroup analysis"
+        href: articles/a4_subgroups.html
+      - text: "Machine learning workflows"
+        href: articles/a5_machine_learning.html
+
+articles:
+- title: "Tutorial"
+  contents:
+  - a1_deconvolution
+  - a2_single_cell
+  - a3_benchmark
+  - a4_subgroups
+  - a5_machine_learning
+
 reference:
 - title: Main
   desc: >
 
@@ -0,0 +1,78 @@
+---
+title: "Deconvolution with default methods"
+output: rmarkdown::html_vignette
+vignette: >
+  %\VignetteIndexEntry{Deconvolution with default methods}
+  %\VignetteEngine{knitr::rmarkdown}
+  %\VignetteEncoding{UTF-8}
+---
+
+```{r, include = FALSE}
+knitr::opts_chunk$set(
+  collapse = TRUE,
+  comment = "#>"
+)
+```
+
+```{r setup}
+library(multideconv)
+```
+
+## **Deconvolution with default methods**
+
+The core function, `compute.deconvolution()`, performs cell type deconvolution using six default methods (`quanTIseq`, `DeconRNASeq`, `CIBERSORTx`, `EpiDISH`, `DWLS`, `MOMF`) and nine default signatures (see [Hurtado et al., 2025](https://www.biorxiv.org/content/10.1101/2025.04.29.651220v2.article-info)). It accepts either raw counts or TPM-normalized counts as input, with genes identified by gene symbol.
+
+**NOTE:** If you plan to use `CIBERSORTx`, you must provide your credentials (see README for details). The resulting deconvolution matrix is automatically saved in the `Results/` directory.
+
+The output includes every deconvolution feature, one for each combination of method, signature, and cell type.
+
+```{r, eval = FALSE}
+bulk = multideconv::raw_counts
+deconv = compute.deconvolution(raw.counts = bulk, 
+                               methods = c("Quantiseq", "Epidish", 
+                                           "DeconRNASeq", "DWLS","MOMF"), 
+                               normalized = TRUE, 
+                               return = TRUE, 
+                               file_name = "Tutorial")
+```
+
+To run only specific methods or to exclude specific signatures, use the `methods` and `signatures_exclude` arguments:
+
+```{r, eval = FALSE}
+deconv = compute.deconvolution(raw.counts = bulk, 
+                               methods = c("Quantiseq", "DeconRNASeq"), 
+                               normalized = TRUE,
+                               signatures_exclude = "BPRNACan", 
+                               return = TRUE, 
+                               file_name = "Tutorial")
+```
+
+To speed up computation, `multideconv` supports parallelization. Set `doParallel = TRUE` and specify the number of workers based on your system's resources:
+
+```{r, eval = FALSE}
+deconv = compute.deconvolution(raw.counts = bulk, 
+                               methods = "DWLS", 
+                               normalized = TRUE, 
+                               return = TRUE, 
+                               file_name = "Tutorial", 
+                               doParallel = TRUE, 
+                               workers = 3)
+```
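+
+When choosing `workers`, one possible approach (a sketch, not a package recommendation) is to derive the value from the number of available cores with `parallel::detectCores()`, leaving one core free. The variable names `n_cores` and `n_workers` are only illustrative:
+
+```{r}
+# Number of logical cores reported by the system (may be NA on some platforms)
+n_cores = parallel::detectCores()
+# Leave one core free; fall back to a single worker if detection fails
+n_workers = if (is.na(n_cores)) 1 else max(1, n_cores - 1)
+n_workers
+```
+
+The resulting value can then be passed as `workers = n_workers`.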
+
+## **Cell type signatures**
+
+`multideconv` ships with a set of default signatures, which you can access as follows.
+
+To list all signatures:
+
+```{r}
+path <- system.file("signatures", package = "multideconv")
+list.files(path)
+```
+
+To access a specific signature:
+
+```{r}
+signature = read.delim(file.path(path, "CBSX-Melanoma-scRNAseq.txt"))
+head(signature)
+```
@@ -0,0 +1,118 @@
+---
+title: "Single-cell deconvolution"
+output: rmarkdown::html_vignette
+vignette: >
+  %\VignetteIndexEntry{Single-cell deconvolution}
+  %\VignetteEngine{knitr::rmarkdown}
+  %\VignetteEncoding{UTF-8}
+bibliography: references.bib
+link-citations: yes
+colorlinks: yes
+biblio-style: apalike
+---
+
+```{r, include = FALSE}
+knitr::opts_chunk$set(
+  collapse = TRUE,
+  comment = "#>"
+)
+```
+
+```{r setup}
+library(multideconv)
+```
+
+## **Single-cell metacell construction**
+
+If single-cell data is available, we recommend generating metacells to reduce computation time and prevent session crashes. Deconvolution methods that rely on single-cell data can be computationally intensive, especially with large matrices. We suggest using a maximum of 20k cells; if your object exceeds this size, creating metacells is strongly advised. However, if your computational resources are sufficient to handle the full single-cell dataset, you may skip this step.
+
+We adapted functions from the R package hdWGCNA (@morabito2023hdwgcna; @langfelder2008wgcna) for the construction of metacells using the KNN algorithm.
+
+-   **sc_object**: Normalized gene expression matrix with genes as rows and cells as columns
+
+-   **labels_column**: Vector of cell annotations
+
+-   **samples_column**: Vector of sample IDs for each cell
+
+-   **exclude_cells**: Vector specifying which cell types to ignore during metacell construction (default is NULL)
+
+-   **min_cells**: Minimum number of cells required to construct metacells in a group
+
+-   **k**: Number of nearest neighbors used for the KNN algorithm
+
+-   **max_shared**: Maximum number of cells shared between two metacells
+
+-   **n_workers**: Number of cores to use for parallelizing metacell construction
+
+-   **min_meta**: Minimum number of metacells required for a cell type to be retained
+
+Because of space limitations, this tutorial does not include a complete single-cell object. Provide your own single-cell data through the `sc_object` parameter of the function call:
+```{r, eval=FALSE}
+metacells = create_metacells(sc_object, 
+                             labels_column = cell_labels, 
+                             samples_column = sample_labels, 
+                             exclude_cells = NULL,
+                             min_cells = 50, 
+                             k = 15, 
+                             max_shared = 15, 
+                             n_workers = 4, 
+                             min_meta = 10)
+```
+
+## **Second-generation deconvolution methods**
+
+Once the single-cell data is prepared, users can supplement the default deconvolution methods with second-generation approaches such as `AutogeneS`, `BayesPrism`, `Bisque`, `CPM`, `MuSic`, and `SCDC`. These methods learn cell-type signatures directly from annotated single-cell RNA-seq data, rather than relying on predefined static signatures (@Dietrich2024.06.10.598226), to deconvolve bulk RNA-seq profiles.
+
+-   **sc_deconv**: Boolean indicating whether to run second-generation methods
+
+-   **sc_matrix**: Normalized single-cell gene expression matrix
+
+-   **sc_metadata**: Dataframe containing single-cell metadata
+
+-   **cell_annotations**: Vector of cell type labels
+
+-   **cell_samples**: Vector of sample IDs
+
+-   **name_sc_signature**: Name to assign to the resulting signature
+
+```{r}
+metacell_obj = multideconv::metacells_data
+metacell_metadata = multideconv::metacells_metadata
+head(metacell_obj[1:5,1:5])
+head(metacell_metadata)
+```
+
+This function computes cell type deconvolution using the default methods (`quanTIseq`, `DeconRNASeq`, `EpiDISH`, `DWLS`, `MOMF`, plus `CIBERSORTx` if credentials are provided), along with the second-generation deconvolution approaches. The output includes all combinations of methods and signatures.
+
+```{r, eval=FALSE}
+deconv = compute.deconvolution(raw.counts = bulk, 
+                               normalized = TRUE, 
+                               return = TRUE, 
+                               methods = c("Quantiseq", "Epidish", "DeconRNASeq"),
+                               file_name = "Tutorial", 
+                               sc_deconv = TRUE, 
+                               sc_matrix = metacell_obj, 
+                               sc_metadata = metacell_metadata, 
+                               methods_sc = c("Autogenes", "BayesPrism", 
+                                              "Bisque", "CPM", "MuSic", "SCDC"),
+                               cell_label = "annotated_ct", 
+                               sample_label = "sample", 
+                               name_sc_signature = "Test")
+```
+
+To run only the second-generation deconvolution methods based on single-cell data, without using any static cell-type signatures, use the following:
+
+```{r, eval=FALSE}
+deconv_sc = compute_sc_deconvolution_methods(raw_counts = bulk, 
+                                             normalized = TRUE, 
+                                             methods_sc = c("Autogenes", "BayesPrism", 
+                                                            "Bisque", "CPM", "MuSic", "SCDC"),
+                                             sc_object = metacell_obj, 
+                                             sc_metadata = metacell_metadata, 
+                                             cell_annotations = "annotated_ct", 
+                                             samples_ids = "sample", 
+                                             name_object = "Test", 
+                                             n_cores = 2, 
+                                             return = TRUE, 
+                                             file_name = "Tutorial")
+```
@@ -0,0 +1,77 @@
+---
+title: "Pseudo-bulk profiles and benchmarking"
+output: rmarkdown::html_vignette
+vignette: >
+  %\VignetteIndexEntry{Pseudo-bulk profiles and benchmarking}
+  %\VignetteEngine{knitr::rmarkdown}
+  %\VignetteEncoding{UTF-8}
+---
+
+```{r, include = FALSE}
+knitr::opts_chunk$set(
+  collapse = TRUE,
+  comment = "#>"
+)
+```
+
+```{r setup}
+library(multideconv)
+metacell_obj = multideconv::metacells_data
+metacell_metadata = multideconv::metacells_metadata
+```
+
+## **Pseudo-bulk profiles**
+
+To create pseudo-bulk profiles from the original single-cell objects, simulating a bulk RNA-seq dataset, use `create_sc_pseudobulk()`.
+
+**NOTE:** You can input either your original single-cell object or the metacell object. Just be sure to select the same object when examining the real cell proportions (if needed).
+
+```{r}
+metacells_seurat = Seurat::CreateSeuratObject(metacell_obj, meta.data = metacell_metadata)
+pseudobulk = create_sc_pseudobulk(metacells_seurat, cells_labels = "annotated_ct", sample_labels = "sample", normalized = TRUE, file_name = "Tutorial")
+```
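+
+Conceptually, a pseudo-bulk profile aggregates the expression of all cells from the same sample. The toy example below illustrates the idea in base R by summing counts per sample and deriving the true cell proportions from the labels. It is only a sketch under that assumption and does not reproduce the internals of `create_sc_pseudobulk()`. All object names (`toy_counts`, `toy_samples`, `toy_labels`, `toy_pseudobulk`, `toy_truth`) are made up for illustration:
+
+```{r}
+# Toy matrix: 3 genes (rows) x 6 cells (columns)
+toy_counts = matrix(1:18, nrow = 3,
+                    dimnames = list(paste0("gene", 1:3), paste0("cell", 1:6)))
+# Sample of origin and cell type label for each cell
+toy_samples = c("S1", "S1", "S1", "S2", "S2", "S2")
+toy_labels = c("B", "B", "T", "T", "T", "T")
+# Sum counts over cells from the same sample (result: genes x samples)
+toy_pseudobulk = t(rowsum(t(toy_counts), group = toy_samples))
+toy_pseudobulk
+# True cell type proportions per sample (each row sums to 1)
+toy_truth = prop.table(table(toy_samples, toy_labels), margin = 1)
+toy_truth
+```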
+
+## **Creating cell type signatures**
+
+To create cell type signatures, `multideconv` uses four methods: `CIBERSORTx`, `DWLS`, `MOMF`, and `BSeq-SC`. You must provide single-cell data as input. Signatures are saved in the `Results/custom_signatures` directory and returned as a list. From then on, `compute.deconvolution()` uses these signatures in addition to the default ones, so to obtain deconvolution results based on your new signatures, run `compute.deconvolution()` again after creating them.
+
+To run `BSeq-SC`, supply the `cell_markers` argument, which should contain the differential markers for each cell type (these can be obtained using `FindMarkers()` or `FindAllMarkers()` from Seurat).
+
+```{r, eval=FALSE}
+bulk_pseudo = multideconv::pseudobulk
+signatures = create_sc_signatures(metacell_obj, 
+                                  metacell_metadata, 
+                                  cells_labels = "annotated_ct", 
+                                  sample_labels = "sample", 
+                                  bulk_rna = bulk_pseudo, 
+                                  cell_markers = NULL, 
+                                  name_signature = "Test",
+                                  methods_sig = c("DWLS", "CIBERSORTx", "MOMF", "BSeqsc"))
+```
+
+## **Cell type signature benchmark**
+
+To validate the generated signatures, we provide a benchmarking function that compares deconvolution outputs against known cell proportions (e.g., from single-cell or imaging data). The `cells_extra` argument should include any non-standard cell types present in your ground truth. Make sure cell names match those in the deconvolution matrix (e.g., use `B.cells` instead of `B cells` if that is the naming convention used; see the README for more information).
+
+```{r}
+deconv_pseudo = multideconv::deconvolution
+cells_groundtruth = multideconv::cells_groundtruth
+benchmark = compute.benchmark(deconv_pseudo, 
+                              cells_groundtruth, 
+                              cells_extra = c("Mural.cells", "Myeloid.cells"), 
+                              corr_type = "spearman",
+                              scatter = FALSE, 
+                              plot = TRUE, 
+                              pval = 0.05, 
+                              file_name = "Tutorial", 
+                              width = 10, 
+                              height = 15)
+```
+
+```{r hdwgcna-figure, echo=FALSE, fig.align='center', out.width='70%'}
+knitr::include_graphics("Results/Benchmark.png")
+```
+
+::: {style="text-align: center;"}
+<em>Figure 1. Example performance of different method and signature combinations on the pseudo-bulk data.</em>
+:::
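+
+As a rough illustration of what such a benchmark measures, the toy example below correlates estimated and known proportions of one cell type across samples using base R's `cor()` and `cor.test()`. The numbers are invented and the object names (`truth`, `estimated`, `rho`, `p_value`) are hypothetical; this is not the implementation of `compute.benchmark()`:
+
+```{r}
+# Invented proportions of one cell type across five samples
+truth     = c(0.10, 0.25, 0.40, 0.05, 0.20)
+estimated = c(0.12, 0.20, 0.45, 0.08, 0.30)
+# Rank-based agreement, the same correlation type as corr_type = "spearman"
+rho = cor(estimated, truth, method = "spearman")
+rho
+# Associated p-value, to compare against a threshold such as pval = 0.05
+p_value = cor.test(estimated, truth, method = "spearman")$p.value
+p_value
+```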