README.md: 20 additions & 5 deletions
@@ -7,25 +7,39 @@
 -[Installation](#installation)
 -[Tutorial](#tutorial)
 -[Citation](#citation)
--[Contact](#Contact)
+-[Contact](#contact)

 ## Introduction

-Single-cell RNA sequencing (scRNA-seq) maps gene expression heterogeneity within a tissue. However, identifying biological signals in this data is challenging due to confounding technical factors, noise, sparsity, and high dimensionality. Data factorization methods address this by separating and identifying signals, such as gene expression programs, in the data, but the resulting factors must be manually interpreted. We developed sciRED as a tool to improve the interpretation of scRNA-seq factor analysis. sciRED has four steps: 1) removing known confounding effects and using rotations to improve factor interpretability; 2) mapping factors to known covariates; 3) identifying unexplained factors that may capture hidden biological phenomena; and 4) determining the genes and biological processes represented by the resulting factors. We apply sciRED to multiple scRNA-seq data sets and identify sex-specific variation in a kidney map, discern strong and weak stimulation signals in a PBMC dataset, reduce ambient RNA contamination in a rat liver atlas to help identify strain variation, and reveal rare cell type signatures and anatomical zonation gene programs in a healthy human liver map. These demonstrate that sciRED is useful in characterizing diverse biological signals within scRNA-seq datasets.
+Single-cell RNA sequencing (scRNA-seq) maps gene expression heterogeneity within a tissue. However, identifying biological signals in this data is challenging due to confounding technical factors, noise, sparsity, and high dimensionality. Data factorization methods address this by separating and identifying signals, such as gene expression programs, but the resulting factors require manual interpretation.
+
+We developed sciRED to enhance the interpretation of scRNA-seq factor analysis. sciRED follows four steps:
+1. Removing known confounding effects and using rotations to improve factor interpretability.
+2. Mapping factors to known covariates.
+3. Identifying unexplained factors that may capture hidden biological phenomena.
+4. Determining the genes and biological processes represented by the resulting factors.
+
+We applied sciRED to multiple scRNA-seq datasets and demonstrated its utility in:
+- Identifying general and cell-type specific covariate-related variations, such as sex-specific variations in a kidney map, discerning strong and weak stimulation signals in a PBMC dataset, and general and cell-type specific strain variation within a rat liver atlas.
+- Employing a cluster-free approach to identify and guide the annotation of cell type identity programs.
+- Decomposing signal and noise, such as eliminating ambient RNA contamination in a rat liver atlas to unveil strain variations.
+- Evaluating unannotated factors to reveal hidden biology in a healthy human liver map, represented by anatomical zonation gene programs, T cell-specific cell cycle signatures, and two rare cell type signatures that were missed in the original study.


 ## Installation
 Please make sure to install the following packages **before installing sciRED**:
-Some of the prerequest packages require Numba package for parallel implementation. Please install order versions of numpy (such as 1.22.4) in case you case you came across the following error:
+**Common issues**\
+Some of the prerequisite packages require the Numba package for parallel implementation. Please install older versions of numpy (such as 1.22.4) in case you encounter the following error:
+
 ```bash
 Numba needs NumPy 1.24 or less
 ```
@@ -34,10 +48,11 @@ Numba needs NumPy 1.24 or less

 Follow [tutorial-1](https://github.com/delipouya/sciRED/blob/main/tutorial1_scMixology.ipynb) and [tutorial-2](https://github.com/delipouya/sciRED/blob/main/tutorial2_stimulatedPBMC.ipynb) to learn how to use sciRED. These tutorials introduce the standard processing pipeline and demonstrate the application of sciRED on the scMixology and stimulated PBMC datasets. Further details about the input datasets are available in the manuscript. The data processing scripts are available in the _data_prep_ folder.

+

 ## Citation

 If you find sciRED useful for your publication, please cite:
 [Pouyabahar et al. Interpretable single-cell factor decomposition using sciRED.](url)
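
The four steps listed in the revised Introduction can be illustrated with a small NumPy sketch. This is not sciRED's API or its actual implementation: the function names are hypothetical, linear least squares stands in for confounder removal, PCA followed by varimax stands in for the factorization and rotation, and absolute Pearson correlation stands in for the factor-to-covariate mapping. It only shows how the steps fit together on a cells-by-genes matrix.

```python
import numpy as np


def residualize(expr, confounders):
    """Step 1a (stand-in): remove the linear effect of known confounders."""
    design = np.column_stack([np.ones(expr.shape[0]), confounders])
    coef, *_ = np.linalg.lstsq(design, expr, rcond=None)
    return expr - design @ coef


def varimax(loadings, n_iter=100, tol=1e-6):
    """Step 1b: orthogonal varimax rotation of a genes x factors matrix."""
    n_rows, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    objective = 0.0
    for _ in range(n_iter):
        rotated = loadings @ rotation
        # Column-wise sum of squares is broadcast across each column.
        target = rotated**3 - rotated * (rotated**2).sum(axis=0) / n_rows
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        new_objective = s.sum()
        if objective != 0 and new_objective / objective < 1 + tol:
            break
        objective = new_objective
    return loadings @ rotation, rotation


def factorize(expr, confounders, n_factors):
    """Steps 1a + 1b: residualize, run PCA via SVD, then rotate."""
    resid = residualize(expr, confounders)
    resid = resid - resid.mean(axis=0)
    u, s, vt = np.linalg.svd(resid, full_matrices=False)
    scores = u[:, :n_factors] * s[:n_factors]  # cells x factors
    loadings = vt[:n_factors].T                # genes x factors
    rotated_loadings, rotation = varimax(loadings)
    # Rotating scores with the same matrix keeps scores @ loadings.T unchanged.
    return scores @ rotation, rotated_loadings


def factor_covariate_association(scores, covariate):
    """Step 2 (stand-in): |Pearson r| between each factor and a numeric covariate."""
    return np.array(
        [abs(np.corrcoef(scores[:, j], covariate)[0, 1]) for j in range(scores.shape[1])]
    )


def unexplained_factors(association, threshold=0.3):
    """Step 3: indices of factors weakly associated with the covariate.

    The threshold is an arbitrary illustrative value.
    """
    return np.flatnonzero(association < threshold)


def top_genes(loadings, factor, n=10):
    """Step 4: indices of the genes with the largest absolute loading."""
    return np.argsort(-np.abs(loadings[:, factor]))[:n]
```

With an expression matrix `expr` (cells by genes), a confounder vector such as a batch indicator, and a covariate of interest such as sex, one would call `factorize`, pass the scores to `factor_covariate_association`, and inspect `top_genes` for any factor of interest. For the real pipeline, follow the tutorials linked in the README.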
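
The "Common issues" note added in the Installation hunk can be checked before installing sciRED. The sketch below is a hypothetical helper, not part of sciRED: it reads the installed NumPy version and compares it with the limit quoted in the error message ("Numba needs NumPy 1.24 or less"). The limit is an assumption taken from that message and may differ for other Numba releases.

```python
from __future__ import annotations

from importlib import metadata

# Assumed upper bound, copied from the error message quoted in the README.
MAX_NUMPY = (1, 24)


def parse_major_minor(version: str) -> tuple[int, int]:
    """Return (major, minor) from a version string such as '1.22.4'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def numpy_is_compatible(version: str, max_version: tuple[int, int] = MAX_NUMPY) -> bool:
    """True if the NumPy version is at or below the assumed Numba limit."""
    return parse_major_minor(version) <= max_version


def installed_numpy_version() -> str | None:
    """Return the installed NumPy version, or None if NumPy is not installed."""
    try:
        return metadata.version("numpy")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_numpy_version()
    if found is None:
        print("NumPy is not installed.")
    elif numpy_is_compatible(found):
        print(f"NumPy {found} is within the assumed limit.")
    else:
        # The README suggests an older release such as 1.22.4.
        print(f"NumPy {found} is too new; consider: pip install numpy==1.22.4")
```

Comparing only the major and minor components keeps patch releases such as 1.24.3 within the limit, which matches the wording of the error message.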