content/03.results.md (+7 -13; 7 additions & 13 deletions)
@@ -207,18 +207,12 @@ Across all projects, we observed a positive relationship between bulk and pseudo
 
 We next performed an overrepresentation analysis to probe for differences in gene expression that might suggest differences in cell type composition and/or abundance between modalities.
 To this end, we calculated the per-gene median of each project's model residuals and identified outliers, where "positive outliers" are genes with higher bulk RNA-seq expression than expected from pseudobulk expression, and conversely "negative outliers" are genes with lower bulk RNA-seq expression than expected from pseudobulk expression.
-Using marker gene sets associated with consensus cell types, we calculated the odds ratio in each direction as the odds a cell type marker gene is present in the given outlier direction compared to other genes.
+Using cell type marker gene sets from each project's respective `CellAssign` reference, we calculated the odds ratio in each direction as the odds a cell type marker gene is present in the given outlier direction compared to other genes.
 Following permutation testing and P-value correction to control the FDR at 5\%, we found several cell type marker gene sets with higher, but never lower, bulk RNA-seq expression than expected (Figure {@fig:fig6}B, Figure {@fig:figS7}B).
 
-
-In brain and CNS tumors, the marker gene sets overrepresented in bulk RNA-seq expression primarily corresponded to stromal (e.g., endothelial and extracellular matrix secreting cells) and/or neuronal cell types (e.g., glial cells and astrocytes), all of which are known to be prevalent non-immune cells in glioma tumor microenvironments [@doi:10.3389/fimmu.2023.1227126; @doi:10.3389/fphar.2024.1355242] (Figure {@fig:fig6}B).
-<!-- TODO: Clarify this sentence; what is the exception to? What is the result in the gliomas? (Also is this one exception or exceptions plural?) -->
-In addition, monocyte marker genes were overrepresented in bulk RNA-seq expression for `SCPCP000009` (brain and CNS tumors), which was sequenced at the single-nuclei level, but not in projects `SCPCP000001` (high-grade gliomas) and `SCPCP000002` (low-grade gliomas), which were sequenced at the single-cell level.
-This difference may reflect the increased sensitivity of single-cell approaches to detecting immune cells relative to single-nuclei approaches [@doi:10.4132/jptm.2022.12.19].
-
-Given that our consensus cell type analysis identified various immune cells from high- and low-grade gliomas (Figure {@fig:fig4}C-D), these results suggest that non-immune cells may have been lost during single-cell library preparation.
-Indeed, several of these overrepresented bulk cell types for `SCPCP000001` and `SCPCP000002` are not found in the single-cell consensus cell types annotations (`SCPCP000001`: "blood vessel endothelial cell", "extracellular matrix secreting cell", "pericyte"; `SCPCP000002`: "blood vessel endothelial cell", "extracellular matrix secreting cell", "microvascular endothelial cell"), further emphasizing the potential loss of these cell types in the single-cell data.
-
-By contrast, we uncovered a variety of both immune and non-immune cell types overrepresented in bulk RNA-seq `SCPCP000017` (osteosarcoma; Figure {@fig:figS7}B), all of which were present in the single-nuclei consensus cell types for this project.
-This observation may reflect inherent challenges in dissociating bone tissue [@doi:10.1186/s12885-023-10977-1].
-These results show that, while bulk and single-cell or single-nuclei expression is indeed highly correlated, cell type differences may still be present between modalities, potentially driven by cell-type-specific loss in single-cell experiments.
+In brain and CNS tumors, the marker gene sets overrepresented in bulk RNA-seq expression primarily corresponded to stromal (e.g., Endothelial cells and Pericytes) and/or neuronal cell types (e.g., Astrocytes and various types of glial cells), all of which are prevalent non-immune cells in glioma tumor microenvironments [@doi:10.3389/fimmu.2023.1227126; @doi:10.3389/fphar.2024.1355242] (Figure {@fig:fig6}B).
+Interestingly, only Monocytes and neuronal cell types, but no stromal cells, were overrepresented in bulk RNA-seq for `SCPCP000009` (brain and CNS tumors).
+As `SCPCP000009` was sequenced at the single-nuclei level but `SCPCP000001` (high-grade gliomas) and `SCPCP000002` (low-grade gliomas) were sequenced at the single-cell level, this difference may reflect the increased sensitivity of single-cell approaches to detecting immune cells relative to single-nuclei approaches [@doi:10.4132/jptm.2022.12.19].
+Indeed, the other single-nuclei projects considered here also identified immune cell types as overrepresented in bulk RNA-seq: Monocytes were identified for `SCPCP000006` (Wilms Tumor), and a combination of immune and non-immune cell types were identified for `SCPCP000017` (osteosarcoma; Figure {@fig:figS7}B).
+The diversity of cell types overrepresented in osteosarcoma bulk RNA-seq samples may also reflect inherent challenges in dissociating bone tissue [@doi:10.1186/s12885-023-10977-1].
+In total, we observed that, while bulk and single-cell or single-nuclei expression is indeed highly correlated, cell type differences may still be present between modalities, potentially influenced by cell-type-specific loss in single-cell experiments.
content/04.methods.md (+4 -5; 4 additions & 5 deletions)
@@ -257,15 +257,14 @@ For each project, we then used the `lme4` R package [@doi:10.18637/jss.v067.i01]
 
 #### Overrepresentation analysis
 
-To ascertain whether certain cell types might be overrepresented in one modality compared to the other, we first identified cell types of interest as the set of all possible consensus cell types for each project.
-We then created a gene set for each consensus cell type using the project's `CellAssign` marker gene reference.
-Because a consensus cell type can encompass multiple cell types in the marker gene reference, we defined each consensus cell type's gene set as the union of all marker genes for each of its constituent reference cell types.
+We next conducted overrepresentation analysis (ORA) to ascertain whether certain cell types might be overrepresented in either modality (bulk vs. pseudobulk).
+We specifically tested overrepresentation of the `PanglaoDB` cell type marker gene sets used for each project's respective `CellAssign` reference.
 
-For input to the overrepresentation analysis, we summarized model residuals within each project by taking the median residual for each gene across samples and then transformed these summarized residuals into Z-scores.
+For input to the ORA, we summarized model residuals within each project by taking the median residual for each gene across samples and then transformed these summarized residuals into Z-scores.
 We identified outlier genes as those with Z-scores greater than 2.5 (positive outliers) or less than -2.5 (negative outliers).
 In this case, positive outliers represent genes with comparatively higher expression in the bulk modality, and negative outliers represent genes with comparatively higher expression in the single-cell modality.
 
-For each consensus cell type gene set, we calculated two odds ratios representing whether genes were overrepresented in the positive outliers (enriched in bulk) or negative outliers (enriched in pseudobulk).
+For each set of cell type marker genes, we calculated two odds ratios representing whether genes were overrepresented in the positive outliers (enriched in bulk) or negative outliers (enriched in pseudobulk).
 We calculated P-values for both the bulk and pseudobulk enrichment directions via permutation testing with 10,000 replicates.
 We defined gene sets with significant overrepresentation as those with a false-discovery-rate-corrected P-value ≤ 0.05 [@doi:10.1111/j.2517-6161.1995.tb02031.x].
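
For readers following the hunk above, the ORA steps it describes (median residual per gene, Z-scores, outlier calls at ±2.5, a per-direction odds ratio, a permutation P-value, and FDR correction) can be sketched roughly as follows. This is an illustrative Python sketch, not the manuscript's implementation. All function names are hypothetical, and several details are assumptions that the text does not specify: the use of the population standard deviation for Z-scores, the 0.5 pseudocount in the odds ratio, permuting by drawing random outlier sets of the same size, and the Benjamini-Hochberg step-up form of the FDR correction.

```python
import random
from statistics import mean, median, pstdev


def call_outliers(residuals_by_gene, threshold=2.5):
    """Median residual per gene -> Z-score -> (positive, negative) outlier sets."""
    medians = {gene: median(vals) for gene, vals in residuals_by_gene.items()}
    values = list(medians.values())
    center = mean(values)
    spread = pstdev(values)  # assumption: population SD; the text does not say which
    z = {gene: (m - center) / spread for gene, m in medians.items()}
    positive = {g for g, score in z.items() if score > threshold}   # higher in bulk
    negative = {g for g, score in z.items() if score < -threshold}  # higher in pseudobulk
    return positive, negative


def odds_ratio(markers, outliers, universe, pseudocount=0.5):
    """Odds that a marker gene is an outlier, relative to all other genes."""
    markers, outliers = markers & universe, outliers & universe
    a = len(markers & outliers)    # marker and outlier
    b = len(markers) - a           # marker, not outlier
    c = len(outliers) - a          # outlier, not marker
    d = len(universe) - a - b - c  # neither
    # assumption: a pseudocount keeps the ratio finite when a cell is zero
    return ((a + pseudocount) * (d + pseudocount)) / ((b + pseudocount) * (c + pseudocount))


def permutation_p_value(markers, outliers, universe, n_perm=10_000, seed=0):
    """One-sided P-value: share of random outlier sets with an odds ratio >= observed."""
    rng = random.Random(seed)
    genes = sorted(universe)
    observed = odds_ratio(markers, outliers, universe)
    n_out = len(outliers & universe)
    hits = sum(
        odds_ratio(markers, set(rng.sample(genes, n_out)), universe) >= observed
        for _ in range(n_perm)
    )
    return (hits + 1) / (n_perm + 1)  # add-one correction avoids reporting P = 0


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In this sketch, each marker gene set would be passed to `permutation_p_value` twice per project, once with the positive outliers and once with the negative outliers, and the resulting P-values would be passed to `benjamini_hochberg` before applying the 0.05 cutoff.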
content/100.figure-table-legends.md (+2 -2; 2 additions & 2 deletions)
@@ -155,7 +155,7 @@ Results from additional projects are shown in Figure {@fig:figS7}A.
 
 B. Odds ratios from overrepresentation analysis for the same samples shown in panel A, colored by FDR-corrected significance.
 Each odds ratio represents the odds that marker genes for the given cell type were overrepresented in bulk RNA-seq when compared to single-cell/nuclei RNA-seq, relative to other genes.
-A total of 36 consensus cell types were evaluated for each project shown here.
+A total of 68 cell types were evaluated for each project shown here.
 Results from additional projects are shown in Figure {@fig:figS7}B.
 
 ## Supplementary Figures and Tables {.page_break_before}
@@ -314,4 +314,4 @@ The regression line is also shown for each project.
 
 B. Odds ratios from overrepresentation analysis for the same samples shown in panel A, colored by FDR-corrected significance.
 Each odds ratio represents the odds that marker genes for the given cell type were overrepresented in the bulk modality, relative to other genes.
-31 consensus cell types were evaluated for project `SCPCP000006`, and 37 consensus cell types were evaluated for project `SCPCP000017`.
+44 cell types were evaluated for project `SCPCP000006`, and 50 cell types were evaluated for project `SCPCP000017`.