Creating Pseudobulk Replicates for Differential Analysis #1992
Hello,

First off, to the ArchR creators/moderators, thank you for creating and maintaining the ArchR pipeline! I've been learning it and using it to analyze my data for the past month, and it has been very helpful and user-friendly.

My question is about ArchR's process for creating pseudo-bulk replicates for peak calling and differential peak analysis. My ArchR project contains 4 samples of neural progenitor cells from 4 individuals: 1 control and 3 affected persons (each with a different disease state). My overall goal in ArchR is to compare control vs. disease states in terms of marker genes, marker peaks, and TF enrichment, with hypothesis generation in mind. Each of the four samples is distinct and cannot be considered a replicate of any one condition.

My concern is that, when pseudo-bulk replicates are created, single cells from different samples will be grouped together. I've read the full ArchR manual in detail, particularly this page; it is my understanding that, for a given cluster, if there are not enough cells in one sample, cells will be combined from multiple samples in a sample-agnostic manner. Given the sample x cluster breakdown of my dataset (see table below), I believe that mixing samples within pseudo-bulk replicates is inevitable.
Will this affect my ability to do differential comparisons, e.g. disease vs. control, in downstream analyses? I've read through your suggestions in Discussions #696, #1093, and #1272, which have provided some useful options. For example, I have created a new column in cellColData that represents the combination of cluster and condition for each cell. However, I'm not sure how to use this column when creating pseudo-bulk replicates or performing peak calling.

Would it make sense in this case to pass "Sample" instead of "Clusters" to the groupBy parameter, since I am more interested in comparing samples than in differentiating between clusters?

I know this question is more about adapting ArchR analysis to my particular dataset, but any insight or feedback you have would be appreciated!

Thanks,
Replies: 1 comment 2 replies
Your clusters are extremely sample-specific, so I don't think that making cluster-level pseudobulks is very helpful for you. I can't say what the right approach is for your analysis, but if you're just looking for differential testing, then the pseudobulks aren't actually used. Instead, you use a column (that you could create) in cellColData, so you can define whatever groupings you want.
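As a rough illustration of the suggestion above, here is a minimal R sketch of defining a custom grouping column in cellColData and passing it to `getMarkerFeatures()` through `groupBy`. The helper names `make_group_labels` and `test_custom_groups`, the column name `ClusterByCondition`, and the `_x_` separator are all made up for this example and are not part of ArchR. It also assumes an existing ArchRProject with a `Clusters` column and the requested matrix already added.

```r
# Hypothetical helper (not part of ArchR): build one label per cell of the
# form "<cluster>_x_<condition>" from per-cell cluster and sample vectors.
# `sample_to_condition` is a named character vector: names are sample names,
# values are condition labels.
make_group_labels <- function(clusters, samples, sample_to_condition) {
  samples <- as.character(samples)
  stopifnot(length(clusters) == length(samples))
  stopifnot(all(samples %in% names(sample_to_condition)))
  conditions <- unname(sample_to_condition[samples])
  paste(clusters, conditions, sep = "_x_")
}

# Sketch: store the labels in cellColData, then test one custom group against
# another. Assumes `proj` is an ArchRProject that already has a "Clusters"
# column and the matrix named in `use_matrix`.
test_custom_groups <- function(proj, sample_to_condition, use_group, bgd_group,
                               use_matrix = "GeneScoreMatrix") {
  labels <- make_group_labels(proj$Clusters, proj$Sample, sample_to_condition)
  proj <- ArchR::addCellColData(
    ArchRProj = proj,
    data = labels,
    name = "ClusterByCondition",  # assumed column name
    cells = proj$cellNames,
    force = TRUE
  )
  ArchR::getMarkerFeatures(
    ArchRProj = proj,
    useMatrix = use_matrix,
    groupBy = "ClusterByCondition",
    useGroups = use_group,
    bgdGroups = bgd_group,
    bias = c("TSSEnrichment", "log10(nFrags)"),
    testMethod = "wilcoxon"
  )
}
```

A call might look like `test_custom_groups(proj, c(SampleA = "Control", SampleB = "Disease1"), use_group = "C1_x_Disease1", bgd_group = "C1_x_Control")`, where the sample, cluster, and condition names are placeholders for whatever is in your own project.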