@peakSet Column Names #867
-
Hi, I was just curious about the meaning of the column names in proj@peakSet. What do score, replicateScoreQuantile, and groupScoreQuantile mean, and how are they calculated? Additionally, how do distToGeneStart and distToTSS differ? Is this due to a gene allele potentially having a different TSS because of a mutation? Thanks!
Replies: 3 comments 2 replies
-
-
Hi, as a follow-up, I am trying to find the overlap between the ArchR clusters and some peaks of interest. I then want to normalize by the number of peaks. My peakSet has the group replicate listed as the cluster followed by either Rep1 or Rep2 (below is an example for C1). Are there any significant differences between the two, or is this just an output of the peak calling? Also, is there another way to determine the number of peaks per cluster, or just this? Thank you so much!
-
Ok, that's what I thought. Thanks!
Apologies for the delay. This wasn't a short answer.
When you create a `peakSet` using `addReproduciblePeakSet()`, this runs an iterative overlap procedure where peaks are identified based on the pseudobulk replicates created using `addGroupCoverages()`. Using these pseudobulk replicates, ArchR identifies peaks that are reproducible across those replicates for each group indicated by the `groupBy` argument. The score of those peak calls is stored in the `score` column, and the quantile rank of that score for the individual pseudobulk replicate that it was identified in is stored in `replicateScoreQuantile`. After creating these replicate-based peak calls, ArchR then creates a merged/union peak set acros…