@peakSet Column Names #867

agalianese · 2021-07-02T20:34:23Z

agalianese
Jul 2, 2021

Hi, I was just curious about the meaning of the column names in proj@peakSet. What do score, replicateScoreQuantile, and groupScoreQuantile mean, and how are they calculated? Additionally, how do distToGeneStart and distToTSS differ? Is this due to a gene allele potentially having a different TSS because of a mutation? Thanks!

rcorces · 2021-07-06T16:09:41Z

rcorces
Jul 6, 2021
Maintainer

Apologies for the delay. This wasn't a short answer.

When you create a peakSet using addReproduciblePeakSet(), ArchR runs an iterative overlap procedure in which peaks are identified from the pseudobulk replicates created by addGroupCoverages(). Using these pseudobulk replicates, ArchR identifies peaks that are reproducible across those replicates for each group indicated by the groupBy argument. The score of each peak call is stored in the score column, and the quantile rank of that score within the individual pseudobulk replicate in which it was identified is stored in replicateScoreQuantile. After creating these replicate-based peak calls, ArchR then creates a merged/union peak set across the group. The quantile rank of the peak score across the group is stored in groupScoreQuantile, and the group it comes from is stored in GroupReplicate.
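To make the two quantile columns concrete, here is a minimal base R sketch on made-up scores. It assumes the quantile rank is simply rank divided by the number of peaks, which is an assumption about the calculation and not a copy of ArchR's internal code. It also skips the overlap-merging step, so it only illustrates "ranked within one replicate" versus "ranked across the whole group". The `peaks` data frame and the `quantile_rank` helper are hypothetical names.

```r
# Toy peak calls for one group (C1) with two pseudobulk replicates.
# All scores are made-up values for illustration only.
peaks <- data.frame(
  GroupReplicate = c("C1._.Rep1", "C1._.Rep1", "C1._.Rep1",
                     "C1._.Rep2", "C1._.Rep2"),
  score = c(12.5, 3.1, 48.0, 7.7, 20.2)
)

# Assumed definition of a quantile rank: rank / number of values.
quantile_rank <- function(x) rank(x) / length(x)

# Rank each score only against peaks from the same pseudobulk replicate.
peaks$replicateScoreQuantile <- ave(peaks$score, peaks$GroupReplicate,
                                    FUN = quantile_rank)

# Rank each score against all peaks retained for the group.
peaks$groupScoreQuantile <- quantile_rank(peaks$score)

print(peaks)
```

On a real project, the actual columns can be inspected with something like `head(proj@peakSet)`, where `proj` is your own ArchRProject.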

The difference between distToGeneStart and distToTSS depends on your geneAnnotation.
Gene starts are determined by resize(geneAnnotation$genes, 1, "start").
TSSs are determined by resize(geneAnnotation$TSS, 1, "start").
In the default hg38 geneAnnotation, there are 23,274 genes and 49,052 TSSs, so a peak's nearest TSS need not coincide with its nearest gene start.
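As a rough illustration of why the two distances can differ, the base R sketch below uses a hypothetical gene with two annotated TSSs. The `strand_start` helper mimics what resize(x, 1, "start") returns for a range (the start coordinate on the "+" strand, the end coordinate on the "-" strand). Taking the distance to the nearest such position is an assumption here, and all coordinates and names are made up.

```r
# Strand-aware 1-bp "start" of a range: start on "+", end on "-".
# This mirrors the behaviour of resize(x, 1, "start") on a GRanges.
strand_start <- function(start, end, strand) {
  ifelse(strand == "-", end, start)
}

# Hypothetical annotation: one gene record, but two TSSs for that gene.
genes <- data.frame(symbol = "GENE1", start = 1000, end = 9000, strand = "+")
tss <- data.frame(tx = c("tx1", "tx2"),
                  start = c(1000, 6000),
                  end = c(1000, 6000),
                  strand = c("+", "+"))

gene_starts <- strand_start(genes$start, genes$end, genes$strand)
tss_positions <- strand_start(tss$start, tss$end, tss$strand)

# Hypothetical peak position on the same chromosome.
peak_center <- 6200

# Distance to the nearest gene start versus the nearest TSS.
distToGeneStart <- min(abs(peak_center - gene_starts))
distToTSS <- min(abs(peak_center - tss_positions))

cat("distToGeneStart:", distToGeneStart, "\n")
cat("distToTSS:", distToTSS, "\n")
```

In this toy case the peak sits next to the second TSS, so distToTSS is small while distToGeneStart is measured from the single gene start much further upstream.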

1 reply

agalianese Jul 6, 2021
Author

Great! Thank you so much!

agalianese · 2021-09-24T18:39:49Z

agalianese
Sep 24, 2021
Author

Hi, as a follow-up, I am trying to find the overlap between the ArchR clusters and some peaks of interest. I then want to normalize by the number of peaks.

My peakSet has the GroupReplicate as the cluster, followed by either Rep1 or Rep2 (below is an example for C1). Are there any significant differences between the two, or is this just an output of the peak calling? Also, is there another way to determine the number of peaks per cluster, or just this? Thank you so much!

C1._.Rep1 | 852
C1._.Rep2 | 1974

1 reply

rcorces Sep 26, 2021
Maintainer

In the future, it helps other users more if you open a new discussion topic so that things are easier to search and find.

For what it's worth, I don't understand your question well enough to provide a clear answer. I think it's important to remember that peak calling is highly imperfect and highly dependent on total sequencing depth (or cells per cluster in scATAC), and that doing raw overlaps or looking at the number of peak calls from a given sample is somewhat dubious.

agalianese · 2021-09-27T15:34:04Z

agalianese
Sep 27, 2021
Author

Ok, that's what I thought. Thanks!
