peak2gene links parameters and reproducibility #1728

wangmhan · 2022-11-08T10:13:04Z

Thank you very much for the nice package!

I would like to reconstruct a regulatory network, so I am using peak2gene links as the basis for identifying correlated region and target gene pairs.
Our data cover 2 consecutive stages and we saw batch bias, so we ran the analysis independently for each stage. Surprisingly, the peak2gene links we identified have very little overlap when we compare them on the combined key peakName + positive/negative correlation + geneName. The overlap improves if we only check peakName, but it is still less than 50% (whereas I saw >50% overlap in the cortex paper between scATAC-only and scMultiome data).
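The two comparisons described above (the strict key versus peakName only) can be sketched as set operations. This is only an illustration in Python, not ArchR code: the `Link` record, the helper names, and the toy links are hypothetical, and the choice to divide by the smaller set is an assumption, since the post does not say how the overlap percentage was computed.

```python
from typing import Iterable, NamedTuple, Set, Tuple


class Link(NamedTuple):
    """Hypothetical record for one peak2gene link (not an ArchR type)."""
    peak: str
    gene: str
    correlation: float


def strict_keys(links: Iterable[Link]) -> Set[Tuple[str, str, str]]:
    """Key each link by peakName + sign of correlation + geneName."""
    return {
        (link.peak, "pos" if link.correlation > 0 else "neg", link.gene)
        for link in links
    }


def peak_keys(links: Iterable[Link]) -> Set[str]:
    """Key each link by peakName only."""
    return {link.peak for link in links}


def overlap_fraction(a: set, b: set) -> float:
    """Shared keys divided by the size of the smaller set (assumed definition)."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


# Toy links, invented purely for illustration.
stage1 = [
    Link("chr1:100-600", "GeneA", 0.60),
    Link("chr1:900-1400", "GeneB", -0.50),
    Link("chr2:50-550", "GeneC", 0.70),
]
stage2 = [
    Link("chr1:100-600", "GeneA", 0.55),
    Link("chr1:900-1400", "GeneB", 0.50),  # same peak and gene, opposite sign
    Link("chr3:10-510", "GeneD", 0.80),
]

strict_overlap = overlap_fraction(strict_keys(stage1), strict_keys(stage2))
peak_overlap = overlap_fraction(peak_keys(stage1), peak_keys(stage2))
print(f"strict key overlap: {strict_overlap:.2f}")
print(f"peak-only overlap:  {peak_overlap:.2f}")
```

The strict key can never give a higher overlap than the peak-only key, because a link that flips sign or changes target gene between stages still shares its peak.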
Do you have any suggestions or comments on why the links can be so different between very similar embryonic stages? And is there a way to optimise the analysis so that the peak2gene links are more reproducible? I'm currently using the default parameters, except that I changed impute=F. I tuned k in a test run; in general, the number of links decreased when k was lower. So when the cell number is n>=200, I kept k=100.

One more question: my sample size is ~3k-6k cells. I did fine clustering to get better resolution for subtypes, which gives sample sizes of 200-1k cells per subtype. Would you recommend running the analysis on the overall population or on each subtype separately? We are more interested in subtype networks, but we are also aware that reducing the sample size reduces variability and therefore detection power.

Looking forward to the reply & thank you in advance!

rcorces · 2022-11-08T13:31:43Z

Maintainer

I don't have much advice to provide here as you're doing something that I have never done. Peak2gene links require variability to drive the correlations. So it really doesn't make sense to perform peak2gene link identification separately in order to identify sample-specific links. In fact, you may lose sample-specific links by doing this because you squash variation. It also sounds like you don't have very many cells in your dataset.
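The point about variability driving the correlations can be illustrated with a toy simulation. This is a sketch under stated assumptions, not a model of ArchR's procedure: the two simulated "stages", the Gaussian noise, the stage levels, and the function names are all invented for the example. A peak and a gene that both shift between stages correlate strongly when the stages are pooled, but show almost no correlation within either stage, where only independent noise remains.

```python
import random
from typing import List, Tuple


def pearson(xs: List[float], ys: List[float]) -> float:
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_stage(
    rng: random.Random, n_cells: int, stage_level: float
) -> Tuple[List[float], List[float]]:
    """Peak accessibility and gene expression share a stage-level shift.

    Within a stage the two signals differ only by independent noise, so
    the stage shift is the sole source of shared variation (an assumption
    made for this illustration).
    """
    accessibility = [stage_level + rng.gauss(0, 1) for _ in range(n_cells)]
    expression = [stage_level + rng.gauss(0, 1) for _ in range(n_cells)]
    return accessibility, expression


rng = random.Random(0)  # fixed seed so the run is repeatable
acc_a, expr_a = simulate_stage(rng, 500, stage_level=0.0)
acc_b, expr_b = simulate_stage(rng, 500, stage_level=3.0)

r_stage_a = pearson(acc_a, expr_a)
r_stage_b = pearson(acc_b, expr_b)
r_pooled = pearson(acc_a + acc_b, expr_a + expr_b)

print(f"within stage A: r = {r_stage_a:.2f}")
print(f"within stage B: r = {r_stage_b:.2f}")
print(f"pooled stages:  r = {r_pooled:.2f}")
```

In this toy setup, splitting by stage removes exactly the variation that produced the link, which is one way separate per-stage runs could return different and poorly overlapping link sets.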

wangmhan · 2022-11-08T14:36:43Z

Author

Hi Ryan, thank you for the quick reply. I'm not aiming to do sample-wise analysis. The reason I am doing so is simply that we have a strong batch effect. Though it can be eliminated by Harmony on projections, the raw data is not changed. Do you have any suggestions to remove these batch factors from the raw data, so that I can merge my data, and run peak2gene analysis? The reason I have struggled so much with the peak2gene link analysis is that it is important to us for two reasons. First, it can give a list of candidate enhancers. Second, it is the basis of the regulatory network, as we really want to get the potential TF-target pairs using "peak regions" as a bridge. And yes, we don't have a big sample size, which also limits the analysis. Is there a suggested sample size that is big enough for this analysis? Thanks again!

rcorces · 2022-11-08T16:22:25Z

Maintainer

> Though it can be eliminated by Harmony on projections, the raw data is not changed. Do you have any suggestions to remove these batch factors from the raw data, so that I can merge my data, and run peak2gene analysis?

You don't need to adjust the raw data (nor is that really possible). Just use the Harmony reduced dimensions and this will limit the contribution of batch effect.
