normalization of exported bigwigs files #734

Brawni · 2021-05-06T15:52:52Z

Hello!

This is more of a naive question than a bug, but I can't specify labels other than 'bug' anymore. When exporting bigWig files with the getGroupBW function, wouldn't it make sense to normalise by the number of cells as well as by ReadsInTSS per cluster, given that the coverage is the sum of fragments across cells?

Thank you!

Answered by rcorces

rcorces · 2021-05-06T16:22:24Z

Maintainer

@Brawni - We moved questions and feature requests to the Discussions section and only bug reports go into Issues now. I'm migrating your post to Discussions.

To answer your question - I'm not sure normalizing by the number of cells makes sense. What if you have multiple samples in your dataset and one of them is sequenced to 1/10th the depth? Then by normalizing to the number of cells you are skewing the data. ReadsInTSS is a pretty stable normalization that ignores how many cells there are (they are being grouped together into a single track after all) and only cares about the total number of fragments. Thinking of it another way, if you did bulk RNA-seq on 5,000 cells and on 50,000 cells, would you normalize the gene expression counts based on the number of cells input? The normal conventions (TPM/RPKM) don't take cell number into account. The same thing applies to ATAC-seq in a way, though we use ReadsInTSS instead of total reads because this also accounts for data quality. That being said, you can create arbitrary columns in cellColData and use those for normalization to see what happens.
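The depth argument above can be checked with a little arithmetic. The sketch below is plain Python, not ArchR code, and every number and name in it (`group_a`, `group_b`, `normalize`) is made up for illustration: two groups with the same number of cells and the same underlying signal, where group B comes from a sample sequenced to 1/10th the depth.

```python
import math

def normalize(bin_counts, factor):
    """Divide every bin of a pseudo-bulk track by one group-level factor."""
    return [count / factor for count in bin_counts]

# Hypothetical pseudo-bulk groups: same cell count, same relative signal,
# but group B was sequenced to 1/10th the depth of group A.
group_a = {"n_cells": 1000, "reads_in_tss": 2_000_000, "bins": [400, 1200, 800]}
group_b = {"n_cells": 1000, "reads_in_tss": 200_000, "bins": [40, 120, 80]}

by_cells_a = normalize(group_a["bins"], group_a["n_cells"])
by_cells_b = normalize(group_b["bins"], group_b["n_cells"])
by_tss_a = normalize(group_a["bins"], group_a["reads_in_tss"])
by_tss_b = normalize(group_b["bins"], group_b["reads_in_tss"])

# Dividing by cell count leaves the tenfold depth difference in the tracks.
cell_ratios = [a / b for a, b in zip(by_cells_a, by_cells_b)]
print("A/B ratio per bin, normalized by cell count:", cell_ratios)

# Dividing by summed ReadsInTSS puts both groups on the same scale.
tss_tracks_match = all(math.isclose(a, b) for a, b in zip(by_tss_a, by_tss_b))
print("Tracks match after ReadsInTSS normalization:", tss_tracks_match)
```

With these toy numbers the cell-count tracks differ by the depth ratio, while the ReadsInTSS tracks coincide. Real data will not be this clean, since ReadsInTSS also reflects data quality.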

3 replies

Brawni May 6, 2021
Author

Hi Ryan!

Thanks for your prompt reply and sorry to have posted in the wrong section. So the resulting track coverage takes the average of the sum of fragments in each cell per bin and corrects by ReadsInTSS?

rcorces May 6, 2021
Maintainer

Not quite. getGroupBW() performs normalization at the pseudo-bulk level, not at the per-cell level. So it takes the sum of fragments from all cells in your group and normalizes that to the sum of ReadsInTSS from all cells in your group. The normalization is not performed on a per-cell basis.
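To make the distinction concrete, here is a minimal Python sketch (not ArchR code) contrasting the pseudo-bulk calculation described above with the per-cell average from the question. The cell records and function names are hypothetical, and any additional constant scaling that ArchR may apply to the track is left out.

```python
import math

def pseudobulk_track(cells):
    """Sum fragments per bin over all cells, then divide by the summed ReadsInTSS."""
    n_bins = len(cells[0]["bins"])
    summed_bins = [sum(cell["bins"][i] for cell in cells) for i in range(n_bins)]
    total_tss = sum(cell["reads_in_tss"] for cell in cells)
    return [count / total_tss for count in summed_bins]

def per_cell_average_track(cells):
    """Normalize each cell by its own ReadsInTSS, then average across cells."""
    n_bins = len(cells[0]["bins"])
    per_cell = [[c / cell["reads_in_tss"] for c in cell["bins"]] for cell in cells]
    return [sum(track[i] for track in per_cell) / len(cells) for i in range(n_bins)]

# Two hypothetical cells in one group, three bins each.
cells = [
    {"bins": [2, 0, 4], "reads_in_tss": 100},
    {"bins": [1, 1, 0], "reads_in_tss": 50},
]

bulk = pseudobulk_track(cells)            # group-level: [3, 1, 4] / 150
averaged = per_cell_average_track(cells)  # per-cell normalization, then mean
print("pseudo-bulk:     ", bulk)
print("per-cell average:", averaged)
```

The two results differ whenever cells have different ReadsInTSS values, because the pseudo-bulk version effectively weights each cell by its depth while the per-cell average weights every cell equally.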

Also, looking at the code, I don't think that arbitrary normalization based on any entry in cellColData is allowed (I misspoke before). It can only be ReadsInTSS, ReadsInPromoter, nFrags, or None:

ArchR/R/GroupExport.R

Line 352 in 968e442

if(tolower(normMethod) %in% c("readsintss", "readsinpromoter", "nfrags")){
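As a rough illustration of what that check implies, the Python sketch below (again, not ArchR code) mirrors the case-insensitive comparison in the quoted line and sums the chosen column over the cells in a group. The function name, the dictionary-based cell records, the factor of 1 for "None", and the error raised for any other value are all assumptions made for the example, not behaviour taken from the ArchR source.

```python
# Lower-cased method name -> column to sum, mirroring the three values in the quoted check.
SUMMED_COLUMNS = {
    "readsintss": "ReadsInTSS",
    "readsinpromoter": "ReadsInPromoter",
    "nfrags": "nFrags",
}

def group_norm_factor(norm_method, cells):
    """Return a group-level normalization factor for a list of per-cell records."""
    key = norm_method.lower()  # case-insensitive, like tolower(normMethod)
    if key in SUMMED_COLUMNS:
        column = SUMMED_COLUMNS[key]
        return sum(cell[column] for cell in cells)
    if key == "none":
        return 1  # assumption: no normalization is treated as dividing by 1
    # Assumption: anything else, including an arbitrary cellColData column, is rejected.
    raise ValueError(f"unsupported normMethod: {norm_method!r}")

# Hypothetical per-cell metadata for one group.
group_cells = [
    {"ReadsInTSS": 120, "ReadsInPromoter": 300, "nFrags": 5000},
    {"ReadsInTSS": 80, "ReadsInPromoter": 200, "nFrags": 3000},
]

print(group_norm_factor("ReadsInTSS", group_cells))
print(group_norm_factor("nfrags", group_cells))
```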

Brawni May 6, 2021
Author

Got it! Thanks!!
