Cell aggregation for co-accessibility score calculation #981

isaLa42 · 2021-08-16T12:58:05Z

Hi,

I have a question regarding the cell aggregation prior to co-accessibility score calculation (also relevant for Peak2GeneLinkage). If I am not mistaken, you aggregate 100 cells in 500 groups as a default. The original implementation in Cicero only combines 50 cells per aggregate. Is there a reason why you increased this number?

Additionally, I am a bit worried about the "duplication" of cells in multiple aggregates/groups. If one does not adapt the aggregation settings (and does not have a dataset with > 50 000 cells), many cells will likely be "drawn" multiple times during aggregation. In the original Cicero publication, Pliner et al. discuss that "groups will sometimes contain some of the same cells, which could in principle inflate co-accessibility scores across cells." In their analysis, they kept the median number of cells shared between pairs of groups at zero. Since you don't discuss this issue in your manual, what is your take on the "duplication" of cells in multiple aggregates and the consequent inflation of correlation coefficients? Can we trust correlations computed from aggregates that share identical cells?
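To make the arithmetic behind the "> 50 000 cells" threshold concrete, here is a minimal Python sketch. It assumes the defaults quoted above (500 groups of 100 cells) and, as a simplification, draws cells uniformly at random rather than from nearest neighbours, so it only approximates how real aggregates overlap. The function names `mean_draws_per_cell` and `median_pairwise_overlap` are hypothetical helpers for illustration, not part of ArchR or Cicero.

```python
import random
from itertools import combinations
from statistics import median


def mean_draws_per_cell(n_cells, n_groups=500, group_size=100):
    """Average number of aggregates each cell lands in (the "duplication rate")."""
    return n_groups * group_size / n_cells


def median_pairwise_overlap(n_cells, n_groups, group_size, seed=0):
    """Median number of cells shared between pairs of randomly drawn groups.

    Assumption: groups are uniform random samples, not KNN neighbourhoods.
    """
    rng = random.Random(seed)
    groups = [set(rng.sample(range(n_cells), group_size)) for _ in range(n_groups)]
    return median(len(a & b) for a, b in combinations(groups, 2))


if __name__ == "__main__":
    # 500 * 100 = 50,000 draws spread over 5,000 cells
    print(mean_draws_per_cell(5_000))  # 10.0
    # Overlap statistic in the spirit of the Cicero check, on a small example
    print(median_pairwise_overlap(n_cells=5_000, n_groups=50, group_size=100))
```

With the default settings, a dataset needs 50,000 cells before the average cell is drawn only once; a 5,000-cell dataset gives a duplication rate of 10.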

I see increasing numbers of co-accessible links with increasing "cell duplication rates" during aggregation. I adapted your function to draw cells only once for aggregation. This reduced the number of detected links and their strengths, but I still find 3 links per peak on average (vs. 20 links per peak for cell duplication rates of 10). I feel a bit more comfortable with the correlation of accessibility in these independent cell aggregates. What is your take on this?
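For readers who want to try something similar, the following is one possible way to build aggregates in which each cell is drawn only once. It is a sketch under stated assumptions, not the adapted function described above (which is not shown in this thread) and not ArchR's implementation: cells are represented by hypothetical 2-D coordinates standing in for a reduced-dimension embedding, and each group is a randomly chosen seed cell plus its nearest still-unused neighbours.

```python
import math
import random


def disjoint_aggregates(coords, group_size, n_groups, seed=0):
    """Greedily build aggregates so that every cell is used at most once.

    coords: list of (x, y) tuples, one per cell (stand-in for an embedding).
    Returns a list of groups, each a list of cell indices.
    Stops early once fewer than group_size unused cells remain.
    """
    rng = random.Random(seed)
    unused = set(range(len(coords)))
    groups = []
    while len(groups) < n_groups and len(unused) >= group_size:
        seed_cell = rng.choice(sorted(unused))
        # Rank the remaining cells by distance to the seed; the seed itself is at distance 0.
        nearest = sorted(unused, key=lambda i: math.dist(coords[i], coords[seed_cell]))
        members = nearest[:group_size]
        groups.append(members)
        unused.difference_update(members)
    return groups


if __name__ == "__main__":
    rng = random.Random(1)
    cells = [(rng.random(), rng.random()) for _ in range(1_000)]
    groups = disjoint_aggregates(cells, group_size=50, n_groups=500)
    # Only 1,000 / 50 = 20 disjoint groups fit, however many are requested.
    print(len(groups))
```

The trade-off is visible in the group count: without reuse, the number of aggregates is capped at the number of cells divided by the group size, which reduces the number of data points available for each correlation.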

Thank you very much in advance for your answer.

Best,
Isabelle

Answered by rcorces

rcorces · 2021-08-16T15:09:40Z

rcorces
Aug 16, 2021
Maintainer

Hi Isabelle,
I think your interpretations are all correct. I believe we increased the size of the aggregates to 100 because we found this to work better, especially in larger datasets. But you can change this as you wish.

> In their analysis, they kept the median number of cells shared between pairs of groups to zero. Since you don't discuss this issue in your manual, what is your take on the "duplication" of cells in multiple aggregates and their consequent inflation of correlation coefficients? Can we trust correlation of identical cells in aggregates?

The same caveats apply. If your dataset is small, you should adjust the default parameters, as there isn't a one-size-fits-all solution. However, I believe we provide the ability to adjust all of the necessary parameters in both addCoAccessibility() and addPeak2GeneLink(). Please let me know if you feel that isn't correct.

2 replies

isaLa42 Aug 17, 2021
Author

Thanks for your reply. Yes, you have all necessary parameters in the functions to adjust cell aggregation.
However, it might be useful to add a few sentences to the manual that these parameters should be adjusted to the individual nature of the dataset. It might be just my ignorance, but I did not understand their importance by simply following the instructions provided.

rcorces Aug 17, 2021
Maintainer

Got it, thanks. I've made a note to add this to the documentation.
