Why is k=100 default for creating kNN aggregates? #1518

RegnerM2015 · 2022-07-22T20:05:45Z

I am trying to gain more understanding and intuition regarding the creation of low-overlapping aggregates, or metacells, in ArchR procedures such as addPeak2GeneLinks(). The default size of these aggregates or groups is 100 cells:

addPeak2GeneLinks(
  ArchRProj = NULL,
  reducedDims = "IterativeLSI",
  useMatrix = "GeneIntegrationMatrix",
  dimsToUse = 1:30,
  scaleDims = NULL,
  corCutOff = 0.75,
  cellsToUse = NULL,
  k = 100
)

Do you have any ideas on how the downstream results may be affected if k were increased or decreased? Perhaps a lower k would lead to more spurious correlations, while a higher k would lead to more informative aggregates (albeit with less power, as there would be fewer aggregates on which to test the correlations). Did you ever benchmark this parameter to find out how it affects the number of significant P2Gs identified?

Thank you for your help! I look forward to gaining a deeper understanding of this process.

rcorces · 2022-07-23T22:38:47Z

Maintainer

I do not believe that we ever benchmarked this parameter. But I also don't believe that benchmarking it would be super informative for broad application. The specific value for k is largely dependent on your dataset size and the size of your clusters. I believe that a higher k with small clusters would lead to cross-cluster inclusion in the low-overlapping cell aggregates. k=100 is a pretty good starting point because if your cluster has really few cells (fewer than 200) then you aren't going to be able to do much with it anyway. And aggregating fewer than 100 cells might not give you enough signal to be informative / avoid high amounts of noise.
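
As a rough illustration of the cluster-size point above, here is a minimal sketch in base R (it does not call ArchR) that compares cluster sizes against a chosen k. The cluster sizes are made up, and both the helper `flag_small_clusters()` and its `min_ratio` argument are hypothetical names introduced for this example. Treating "fewer than 2 * k cells" as the cutoff is an assumption that generalizes the "fewer than 200 cells at k=100" remark; it is not a documented ArchR rule.

```r
# Hypothetical cluster sizes. In a real project these might come from
# something like table(proj$Clusters), assuming the cluster labels are
# stored in a cellColData column named "Clusters".
cluster_sizes <- c(C1 = 1500, C2 = 640, C3 = 180, C4 = 95)

# Flag clusters that look too small for a given k.
# Assumption: a cluster with fewer than min_ratio * k cells (200 when
# k = 100 and min_ratio = 2) is at risk of producing aggregates that
# include cells from neighbouring clusters.
flag_small_clusters <- function(sizes, k = 100, min_ratio = 2) {
  data.frame(
    cluster = names(sizes),
    n_cells = as.integer(sizes),
    cells_per_k = as.numeric(sizes) / k,
    too_small = as.numeric(sizes) < min_ratio * k,
    stringsAsFactors = FALSE
  )
}

report <- flag_small_clusters(cluster_sizes, k = 100)
print(report)
```

Re-running the helper with a smaller k (for example `flag_small_clusters(cluster_sizes, k = 50)`) shows which clusters stop being flagged, at the cost of fewer cells per aggregate and therefore potentially noisier signal.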

RegnerM2015 Jul 29, 2022
Author

Thank you for discussing! I appreciate your insights.
