Why is k=100 default for creating kNN aggregates? #1518
RegnerM2015 asked this question in Questions / Documentation
I do not believe that we ever benchmarked this parameter. I also don't believe that benchmarking it would be very informative for broad application. The appropriate value of k depends largely on the size of your dataset and the size of your clusters. I believe that a high k combined with small clusters would lead to cross-cluster inclusion in the low-overlapping cell aggregates.
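To make the cross-cluster point concrete, here is a minimal toy sketch in plain Python. It is not ArchR code and does not reproduce how ArchR builds its aggregates. It assumes two well-separated 2D Gaussian clusters (a small one and a large one) and plain Euclidean nearest neighbours; the function names `knn_aggregate`, `foreign_fraction`, and `simulate`, and all sizes, are hypothetical and chosen only for illustration.

```python
import random


def knn_aggregate(points, seed_idx, k):
    """Return the indices of the k cells nearest to the seed cell (seed included)."""
    sx, sy = points[seed_idx]
    by_dist = sorted(
        range(len(points)),
        key=lambda i: (points[i][0] - sx) ** 2 + (points[i][1] - sy) ** 2,
    )
    return by_dist[:k]


def foreign_fraction(points, labels, seed_idx, k):
    """Fraction of an aggregate's cells that come from a different cluster than the seed."""
    members = knn_aggregate(points, seed_idx, k)
    foreign = sum(labels[i] != labels[seed_idx] for i in members)
    return foreign / k


def simulate(n_small=30, n_large=300, ks=(10, 30, 100), rng_seed=0):
    """Mean foreign fraction over aggregates seeded in the small cluster, for each k."""
    rng = random.Random(rng_seed)
    # Toy data (assumption): small cluster around x=0, large cluster around x=50.
    points = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_small)]
    points += [(rng.gauss(50, 1), rng.gauss(0, 1)) for _ in range(n_large)]
    labels = ["small"] * n_small + ["large"] * n_large

    result = {}
    for k in ks:
        fracs = [foreign_fraction(points, labels, s, k) for s in range(n_small)]
        result[k] = sum(fracs) / len(fracs)
    return result


for k, frac in simulate().items():
    print(f"k={k}: mean foreign fraction = {frac:.2f}")
```

In this toy setup, aggregates seeded in the 30-cell cluster stay pure as long as k is at most 30. Once k exceeds the cluster size, every such aggregate has to pull in at least k - 30 cells from the other cluster, so at k = 100 at least 70% of each aggregate is foreign. Real data is less cleanly separated, so treat this only as intuition for why k should be considered relative to the size of your smallest cluster of interest.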
-
Hi @rcorces and @jeffmgranja,
I am trying to gain more understanding and intuition regarding the creation of low-overlapping aggregates, or metacells, in ArchR procedures such as addPeak2GeneLinks(). The default size of these aggregates or groups is 100 cells. Do you have any ideas on how the downstream results may be affected if k were increased or decreased? Perhaps a lower k may lead to more spurious correlations, while a higher k would lead to more informative aggregates (albeit with less power, as there would be fewer aggregates to test the correlations on). Did you ever benchmark this parameter to find out how it affects the number of significant P2Gs identified?
Thank you for your help! I look forward to gaining a deeper understanding of this process.