-
This is an error in the documentation that is slated to be fixed in the next release. Iterative LSI is actually deterministic. You can confirm this for yourself:
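For example, something along these lines (a rough sketch, assuming a loaded ArchRProject called `proj`; the reduction names "LSI_run1"/"LSI_run2" are just placeholders) — run `addIterativeLSI()` twice with identical parameters and compare the embeddings:

```r
library(ArchR)

# Run iterative LSI twice with identical parameters and an explicit seed,
# storing the results under two different names.
proj <- addIterativeLSI(ArchRProj = proj, useMatrix = "TileMatrix",
                        name = "LSI_run1", seed = 1, force = TRUE)
proj <- addIterativeLSI(ArchRProj = proj, useMatrix = "TileMatrix",
                        name = "LSI_run2", seed = 1, force = TRUE)

# If iterative LSI is deterministic, the two embeddings are identical.
identical(getReducedDims(proj, reducedDims = "LSI_run1"),
          getReducedDims(proj, reducedDims = "LSI_run2"))
```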
-
Hello, thank you for the great software + documentation!
I wanted to ask about the non-deterministic behaviour of iterative LSI (https://www.archrproject.com/bookdown/iterative-latent-semantic-indexing-lsi.html). Although the run-to-run differences are small when I keep all parameters the same, the randomness introduced here means that downstream analysis is not reproducible. We (and, I imagine, other groups) are very keen to build fully reproducible analysis pipelines, and that does not seem possible with this way of running dimensionality reduction. The problem is all the more prominent because dimensionality reduction is one of the first steps in the workflow.
My understanding is that TF-IDF normalisation, SVD dimensionality reduction, and graph-based clustering are not inherently non-deterministic (I believe Signac runs these steps deterministically). Is the subsampling of cells the step that introduces randomness into iterative LSI, or is it another step? And is there any way to circumvent or fix this randomness so that the results are reproducible?
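To illustrate what I mean, here is a toy sketch (my own illustration, not ArchR's actual implementation) showing that plain TF-IDF followed by SVD has no hidden randomness — two runs on the same input produce identical embeddings:

```r
# Toy feature-by-cell binary matrix (the seed is only used to build
# reproducible input data, not by the TF-IDF/SVD steps themselves).
set.seed(42)
mat <- matrix(rbinom(200 * 50, 1, 0.3), nrow = 200)

tfidf_svd <- function(m, k = 10) {
  tf  <- t(t(m) / colSums(m))           # term frequency per cell
  idf <- log(1 + ncol(m) / rowSums(m))  # up-weight rarer features
  svd(tf * idf, nu = 0, nv = k)$v       # base-R (LAPACK) SVD, deterministic
}

identical(tfidf_svd(mat), tfidf_svd(mat))  # TRUE -- no hidden RNG use
```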
Iterative LSI is really fantastic and my UMAP visualisations look much better using ArchR's implementation as opposed to Signac's dimensionality reduction, so I am really keen to understand how it works and try to implement it in a reproducible workflow.
I appreciate any insights, thank you!