This repository was archived by the owner on Jul 1, 2025. It is now read-only.
Speeding up comparisons for *very* large jobs on HPC #297
widdowquinn started this conversation in Ideas
Replies: 2 comments · 3 replies
- I don't follow this. Does it have to be a list? We only care about order in terms of dependencies, which are not built into the joblist itself (unless I'm misreading the code). Sets should be faster, and will implicitly prevent any duplicates. This may not be the only optimisation worth making, but it might help.
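For illustration, a minimal sketch of the suggestion (the name `build_joblist` and the tuple job representation are hypothetical, not pyani's actual structures): holding jobs in a set rather than a list, so duplicates are dropped on insertion and membership tests are constant-time.

```python
from itertools import combinations

def build_joblist(genomes):
    """Collect pairwise comparison jobs in a set.

    Guarding a list against duplicates costs an O(n) membership
    scan per insertion (O(n^2) overall); a set gives O(1) adds and
    lookups, and drops duplicates implicitly.
    """
    jobs = set()
    for query, subject in combinations(genomes, 2):
        jobs.add((query, subject))  # tuples are hashable, so set-safe
    return jobs

genomes = [f"genome_{i:04d}" for i in range(2500)]
print(len(build_joblist(genomes)))  # 2500 * 2499 / 2 = 3123750
```

The caveat is that jobs must be hashable (tuples, frozen dataclasses), and any dependency ordering would have to live elsewhere, which matches the observation above that order isn't built into the joblist itself.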
3 replies
- This has a proposed fix around garbage collection in #306.
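The details of #306 aren't reproduced here, but for context, a generic sketch of the kind of garbage-collection tweak that helps when millions of small objects are allocated in one batch: CPython's cyclic collector runs periodically during allocation, and pausing it until the batch is built can save substantial time. `gc_paused` is a hypothetical helper, not code from that PR.

```python
import gc
from contextlib import contextmanager

@contextmanager
def gc_paused():
    """Suspend CPython's cyclic garbage collector during a bulk
    allocation, then run one deliberate collection afterwards."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        if was_enabled:
            gc.enable()
            gc.collect()

# Hypothetical usage around job-list construction:
# with gc_paused():
#     jobs = build_joblist(genomes)
```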
- Even the task of compiling command lines to run can be slow with large enough inputs. Currently, the process of compiling command lines is serial. We might get some speed-up if we used a different approach. I note:
I think this gets us two speed-ups:
I'm currently hitting this issue with a 2.5k-genome job on a SLURM cluster: just generating the job list takes hours.
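As a sketch of what a non-serial approach could look like (the `nucmer` command template and helper names below are placeholders, not pyani's real invocation), per-pair command construction can be fanned out over worker processes; whether this wins in practice depends on how heavy each job is to build, since inter-process communication adds its own overhead:

```python
from itertools import combinations
from multiprocessing import Pool

def compile_command(pair):
    """Build one comparison command line (placeholder template)."""
    query, subject = pair
    return f"nucmer --mum -p {query}_vs_{subject} {query}.fna {subject}.fna"

def compile_all(genomes, processes=4):
    """Fan per-pair command construction out over worker processes
    instead of a single serial loop."""
    pairs = list(combinations(genomes, 2))
    with Pool(processes=processes) as pool:
        # chunksize batches the ~3.1M tiny tasks to keep IPC overhead low
        return pool.map(compile_command, pairs, chunksize=10_000)

if __name__ == "__main__":
    genomes = [f"genome_{i:04d}" for i in range(2500)]
    print(len(compile_all(genomes)))  # 3123750
```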