Description
New issue checks
- I have read the Dorado Documentation.
- I did not find an existing issue.
Dorado version
1.3.1
Dorado subcommand
Polish
The issue
Hello,
I'm trying to get dorado polish working properly. Having given up on GPU (likely a Nextflow submission problem) and Slurm (out-of-control RAM use), I am now trying to debug the polishing RAM requirements on a local server. I've read similar issues such as #1524.
Server specs - 64 cores, 768 GB RAM.
I have made over 30 attempts with various combinations of threads (8-24), batch sizes (0, 128) and infer-threads (2-24), with no success. This is for a medium-sized 800 MB plant genome. With medaka I regularly polish genomes of this size and larger.
The result is always the same: the process runs out of RAM and is OOM-killed.
Command:
[2026-01-22 12:10:30.458] [info] Running: "polish" "-x" "cpu" "--batchsize" "128" "--ignore-read-groups" "--threads" "8" "--infer-threads" "2" "dorado_align_bams.bam" "contigs-unpolished.fasta"
Surely this server should be big enough to handle dorado polish? No other jobs are running. I have bigger servers, but people would be unhappy to see me using 2 TB of RAM with only 8 cores. Or does RAM use not increase with more cores?
Options going forward
- drastically reduce coverage (perhaps 20-30x is sufficient; I often have 50-80x)
- use medaka instead
- optimize settings?
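For the first option, the arithmetic can be sketched roughly as below. This is only a minimal illustration, assuming reads are dropped uniformly at random so that coverage scales linearly with the fraction of reads kept. The function names and the example numbers are hypothetical, not part of dorado; the resulting fraction would still have to be passed to whatever BAM subsampling tool is used.

```python
def mean_coverage(total_aligned_bases: float, genome_size_bp: float) -> float:
    """Approximate mean depth as aligned bases divided by genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return total_aligned_bases / genome_size_bp


def subsample_fraction(current_cov: float, target_cov: float) -> float:
    """Fraction of reads to keep to go from current_cov to target_cov.

    Assumes uniform random subsampling; capped at 1.0 (keep everything)
    when the data is already at or below the target.
    """
    if current_cov <= 0 or target_cov <= 0:
        raise ValueError("coverages must be positive")
    return min(1.0, target_cov / current_cov)


if __name__ == "__main__":
    # Hypothetical example: 64 Gb of aligned bases on an 800 Mb genome.
    cov = mean_coverage(64e9, 800e6)
    frac = subsample_fraction(cov, 30.0)
    print(f"current ~{cov:.0f}x, keep {frac:.3f} of reads for ~30x")
```

Whether 20-30x is actually enough for polishing is the open question here; the sketch only shows how the kept fraction would be derived.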
Thanks
Colin
System specifications
Linux, Nextflow, Slurm or local