Dear developers,
Thank you very much for developing RAFT and for the clear and insightful paper.
I am still relatively new to genome assembly, and I am currently assembling two haplotype-resolved human genomes. I would like to ask whether RAFT would be beneficial in my case.
Data Description:
- ~45× ONT R10.4.1 simplex reads (basecalled and corrected with Dorado, filtered with seqkit to keep reads longer than 10 kb), read N50 ≈ 54 kb
- ~20× Hi-C reads (for scaffolding)
- ~75× Illumina WGS reads (for polishing)
Assembly workflow:
hifiasm (assembly) -> HapHiC (scaffolding) -> Dorado polish + NextPolish2 (polishing)
From your paper, I understand that:
- Contained-read removal can be more problematic for ONT simplex data than for HiFi.
- Higher coverage helps reduce the risk of gaps caused by contained reads.
- RAFT mitigates such gaps by fragmenting reads to a more uniform length.
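To check that I have the last point right, here is a minimal Python sketch of how I picture the fragmentation idea. It is only my assumption of the concept, not RAFT's actual implementation, and it ignores anything RAFT does beyond plain splitting. The function `fragment_read` and the toy reads are hypothetical and exist only for illustration; the 20 kb value is the default fragment length I mention below.

```python
def fragment_read(seq: str, fragment_len: int = 20_000) -> list[str]:
    """Split one read into consecutive pieces of at most fragment_len bases.

    Illustrative only: a naive fixed-length split, not RAFT's algorithm.
    """
    if fragment_len <= 0:
        raise ValueError("fragment_len must be positive")
    return [seq[i:i + fragment_len] for i in range(0, len(seq), fragment_len)]


# Toy reads (hypothetical): one near my read N50, one shorter than a fragment.
reads = {"read_a": "A" * 54_000, "read_b": "C" * 12_000}

fragments = {name: fragment_read(seq) for name, seq in reads.items()}
lengths = {name: [len(f) for f in frags] for name, frags in fragments.items()}

for name, lens in lengths.items():
    print(name, lens)
```

In this toy case the 54 kb read becomes pieces no longer than 20 kb, while the 12 kb read is left as a single piece, so the length distribution is narrower than before.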
My questions are:
- Given that I already have ~45× Dorado-corrected ONT simplex reads, would you recommend running RAFT before hifiasm?
- I plan to increase coverage from 45× to at least 80×. At that coverage, would you still recommend running RAFT before hifiasm?
- Are there specific coverage thresholds or data characteristics where RAFT provides the most benefit (e.g., >30× simplex, UL vs duplex, etc.)?
- If I use RAFT, should I keep the default parameters (20 kb fragment length, µ=1.5), or do you suggest tuning them for human genome + simplex data?
Any guidance would be very helpful, especially for deciding whether RAFT should be integrated into my standard pipeline.
Thank you again for your work and for making RAFT available to the community!
Best Regards,
Kuan Chiun