Questions on applying RAFT to human genome assembly with corrected ONT simplex reads

Dear developers,

Thank you very much for developing RAFT and for the clear and insightful paper. 
I am still relatively new to genome assembly, and I am currently assembling two haplotype-resolved human genomes. I would like to ask whether RAFT would be beneficial in my case.

Data Description:
- ~45× ONT R10.4.1 simplex reads (basecalled and corrected with Dorado, filtered with seqkit to keep reads > 10 kb), read N50 ≈ 54 kb
- ~20× Hi-C reads (for scaffolding)
- ~75× Illumina WGS reads (for polishing)
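
For reference, a minimal sketch of how figures such as coverage and read N50 can be derived from a list of read lengths. The function names and the 3.1 Gb haploid genome size are illustrative assumptions, not part of RAFT, hifiasm, or seqkit.

```python
def read_n50(lengths):
    """Return the read N50: the length L such that reads of length >= L
    together contain at least half of all sequenced bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


def coverage(lengths, genome_size=3.1e9):
    """Estimate depth as total bases divided by an assumed genome size."""
    return sum(lengths) / genome_size


# Toy read lengths in bp (made up for illustration, not real data).
toy = [60_000, 54_000, 30_000, 12_000, 11_000]
print(read_n50(toy))
print(coverage(toy, genome_size=167_000))
```

On real data the read lengths would come from the filtered FASTQ, and the genome size should match the species being assembled.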

Assembly workflow:
hifiasm (assembly) -> HapHiC (scaffolding) -> Dorado polish + NextPolish2 (polishing)

From your paper, I understand that:
- Contained-read removal can be more problematic for ONT simplex data than for HiFi.
- Higher coverage helps reduce the risk of gaps caused by contained reads.
- RAFT mitigates such gaps by fragmenting reads to a more uniform length.

My questions are:
1. Given that I already have ~45× Dorado-corrected ONT simplex reads, would you recommend running RAFT before HiFiasm?
2. I plan to increase coverage from 45× to at least 80×. In this case, would RAFT still be recommended before HiFiasm?
3. Are there specific coverage thresholds or data characteristics where RAFT provides the most benefit (e.g., >30× simplex, UL vs duplex, etc.)?
4. If I use RAFT, should I keep the default parameters (20 kb fragment length, µ=1.5), or do you suggest tuning them for human genome + simplex data?

Any guidance would be very helpful, especially for deciding whether RAFT should be integrated into my standard pipeline.
Thank you again for your work and for making RAFT available to the community!

Best regards,
Kuan Chiun
