Questions on applying RAFT to human genome assembly with corrected ONT simplex reads #4

@kuanchiun

Dear developers,

Thank you very much for developing RAFT and for the clear and insightful paper.
I am still relatively new to genome assembly, and I am currently assembling two haplotype-resolved human genomes. I would like to ask whether RAFT would be beneficial in my case.

Data Description:

  • ~45× ONT R10.4.1 simplex reads (basecalled and corrected with Dorado, then filtered to length > 10 kb with seqkit), read N50 ≈ 54 kb
  • ~20× Hi-C reads (for scaffolding)
  • ~75× Illumina WGS reads (for polishing)

Assembly workflow:
hifiasm (assembly) -> HapHiC (scaffolding) -> Dorado polish + NextPolish2 (polishing)

From your paper, I understand that:

  • Contained-read removal can be more problematic for ONT simplex data than for HiFi.
  • Higher coverage helps reduce the risk of gaps caused by contained reads.
  • RAFT mitigates such gaps by fragmenting reads to a more uniform length.
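
To make the third point concrete, here is a minimal Python sketch of naive fixed-length fragmentation. It is only an illustration of the idea and not RAFT's actual algorithm: `fragment_read` and `target_len` are hypothetical names, the 20 kb value simply mirrors the default fragment length mentioned in question 4, and any special handling of repetitive regions is not modeled.

```python
def fragment_read(seq: str, target_len: int = 20_000) -> list[str]:
    """Split one read into near-equal pieces of at most target_len bases.

    Hypothetical helper for illustration; not part of RAFT.
    """
    n = len(seq)
    if n <= target_len:
        # Short reads are left untouched.
        return [seq]
    n_pieces = -(-n // target_len)      # ceiling division
    base, extra = divmod(n, n_pieces)   # spread the remainder over the first pieces
    pieces, start = [], 0
    for i in range(n_pieces):
        end = start + base + (1 if i < extra else 0)
        pieces.append(seq[start:end])
        start = end
    return pieces


# A 54 kb read (about the N50 of my data) becomes three 18 kb fragments.
lengths = [len(p) for p in fragment_read("A" * 54_000)]
```

With fragments of similar length, fewer reads are fully contained in a longer read, which is how I understand the narrower length distribution to help.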

My questions are:

  1. Given that I already have ~45× Dorado-corrected ONT simplex reads, would you recommend running RAFT before hifiasm?
  2. I plan to increase coverage from 45× to at least 80×. In this case, would RAFT still be recommended before hifiasm?
  3. Are there specific coverage thresholds or data characteristics where RAFT provides the most benefit (e.g., >30× simplex, UL vs. duplex)?
  4. If I use RAFT, should I keep the default parameters (20 kb fragment length, µ=1.5), or do you suggest tuning them for human genome + simplex data?

Any guidance would be very helpful, especially for deciding whether RAFT should be integrated into my standard pipeline.
Thank you again for your work and for making RAFT available to the community!

Best Regards,
Kuan Chiun
