Question: Contig kmer counts instead of mapping?? #405

@shiraz-shah

What are your thoughts on profiling contigs without Burrows-Wheeler-style read mapping? Could VAMB work with such alternative sources of counts? I ask because mapping is currently the main bottleneck.

If reads were simply split into long enough k-mers, like the sizes typically used for assembly, exact matches of those k-mers against the contigs should be specific enough for unique assignment. Right? Or no? Sequencing errors would break exact string matches, but they are rare, especially if the reads have been quality-controlled.
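To make the idea concrete, here is a rough sketch in plain Python of what I mean. It is not how VAMB or bwa work, and all function names (`build_unique_index`, `count_hits`, `error_free_fraction`) are made up for illustration. It assumes contigs and reads are already in memory as strings, uses canonical k-mers so that reads from either strand match, and only counts k-mers that occur in exactly one contig.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    rc = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, rc)


def kmers(seq, k):
    """Yield canonical k-mers of seq, skipping any that contain non-ACGT characters."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            yield canonical(kmer)


def build_unique_index(contigs, k):
    """Map every k-mer that occurs in exactly one contig to that contig's name."""
    owners = defaultdict(set)
    for name, seq in contigs.items():
        for kmer in kmers(seq, k):
            owners[kmer].add(name)
    return {kmer: next(iter(names)) for kmer, names in owners.items() if len(names) == 1}


def count_hits(reads, index, k):
    """Count, per contig, the read k-mers that exactly match a contig-unique k-mer."""
    counts = defaultdict(int)
    for read in reads:
        for kmer in kmers(read, k):
            name = index.get(kmer)
            if name is not None:
                counts[name] += 1
    return dict(counts)


def error_free_fraction(error_rate, k):
    """Expected fraction of k-mers with no error, assuming independent per-base errors."""
    return (1.0 - error_rate) ** k


# Toy data; k=5 is far too short for real use and only keeps the example readable.
contigs = {"c1": "AAAAACCCCC", "c2": "AAAAAGGTGT"}
reads = ["AAAACCC", "ACACCTT", "AAAAAAA"]  # second read is from the reverse strand of c2
index = build_unique_index(contigs, k=5)
print(count_hits(reads, index, k=5))  # {'c1': 3, 'c2': 3}
```

The third read only contains `AAAAA`, which occurs in both contigs, so it is dropped as ambiguous. On the sequencing error point: with an assumed per-base error rate of 0.001 (an illustrative number, not a measurement) and k=31, `error_free_fraction(0.001, 31)` comes out at about 0.97, so roughly 3% of k-mers would fail to match. The raw hit counts would presumably still need normalising by the number of unique k-mers per contig before they resemble a depth.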

bwa mem with default settings is very unspecific for metagenomic profiling, and its output should be filtered with something like msamtools, which removes half of the read mappings, before generating e.g. an OTU table.

VAMB is magically able to see past bwa's noise and still define biologically meaningful clusters. Should it not be able to do the same with k-mer counts?

Labels: investigation (Discussion about metagenomics or binning)
