Zero usable reads when one of the input files has short reads #15

@Redmar-van-den-Berg

Describe the bug
The number of bases to take from each file is calculated without checking whether each read actually has that many bases available. When a read is too short, it is padded with N characters. Since HUMID skips words that contain an N, the affected reads are all discarded, which results in zero usable reads.
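
To make the failure mode concrete, here is a minimal Python sketch of the behaviour described above. It is not HUMID's actual code: the function names, the word length of 24, and the example sequences are all assumptions chosen for illustration. `length_aware_word` shows one possible way to handle short reads (letting longer reads supply the missing bases), not necessarily the fix the project should adopt.

```python
def naive_word(reads, word_length):
    """Take an equal share of bases from every read, padding short reads with N.

    Mirrors the behaviour described in this issue (illustrative only).
    """
    share = word_length // len(reads)
    return "".join(read[:share].ljust(share, "N") for read in reads)


def length_aware_word(reads, word_length):
    """Hypothetical alternative: longer reads cover what short reads cannot supply."""
    share = word_length // len(reads)
    # Never take more bases from a read than it actually has.
    takes = [min(len(read), share) for read in reads]
    missing = word_length - sum(takes)
    # Hand the shortfall to reads that still have unused bases.
    for i, read in enumerate(reads):
        extra = min(missing, len(read) - takes[i])
        takes[i] += extra
        missing -= extra
    return "".join(read[:n] for read, n in zip(reads, takes))


def is_usable(word, word_length):
    """A word is usable only if it is complete and contains no N."""
    return len(word) == word_length and "N" not in word


# Made-up example: the second read is shorter than its share of 8 bases.
reads = ["ACGTACGTACGTACGTACGT", "ACGTAC", "TTGCATTGCATTGCATTGCA"]

print(naive_word(reads, 24), is_usable(naive_word(reads, 24), 24))
print(length_aware_word(reads, 24), is_usable(length_aware_word(reads, 24), 24))
```

With the naive approach the short read contributes `ACGTACNN`, so the whole word is rejected. In the length-aware variant the word is still rejected when the reads together hold fewer bases than the word length, which seems like the correct outcome in that case.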

To Reproduce
R1-no-umi_dedup.fastq.gz
R2_dedup.fastq.gz
R3_dedup.fastq.gz

Steps to reproduce the behavior:

$ humid -s R1-no-umi.fastq.gz R2.fastq.gz R3.fastq.gz

Expected behavior
Since each file contains only a single read, the deduplicated output should also contain a single read. Instead, we get no output at all, and the stats.dat file reports 0 usable reads.

Desktop:

  • Using HUMID 1.0.2
