Running past the end of a BED #8

@rulixxx

Expected Behavior

Terminate correctly when iterating over a BED file for intersecting intervals.

Current Behavior

Iteration throws an error because it tries to read past the end of the stream.

Possible Solution / Implementation

This worked for me: I added an extra condition, `!eof(iter.reader.state.stream)`, to the inner `while` loop of `Indexes.done`:

function Indexes.done(iter::Indexes.TabixOverlapIterator, state)
    buffer = BioGenerics.IO.stream(iter.reader)
    source = buffer.stream
    if state.chunkid == 0
        if isempty(state.chunks)
            return true
        end
        state.chunkid += 1
        seek(source, state.chunks[state.chunkid].start)
    end
    while state.chunkid ≤ lastindex(state.chunks)
        chunk = state.chunks[state.chunkid]
        # `virtualoffset(source)` is not synchronized with the current reading position because data is buffered in `buffer` for text parsing.
        # So we need to check not only `virtualoffset` but also `bytesavailable`, which returns the size of the data currently buffered.
        while !eof(iter.reader.state.stream) && (bytesavailable(buffer) > 0 || BGZFStreams.virtualoffset(source) < chunk.stop)
            read!(iter.reader, state.record)
            c = Indexes.icmp(state.record, iter.interval)
            if c == 0  # overlapping
                return false
            elseif c > 0
                # no more overlapping records in this chunk
                break
            end
        end
        state.chunkid += 1
        if state.chunkid ≤ lastindex(state.chunks)
            seek(source, state.chunks[state.chunkid].start)
        end
    end
    # no more overlapping records
    return true
end
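
The effect of the added `!eof(...)` check can be illustrated in isolation. This is only a minimal sketch: it uses an in-memory `IOBuffer` in place of a BGZF stream, and `read_up_to` is a hypothetical helper that is not part of the package. Its second loop condition plays the role of the existing "more data expected" test, which on its own would keep the loop reading after the stream is exhausted.

```julia
# Hypothetical helper (not part of the package): collect at most `limit`
# lines from `io`, stopping early once the stream is exhausted.
function read_up_to(io::IO, limit::Int)
    records = String[]
    # Without the `!eof(io)` guard, the loop would keep calling `readline`
    # after the data has run out instead of terminating.
    while !eof(io) && length(records) < limit
        push!(records, readline(io))
    end
    return records
end

# Two BED-like lines, but the caller asks for up to five.
io = IOBuffer("chr1\t0\t10\nchr1\t20\t30\n")
records = read_up_to(io, 5)
```

Here the loop stops after the two available lines because `eof(io)` becomes true, even though the second condition still holds.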

Steps to Reproduce (for bugs)

Sorry, I encountered this some time ago, so I no longer have the BED files. It may have been triggered by working with concatenated bgzipped files.
