Produce name-sorted CRAM #59

@vinjana

Do not discard the alignment information; instead, produce a name-sorted CRAM. Keeping the alignment information allows for high reference-based compression.
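As a rough sketch of what this could look like, the helper below only builds a `samtools sort` command line that name-sorts the input and writes CRAM against a reference. The function name and the file paths are hypothetical; `-n`, `-O cram`, `--reference` and `-o` are standard `samtools sort` options, but this is not necessarily how the tool would end up calling it.

```python
from typing import List


def name_sorted_cram_cmd(input_path: str, reference_fasta: str, output_path: str) -> List[str]:
    """Build (but do not run) a samtools command producing a name-sorted CRAM.

    Hypothetical helper for illustration only; paths are supplied by the caller.
    """
    return [
        "samtools", "sort",
        "-n",                            # sort by read name instead of coordinate
        "-O", "cram",                    # write CRAM output
        "--reference", reference_fasta,  # reference used for reference-based compression
        "-o", output_path,
        input_path,
    ]


if __name__ == "__main__":
    # Print the command so it can be inspected before running it,
    # e.g. via subprocess.run(cmd, check=True).
    print(" ".join(name_sorted_cram_cmd("in.bam", "ref.fa", "out.cram")))
```

The list form can be passed directly to `subprocess.run` without shell quoting issues.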

In the SAM format, the alignment and the sequence information are stored separately anyway (CIGAR string, position, and strand information vs. a separate sequence column).

Secondary and supplementary alignments should (optionally) be discarded to improve compression.

SAM attributes might also be discarded to save more space.

The result would contain just sequences and alignments, in order to maximize the compression level.
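To make the proposed reduction concrete, here is a minimal sketch that works on plain SAM text lines. It drops records with the secondary (0x100) or supplementary (0x800) FLAG bit set and truncates each record to the 11 mandatory SAM columns, which removes all optional attributes. The function name and its parameters are hypothetical, and keeping the QUAL column is an assumption, since the issue does not say whether base qualities should stay.

```python
from typing import Optional

SECONDARY = 0x100       # SAM FLAG bit: secondary alignment
SUPPLEMENTARY = 0x800   # SAM FLAG bit: supplementary alignment
MANDATORY_FIELDS = 11   # QNAME .. QUAL; everything after is optional attributes


def slim_sam_line(line: str, drop_secondary: bool = True, drop_tags: bool = True) -> Optional[str]:
    """Return the reduced SAM record, or None if the record is discarded.

    Header lines (starting with '@') are passed through unchanged.
    """
    line = line.rstrip("\n")
    if line.startswith("@"):
        return line
    fields = line.split("\t")
    flag = int(fields[1])
    if drop_secondary and flag & (SECONDARY | SUPPLEMENTARY):
        return None
    if drop_tags:
        # Assumption: all optional attributes go, including RG and NM.
        fields = fields[:MANDATORY_FIELDS]
    return "\t".join(fields)


if __name__ == "__main__":
    record = "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:0"
    print(slim_sam_line(record))
```

Both reductions are behind flags because the issue describes them as optional. Note that dropping every attribute also removes read-group tags, which may or may not be acceptable.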

Status: To Do