Releases: ACEnglish/kanpig

v2.0.2

30 Jan 14:41

Minor patch to fix PS tags when haplotagged reads aren't provided.

Full Changelog: v2.0.1...v2.0.2

v2.0.1

27 Jan 18:11

Minor patch of the dependencies so that bioconda can build/distribute.

Full Changelog: v2.0.0...v2.0.1

v2.0.0

26 Jan 17:36

Expanded joint genotyping capabilities and enhanced flexibility for structural variant analysis.

New Features

Joint Genotyping Modes

  • Added kanpig trio command for joint genotyping of structural variants in parent-child trios
  • Added kanpig mosaic command for joint genotyping of low variant allele frequency (VAF) structural variants

Enhanced Input/Output Options

  • BAM/CRAM files can now be read directly from remote URLs
  • Symbolic deletion notation (<DEL>) is now supported
  • New optional --rnames flag outputs SVID and supporting read names (requires BAM --reads input)
  • Optional genotype quality (GQ) calibration now available (see gqcalibration/README.md for details)

Python API

  • Python-Rust bindings now expose plup parsing, kmer vectorization, and genotyper math functionality for custom scripting

Improvements

  • Refined mathematical calculations for more informative genotype quality scores and improved genotype calls
  • VCF outputs now include a comment line with kanpig version and command information for better reproducibility
  • Significant code refactoring to improve reusability and facilitate creation of new sub-commands
  • Established functional testing infrastructure in repo_utils/

Changes

  • Removed FORMAT/NE field from output. When phase set (PS) tags are not available from BAM/plup files, the neighborhood ID for short-range phase groups now defaults to the 1-based start position of the upstream-most SV in the neighborhood
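The fallback described above can be sketched as follows. This is a minimal illustration, not kanpig's code: the function name `neighborhood_id` and its arguments are hypothetical, each SV is assumed to be represented by its 1-based VCF POS, and the sketch assumes that an available PS value is used as the ID.

```python
from typing import Optional, Sequence


def neighborhood_id(sv_positions: Sequence[int],
                    phase_set: Optional[int] = None) -> int:
    """Pick an ID for a short-range phase group (hypothetical helper).

    sv_positions: 1-based start positions of the SVs in one neighborhood.
    phase_set:    PS value from haplotagged reads, if one is available.
    """
    # Assumption: when a PS tag is available, it is used as the ID.
    if phase_set is not None:
        return phase_set
    if not sv_positions:
        raise ValueError("a neighborhood needs at least one SV")
    # Fallback: start position of the upstream-most (smallest POS) SV.
    return min(sv_positions)
```

For example, a neighborhood with SVs starting at 1500, 1200, and 1350 and no PS tag would get the ID 1200 under these assumptions.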

Full Updates: https://github.com/ACEnglish/kanpig/wiki/Updates#v200
Full Changelog: v1.1.0...v2.0.0

v1.1.0

04 Apr 18:34

  • New --ab allele balance threshold lowers frequency of assigning erroneous compound heterozygous genotypes
  • Experimental --squish changes gpenalty behavior to prefer simpler paths
  • Fix score's gpenalty
  • FORMAT/NE now uses neighborhood's start position for consistent labeling across samples

Full Changelog: v1.0.2...v1.1.0

v1.0.2

19 Jan 19:37

Minor patch

  • UI tweaks, including cleaner gt parameters
  • Slight efficiency improvements in htslib bam record handling, plus other minor optimizations
  • Fix a bug that sometimes arose in ploidy==1 regions
  • Fix minor bugs around PS and HP tag counting/record keeping

Full Changelog: v1.0.1...v1.0.2

v1.0.1

10 Dec 16:47

Faster, Smaller, More Accurate Genotyping

Kanpig now has sub-commands gt and plup. The new plup command will extract reads and their SVs from a bam file into a small file that's useful for long-term storage of reads.

  • gt improvements:
    • can parse plup files much more quickly (up to 8x) than a bam, although bam parsing is also now ~2x faster.
    • parses PS and HP information from haplotagged reads to increase genotyping accuracy and to record long-range phasing information in the output VCF.
    • now uses kmedoid clustering instead of kmeans, resulting in a modest improvement to genotyping accuracy.
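To illustrate the clustering change in the last item: k-means represents each cluster by a mean vector that may not match any observed read, whereas k-medoids represents it by an actual member. The following is a generic, minimal k-medoids sketch and not kanpig's implementation; the function names, the Manhattan distance, and the naive initialization are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def manhattan(a: Vector, b: Vector) -> float:
    """Sum of absolute coordinate differences between two vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def k_medoids(points: List[Vector], k: int = 2,
              max_iter: int = 20) -> Tuple[List[int], List[int]]:
    """Return (medoid indices, cluster label for each point)."""
    if len(points) < k:
        raise ValueError("need at least k points")
    medoids = list(range(k))  # naive initialization: the first k points
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest medoid.
        labels = [
            min(range(k), key=lambda c: manhattan(p, points[medoids[c]]))
            for p in points
        ]
        # Update step: within each cluster, the new medoid is the member
        # with the smallest total distance to the other members.
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                new_medoids.append(medoids[c])  # keep an empty cluster's medoid
                continue
            best = min(
                members,
                key=lambda i: sum(manhattan(points[i], points[j]) for j in members),
            )
            new_medoids.append(best)
        if new_medoids == medoids:
            break  # converged
        medoids = new_medoids
    return medoids, labels
```

Because a medoid is always one of the input vectors, a single outlier cannot drag the cluster representative to a position that no read supports.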

Full Updates: https://github.com/ACEnglish/kanpig/wiki/Updates#v101
Full Changelog: v0.3.1...v1.0.1

v0.3.1

25 Jun 15:00

Consistency improvements

  • New filtering of haplotypes without paths increases accuracy
  • New path scoring improves accuracy and consistency
  • ZS and SS FORMAT fields replaced by KS reporting the score
  • Requiring reads to span the full variant graph window including --chunksize buffer increases accuracy
  • Exhaustive search of partial haplotypes
  • Slight runtime reduction from avoidance of redundant path searches

Full Changelog: v0.3.0...v0.3.1

v0.3.0

11 Jun 20:06

General improvements

  • ~8% speed increase from less work in the path-searching
  • Partial haplotypes bug fix increases accuracy
  • Fixed SQ and FT fields
  • Dedicated writing thread helps reduce memory usage by preventing a backlog of completed variants while reading
  • Default --out is stdout to allow easier compression/indexing (e.g. kanpig .. | bcftools sort -O z -o out.vcf.gz)
  • IUPAC codes are fixed by kanpig according to the VCF specification (Issue #1)
  • Fixed filtering of symbolic alts and BNDs
  • Argument validation

Full Changelog: v0.2.0...v0.3.0

v0.2.0

21 May 21:24

  • Up to 40% reduction in runtime
  • Hemizygous and sex chromosome aware genotyping with new --ploidy-bed
  • Variants whose alternate alleles are stars (*), monomorphic reference, or BNDs are filtered out
  • Comparing PathScores by their average size and sequence similarity increases accuracy
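The alternate-allele filter in the list above could look roughly like the sketch below. It is an illustration only: the function name `is_skipped_alt` is hypothetical, "stars" is assumed to mean the VCF `*` allele, the reference-only case is assumed to be an ALT of `.`, and BNDs are assumed to be detected by the square brackets of VCF breakend notation.

```python
def is_skipped_alt(alt: str) -> bool:
    """Return True if a VCF ALT string would be skipped (hypothetical helper)."""
    # Star allele: the site is covered by an upstream deletion.
    if alt == "*":
        return True
    # Reference-only record: no alternate allele is given.
    if alt in (".", ""):
        return True
    # Breakend (BND) notation uses square brackets, e.g. "G]17:198982]".
    if "[" in alt or "]" in alt:
        return True
    return False
```

Under these assumptions, a sequence-resolved ALT such as `ACGT` passes, while `*`, `.`, and `G]17:198982]` are skipped.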

Full Changelog: v0.1.2...v0.2.0

v0.1.2

05 May 05:35

  • New optional homopolymer filter doesn't kmerize long homopolymers
  • Improved logging info
  • Correcting GQ field
  • Correcting kmer counting
  • Small speed/memory/io improvements
    • Offloaded annotation work from the single writer thread to the worker threads and switched to a large multiple of the page size for the BufWriter capacity
    • Fewer bam file opens
    • Fewer clone operations
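One possible reading of the homopolymer filter in the first item of this list is sketched below. The release note does not say how long runs are excluded, so this is an assumption: runs longer than a threshold are cut out and only the remaining segments are kmerized. The function name `kmer_counts` and the parameter `max_hom` are hypothetical and do not refer to kanpig options.

```python
from collections import Counter
from itertools import groupby


def kmer_counts(seq: str, k: int = 4, max_hom: int = 0) -> Counter:
    """Count k-mers, skipping homopolymer runs longer than max_hom.

    max_hom == 0 turns the filter off (every k-mer is counted).
    """
    segments, current = [], []
    for base, run in groupby(seq.upper()):
        run_len = len(list(run))
        if max_hom > 0 and run_len > max_hom:
            # Long homopolymer: close the current segment and drop the run.
            segments.append("".join(current))
            current = []
        else:
            current.append(base * run_len)
    segments.append("".join(current))

    counts: Counter = Counter()
    for seg in segments:
        # Segments shorter than k yield an empty range and add nothing.
        for i in range(len(seg) - k + 1):
            counts[seg[i:i + k]] += 1
    return counts
```

With the filter off, an 18-base sequence yields 15 overlapping 4-mers; with `max_hom=5`, a run of eleven `A` bases in the middle is dropped and only the flanking segments contribute.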

Full Changelog: v0.1.0...v0.1.2