Releases: pdimens/harpy

3.2

03 Feb 20:02

New

  • --choose option for harpy view log

Fixed

  • removed custom snakemake logging and updated harpy view log to point to .snakemake/log (fixes #269)
  • all commands now have --help option (fixes #268)

Internal

  • environments are now managed by a HarpyEnvs class that unifies conda/pixi/container handling
  • harpy containers now use smaller pixi environments, one per corresponding conda env
    • resulting in more containers that are significantly smaller, but still version-tagged (e.g. pdimens/harpy:qc_3.2)

Breaking Changes

None

What's Changed

Full Changelog: 3.1...3.2

3.1

16 Oct 18:08

Deprecations

  • harpy convert
  • harpy downsample
  • harpy simulate linkedreads

Changes

  • simplified the rich-click theming
  • bwa-mem2 replaces bwa in align bwa
  • impute now requires a minimum of 5 biallelic SNPs per contig
  • alignment during metassembly no longer outputs unmapped reads or alignments with MAPQ < 10
  • added hidden --force option to metassembly to force athena to run even if the FASTQ/BAM inputs don't pass its internal QC
  • options error borders are now yellow, making it consistent with other errors

Fixes

  • impute workflow has explicit output plot filename declarations to catch errors better
  • github action that builds the container

In progress but not ready yet

  • replace manually hacked progressbars with executor plugin

3.0.1

29 Aug 18:21

New version, new problems

Fixes

  • using --container properly sets the snakemake --sdm flag in config.yaml as a list and not two words
    • apparently that was silently deprecated by snakemake, thanks to UNB for catching it
  • diagnose has a minor printing fix and uses --sdm env-modules to ignore any conda/apptainer nonsense
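
The list-versus-string distinction in the first fix can be illustrated with a minimal sketch. It assumes the profile key is snakemake's long option name, software-deployment-method, and that --container means both conda and apptainer are requested; the function names are hypothetical and this is not Harpy's actual code.

```python
import json


def sdm_setting(container: bool) -> list:
    """Return the deployment methods as a list, one entry per method.

    Assumption: --container requests both conda and apptainer.
    """
    return ["conda", "apptainer"] if container else ["conda"]


def render_profile_line(container: bool) -> str:
    """Render one config.yaml line; a JSON list is also a valid YAML flow sequence."""
    return f"software-deployment-method: {json.dumps(sdm_setting(container))}"


if __name__ == "__main__":
    # A list value, rather than the single string "conda apptainer"
    print(render_profile_line(container=True))
```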

3.0

28 Aug 16:15

The big 3-0 release! It's actually not that big, but there are some breaking changes due to the newest feature:

automatic linked-read type detection

This change makes all kinds of linked reads (haplotagging, tellseq, stlfr) and non-linked reads first-class citizens, so you don't have to specify anything. However, you can forcibly ignore linked-read information using --unlinked, which replaces the former --ignore-bx and --lr-type none. Autodetection scans up to the first 100 records of up to the first 5 input files and stops searching as soon as a format is recognized. Formats cannot be mixed and matched across input files.
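
The scanning strategy described above could look roughly like the following sketch for FASTQ input. The function name, the return values, and especially the barcode regular expressions are assumptions made for illustration (they reflect common conventions for each technology, not anything stated in these notes), so treat it as an approximation rather than Harpy's implementation.

```python
import gzip
import re
from itertools import islice

# Assumed header conventions, checked in this order (hypothetical patterns):
PATTERNS = {
    "haplotagging": re.compile(r"BX:Z:A\d{2}C\d{2}B\d{2}D\d{2}"),  # BX tag in the comment
    "stlfr": re.compile(r"^@\S+#\d+_\d+_\d+"),                     # read ID ends in #1_2_3
    "tellseq": re.compile(r"^@\S+:[ATCGN]+(\s|$)"),                # read ID ends in :ATCG...
}


def detect_lr_type(paths, max_files=5, max_records=100):
    """Return the first recognized linked-read type, or None if nothing matches."""
    for path in paths[:max_files]:
        opener = gzip.open if str(path).endswith(".gz") else open
        with opener(path, "rt") as handle:
            # FASTQ headers are every 4th line, so 100 records span 400 lines
            for header in islice(handle, 0, max_records * 4, 4):
                for name, pattern in PATTERNS.items():
                    if pattern.search(header):
                        return name  # stop as soon as a format is recognized
    return None
```

Returning on the first match keeps the check cheap, which is also why mixed formats across files would go unnoticed by a scan like this one.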

New

  • automatic linked read (or not linked-read) detection
    • this also means it will yell at you if your data isn't formatted correctly
  • internal/hidden commands to locally install dependencies have been exposed as harpy deps
    • this feature is for HPCs with strict/challenging node configurations (e.g. no internet access on worker nodes)
    • deps conda will install all or the specified conda dependencies into .environments
    • deps container will pull the harpy dependency container from dockerhub and convert it into an Apptainer .sif in .environments

Breaking

  • --lr-type and --ignore-bx removed and replaced with --unlinked

Fixes

  • convert fastq no longer double-prints the 2:N:0:ATAGGA reverse-read string

2.8.1

25 Aug 20:33
cd3a05e

When tagging this release, I forgot there was a breaking change with removing deconvolution from harpy qc, so this is technically not a bugfix version as the version number change would imply. So, apologies for that oversight 🙏.

New

  • a check for ARM-based systems to give an adequate error message regarding deconvolution

Breaking

  • can no longer deconvolve in the qc command
    • rationale: deconvolution is an involved and iterative process, it's more harm than good to lump it in here

Internal

  • new is_arm() function to check if a system has ARM architecture and error if it's disallowed
  • quickdeconvolution split into its own conda env (used to be in qc env)
  • reorganized harpy's internals because common/misc.py was becoming too cumbersome
  • common.misc is now common.file_ops, common.system_ops, and common.progress
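
A check like is_arm() can be sketched with the standard library alone. The signature, the allowed and machine parameters, and the error type below are assumptions for illustration; the notes only state that the function checks for ARM and errors when ARM is disallowed.

```python
import platform
from typing import Optional


def is_arm(allowed: bool = True, machine: Optional[str] = None) -> bool:
    """Report whether the machine is ARM; raise if ARM is detected but not allowed.

    `machine` is a hypothetical override used for testing; by default the value
    comes from platform.machine() (e.g. "arm64", "aarch64", "x86_64").
    """
    name = (machine or platform.machine()).lower()
    arm = name.startswith(("arm", "aarch64"))
    if arm and not allowed:
        raise RuntimeError(f"This step is not supported on ARM systems ({name}).")
    return arm
```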

2.8

22 Aug 18:41

Breaking

See the New section below for explanations regarding these breaking changes

  • convert bam no longer exists
  • convert fastq no longer accepts standard as input or output
  • convert fastq no longer accepts FROM positional argument
  • convert standardize no longer exists

New

  • convert standardize-fastq added to convert FASTQs into standard format with optional barcode --style conversion
  • convert standardize-bam added, which absorbs the functionality of now-deprecated convert bam and convert standardize
    • can also do barcode --style conversion
  • convert ... methods now perform linked-read type autodetection

Internal

  • FASTQFile and SAMFile data classes now accept single: bool so they can be easily used for workflows that need only a single file

PRs

Full Changelog: 2.7...2.8

2.7

18 Aug 18:49
29d04c9

Breaking Changes

  • instances where there are both --platform options and --ignore-bx have been consolidated into -L/--lr-type where --lr-type none disables linked-read things (when available)
    • metassembly keeps --ignore-bx
    • align bwa/strobe now accepts -L none instead of --ignore-bx to make things consistent
  • apptainer was removed as a dependency because conda-build does not respect the declaration that it should only be installed on Linux systems (it's a known bug). This made it impossible to install Harpy on macOS, so it's been removed for now.
    • added a check to see if apptainer is available on the system when invoking the --container option, which will inform you if it's not
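
An availability check of this kind can be done with shutil.which, as in the minimal sketch below. The function names and the error message are hypothetical; how Harpy performs the check internally is not described in these notes.

```python
import shutil


def runtime_available(executable: str = "apptainer") -> bool:
    """Return True if the executable can be found on the current PATH."""
    return shutil.which(executable) is not None


def require_runtime(executable: str = "apptainer") -> None:
    """Raise an informative error when the container runtime is missing."""
    if not runtime_available(executable):
        raise RuntimeError(
            f"--container was requested, but '{executable}' was not found on this system."
        )
```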

New

  • the error line or success line Harpy prints to terminal now includes a date and time
  • qc now has a -L/--lr-type option so the barcode stats can be calculated properly for different linked-read types
    • the subsequent barcode report was updated to match the new flexible logic
    • barcode report from qc includes more information on different data formats

Fixes

  • fixes to the output logging again to prevent stalls/infinite loops

Internal

  • shell commands should now print more cleanly in errors, and so should logs
  • quarto logs are formatted to reduce aggressive newlines
  • some code cleanup and renaming

What's Changed

Full Changelog: 2.6.1...2.7

2.6.1

12 Aug 18:49

New

  • nothing that is user-facing
    • although input parameters now do some filetype checks at the argument-parsing step, so some error messages may come sooner or look a little different (ideally you're not seeing any error messages 😅 )

Fixes

  • fixed malformed output of convert bam
  • standardized the use of the stderr rich.Console across things that would use it

Internal

  • adds filetype validations to the command line interface such that it does cursory formatting checks for FASTQ/A, BAM/SAM, VCF/BCF
  • common/parsers.py and common/validations.py have been removed in favor of validation classes in harpy.validation module
    • new classes FASTQ, FASTA, SAM, VCF, ImputeParams, Populations
    • this was actually a pretty significant overhaul
  • where possible, replace subprocess.xxx("bcftools....") with its pysam.bcftools.xxx equivalent and clean up all the excess string processing made redundant by the change

PRs

Full Changelog: 2.6.0...2.6.1

2.6.0

07 Aug 19:16

Breaking

  • align ema has been removed due to this catastrophic issue
  • utility scripts provided as python entrypoints, so they are now called from the command line without their .py suffix
  • workflows that have a --platform option now use -P (uppercase) as the short-flag
  • phase workflow config has minor restructuring
  • phase --prune-threshold default is now 30 (up from 7)

New

  • the shell part of error logs now prints quite nicely
  • if snakemake fails while creating a conda environment, harpy now runs some checks and prints more informative output
  • guardrails when trying to use --container on a macOS system
  • phase now adds these options:
    • --platform/-P : linked-read barcode type (for filtering out invalids)
    • --min-map-qual: minimum alignment MQ score to consider alignment for phasing
    • --min-base-qual: minimum base quality score to consider for haplotype fragment inclusion

Fixes

  • reworked the phase workflow to fix #247, which also:
    • simplifies some things
    • filters out invalid barcodes from extractHAIRS
  • align now includes the sanitation flag in samtools fixmate
    • samtools filtering step moved to mark duplicates process to avoid remove-a-mate errors
  • phase --prune-threshold no longer divides by 100 to create a percentage

Internal

  • the /bin directory has been renamed to /scripts
  • all harpy commands (e.g. qc) have been moved to harpy/commands to make navigating the project easier

PRs

Full Changelog: 2.5.0...2.6.0

2.5.0

30 Jul 04:09

This release is mostly an internal refactor/improvement. The only noticeable user-facing change is that harpy template hpc-* no longer requires conda on your system and will work as intended whether or not conda is installed.

Internal

  • workflows now use a Workflow class to simplify and standardize workflow processes (means less dev overhead and repetition)

Fixes

  • Corrected tooltips and typos in imputation reports
  • template hpc-* uses the conda and pip APIs (rather than subprocess) to assess if executor plugins are installed
    • no longer creates errors for non conda-based installations
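
Checking for an installed plugin without shelling out can be approximated with importlib.metadata, as sketched below. This is a stand-in technique, not the conda and pip APIs the fix refers to, and the function name is hypothetical; the only convention assumed is that executor plugins are distributed as snakemake-executor-plugin-<name>.

```python
from importlib import metadata


def plugin_installed(executor: str) -> bool:
    """Return True if the snakemake executor plugin is installed in this environment."""
    try:
        metadata.version(f"snakemake-executor-plugin-{executor}")
    except metadata.PackageNotFoundError:
        return False
    return True


if __name__ == "__main__":
    # Works the same in conda, pip, or pixi environments because it only
    # inspects the metadata of the running Python environment.
    print(plugin_installed("slurm"))
```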

New Features

  • the notice text that explains how to install a given snakemake executor plugin tries to guess whether you'd be installing it via conda, pip, or pixi

Breaking Changes

  • None