Releases: pdimens/harpy
3.2
New
- --choose option for harpy view log
Fixed
- removed custom snakemake logging and updated harpy view log to point to .snakemake/log (fixes #269)
- all commands now have a --help option (fixes #268)
Internal
- conda environments now have a HarpyEnvs class that unifies conda/pixi/container things
- harpy containers now use smaller pixi environments, one per corresponding conda env
  - resulting in more containers that are significantly smaller, but still version-tagged (e.g. pdimens/harpy:qc_3.2)
Breaking Changes
None
What's Changed
Full Changelog: 3.1...3.2
3.1
deprecations
- harpy convert
- harpy downsample
- harpy simulate linkedreads
changes
- simplified the rich-click theming
- bwa-mem2 replaces bwa in align bwa
- impute needs a minimum of 5 biallelic snps per contig
- alignment during metassembly no longer outputs unmapped reads or alignments with mapq < 10
- added hidden --force option to metassembly to force athena to run even if fq/bam don't pass its internal qc
- options error borders are now yellow, making them consistent with other errors
fixes
- impute workflow has explicit output plot filename declarations to catch errors better
- github action that builds the container
In progress but not ready yet
- replace manually hacked progressbars with executor plugin
3.0.1
New version, new problems
Fixes
- using --container properly sets the snakemake --sdm flag in config.yaml as a list and not two words
  - apparently that was silently deprecated by snakemake, thanks to UNB for catching it
- diagnose has a minor printing fix and uses --sdm env-modules to ignore any conda/apptainer nonsense
3.0
The big 3-0 release! It's actually not that big, but there are some breaking changes due to the newest feature:
automatic linked-read type detection
This change makes all kinds of linked reads (haplotagging, tellseq, stlfr) and non-linked reads first-class citizens, so you don't have to specify anything. However, you can forcibly ignore linked-read information using --unlinked, which takes the place of what was formerly --ignore-bx and --lr-type none. Autodetection scans up to the first 100 records of up to the first 5 input files and stops searching immediately after a format is recognized. Formats cannot be mixed and matched across input files.
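To make the scanning limits concrete, here is a minimal sketch of that kind of autodetection. It is not Harpy's actual code: the function name detect_lr_type and the barcode patterns are assumptions for illustration, and the input is simplified to lists of FASTQ header lines rather than real files.

```python
import re
from itertools import islice
from typing import Iterable, Optional

# Hypothetical patterns for illustration; Harpy's real detection rules may differ.
PATTERNS = {
    "haplotagging": re.compile(r"BX:Z:A\d{2}C\d{2}B\d{2}D\d{2}"),  # BX tag with AxxCxxBxxDxx
    "stlfr": re.compile(r"#\d+_\d+_\d+"),                          # read name ending in #1_2_3
    "tellseq": re.compile(r":[ACGTN]{18}(\s|$)"),                  # read name ending in :<18 bases>
}

MAX_FILES = 5      # scan at most the first 5 inputs
MAX_RECORDS = 100  # scan at most the first 100 records of each input


def detect_lr_type(files: Iterable[Iterable[str]]) -> Optional[str]:
    """Return the first recognized linked-read type, or None if nothing matches.

    Each element of `files` is an iterable of FASTQ header lines.
    """
    for headers in islice(files, MAX_FILES):
        for header in islice(headers, MAX_RECORDS):
            for name, pattern in PATTERNS.items():
                if pattern.search(header):
                    # stop immediately once a format is recognized
                    return name
    return None
```

Returning None here stands in for "treat the data as non-linked reads"; a barcode that first appears after the 100th record or in a 6th file is never seen by this sketch.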
New
- automatic linked read (or not linked-read) detection
  - this also means it will yell at you if your data isn't formatted correctly
- internal/hidden commands to locally install dependencies have been exposed as harpy deps
  - this feature is for HPCs with strict/challenging node configurations (e.g. no internet access on worker nodes)
  - deps conda will install all or the specified conda dependencies into .environments
  - deps container will pull the harpy dependency container from dockerhub and convert it into an Apptainer .sif in .environments
Breaking
- --lr-type and --ignore-bx removed and replaced with --unlinked
Fixes
- convert fastq no longer double-prints the 2:N:0:ATAGGA reverse-read string
2.8.1
When tagging this release, I forgot there was a breaking change with removing deconvolution from harpy qc, so this is technically not a bugfix version as the version number change would imply. So, apologies for that oversight 🙏.
New
- a check for ARM-based systems to give an adequate error message regarding deconvolution
Breaking
- can no longer deconvolve in the qc command
  - rationale: deconvolution is an involved and iterative process, it's more harm than good to lump it in here
Internal
- new is_arm() function to check if a system has ARM architecture and error if it's disallowed
- quickdeconvolution split into its own conda env (used to be in qc env)
- reorganized harpy's internals because common/misc.py was becoming too cumbersome
  - common.misc is now common.file_ops, common.system_ops, and common.progress
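For readers curious what an ARM check like this can look like, here is a minimal sketch. It is an assumption about the approach, not the body of Harpy's is_arm(); the optional machine parameter and the require_non_arm helper are hypothetical and exist only to keep the example testable.

```python
import platform
from typing import Optional


def is_arm(machine: Optional[str] = None) -> bool:
    """Return True if the machine string looks like an ARM architecture."""
    # platform.machine() typically reports e.g. "x86_64", "arm64" (macOS) or "aarch64" (Linux)
    name = (machine if machine is not None else platform.machine()).lower()
    return name.startswith(("arm", "aarch"))


def require_non_arm(feature: str, machine: Optional[str] = None) -> None:
    """Hypothetical guard: raise an error if `feature` is requested on ARM."""
    if is_arm(machine):
        raise RuntimeError(f"{feature} is not available on ARM-based systems.")
```

Calling require_non_arm("deconvolution") before starting a workflow would fail early with a readable message instead of a cryptic dependency error later on.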
2.8
Breaking
See the New section below for explanations regarding these breaking changes
- convert bam no longer exists
- convert fastq no longer accepts standard as input or output
- convert fastq no longer accepts FROM positional argument
- convert standardize no longer exists
New
- convert standardize-fastq added to convert FASTQs into standard format with optional barcode --style conversion
- convert standardize-bam added, which absorbs the functionality of now-deprecated convert bam and convert standardize
  - can also do barcode --style conversion
- convert ... methods now perform linked-read type autodetection
Internal
- FASTQFile and SAMFile data classes now accept single: bool so they can be easily used for workflows that need only a single file
PRs
Full Changelog: 2.7...2.8
2.7
Breaking Changes
- instances where there are both --platform options and --ignore-bx have been consolidated into -L/--lr-type where --lr-type none disables linked-read things (when available)
  - metassembly keeps --ignore-bx
  - align bwa/strobe now accepts -L none instead of --ignore-bx to make things consistent
- apptainer was removed as a dependency due to conda-build not respecting the declaration that it should only try to install it on linux systems (it's a known bug). This created issues where one could not install Harpy on a macOS. So, it's been removed for now
  - added a check to see if apptainer is available on the system when invoking the --container option, which will inform you if it's not
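A check like the apptainer one can be as small as a PATH lookup. The sketch below is an assumption about how such a check might be written, not Harpy's implementation; both function names and the injectable which parameter are hypothetical.

```python
import shutil
from typing import Callable, Optional


def apptainer_available(which: Callable[[str], Optional[str]] = shutil.which) -> bool:
    """Return True if an `apptainer` executable can be found on PATH."""
    return which("apptainer") is not None


def check_container_option(
    use_container: bool,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> None:
    """Hypothetical guard: complain early if --container is used without apptainer."""
    if use_container and not apptainer_available(which):
        raise RuntimeError(
            "--container was requested, but apptainer was not found on this system. "
            "Install apptainer or run without --container."
        )
```

Passing the lookup function in as an argument keeps the check easy to test without needing apptainer installed.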
New
- the error line or success line Harpy prints to terminal now includes a date and time
- qc now has a -L/--lr-type option so the barcode stats can be calculated properly for different linked-read types
  - the subsequent barcode report was updated to match the new flexible logic
- barcode report from qc includes more information on different data formats
Fixes
- fixes to the output logging again to prevent stalls/infinite loops
Internal
- Shell commands should be printed better and fancier now in errors, so should logs.
- quarto logs are formatted to reduce aggressive newlines
- some code cleanup and renaming
What's Changed
- consolidate print_error and print_solution* by @pdimens in #250
- diversify barcodes by @pdimens in #251
Full Changelog: 2.6.1...2.7
2.6.1
New
- nothing that is user-facing
- although input parameters now do some filetype checks at the argument-parsing step, so some error messages may come sooner or look a little different (ideally you're not seeing any error messages 😅 )
Fixes
- fixed malformed output of convert bam
- standardized the use of the stderr rich.Console across things that would use it
Internal
- adds filetype validations to the command line interface such that it does cursory formatting checks for FASTQ/A, BAM/SAM, VCF/BCF
- common/parsers.py and common/validations.py have been removed in favor of validation classes in the harpy.validation module
  - new classes FASTQ, FASTA, SAM, VCF, ImputeParams, Populations
  - this was actually a pretty significant overhaul
- where possible, replace subprocess.xxx("bcftools....") with its pysam.bcftools.xxx equivalent and clean up all the excess string processing made redundant by the change
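As an illustration of what "cursory formatting checks" can mean, here is a minimal sketch that peeks at the first few bytes of a file. It is not the code in harpy.validation: guess_format and read_head are hypothetical names, and the rules are deliberately shallow (for example, a headerless SAM file would come back as "unknown").

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"
SAM_HEADER_TAGS = (b"@HD\t", b"@SQ\t", b"@RG\t", b"@PG\t", b"@CO\t")


def read_head(path, n: int = 16) -> bytes:
    """Read the first n bytes of a file, transparently handling gzip/BGZF."""
    with open(path, "rb") as fh:
        compressed = fh.read(2) == GZIP_MAGIC
    opener = gzip.open if compressed else open
    with opener(path, "rb") as fh:
        return fh.read(n)


def guess_format(path) -> str:
    """Guess a file's format from its leading bytes (cursory check only)."""
    head = read_head(path)
    if head.startswith(b"BAM\x01"):
        return "BAM"
    if head.startswith(b"BCF"):
        return "BCF"
    if head.startswith(b"##fileformat=VCF"):
        return "VCF"
    # SAM headers and FASTQ records both start with "@", so test SAM first
    if head.startswith(SAM_HEADER_TAGS):
        return "SAM"
    if head.startswith(b"@"):
        return "FASTQ"
    if head.startswith(b">"):
        return "FASTA"
    return "unknown"
```

Running a check like this at argument-parsing time is what lets an error surface before any workflow starts, which matches the "error messages may come sooner" note above.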
PRs
Full Changelog: 2.6.0...2.6.1
2.6.0
Breaking
- align ema has been removed due to this catastrophic issue
- utility scripts provided as python entrypoints, so they are now called from the command line without their .py suffix
- workflows that have a --platform option now use -P (uppercase) as the short-flag
- phase workflow config has minor restructuring
- phase --prune-threshold default is now 30 (up from 7)
New
- the shell part of error logs now prints quite nicely
- if there is an error due to conda environment creation by snakemake, does some checks and prints more informative output
- guardrails when trying to use --container on a macOS system
- phase now adds these options:
  - --platform/-P: linked-read barcode type (for filtering out invalids)
  - --min-map-qual: minimum alignment MQ score to consider alignment for phasing
  - --min-base-qual: minimum base quality score to consider for haplotype fragment inclusion
Fixes
- reworks the phase workflow such that it fixes #247,
  - simplifies some things
  - filters out invalid barcodes from extractHAIRS
- align now includes the sanitation flag in samtools fixmate,
  - samtools filtering step moved to mark duplicates process to avoid remove-a-mate errors
- phase --prune-threshold no longer divides by 100 to create a percentage
Internal
- the /bin directory has also been renamed /scripts
- all harpy commands (e.g. qc) have been moved to harpy/commands to make navigating the project easier
PRs
- sanitize align by @pdimens in #245
- swap manual script installs for script entrypoints by @pdimens in #246
Full Changelog: 2.5.0...2.6.0
2.5.0
This release is mostly an internal refactor/improvement. The only noticeable user-facing change is that harpy template hpc-* no longer requires conda on your system and will work as intended whether or not conda is installed.
Internal
- workflows now use a Workflow class to simplify and standardize workflow processes (means less dev overhead and repetition)
Fixes
- Corrected tooltips and typos in imputation reports
- template hpc-* uses the conda and pip APIs (rather than subprocess) to assess if executor plugins are installed
  - no longer creates errors for non conda-based installations
New Features
- the notice text that explains how to install a given snakemake executor plugin tries to guess whether you'd be installing it via conda, pip, or pixi
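One way such a guess could work is by looking at environment variables. The sketch below is only that, a sketch: guess_installer and install_hint are hypothetical names, and it assumes that pixi environments expose PIXI_* variables and conda environments set CONDA_PREFIX. Harpy's actual heuristic may be different.

```python
import os
from typing import Mapping, Optional


def guess_installer(env: Optional[Mapping[str, str]] = None) -> str:
    """Guess whether the current environment is managed by pixi, conda, or pip."""
    env = os.environ if env is None else env
    # Assumption: pixi environments expose PIXI_* variables and may also set
    # CONDA_PREFIX, so pixi is checked before conda.
    if any(key.startswith("PIXI_") for key in env):
        return "pixi"
    if env.get("CONDA_PREFIX"):
        return "conda"
    return "pip"


def install_hint(plugin: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Build a suggested install command for a snakemake executor plugin."""
    package = f"snakemake-executor-plugin-{plugin}"
    commands = {
        "pixi": f"pixi add {package}",
        "conda": f"conda install bioconda::{package}",
        "pip": f"pip install {package}",
    }
    return commands[guess_installer(env)]
```

For example, install_hint("slurm") in a plain virtualenv would suggest the pip command, while the same call inside an activated conda environment would suggest the conda one.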
Breaking Changes
- None