- major changes to the command-line API
- the new API now uses a subcommand system: `pyani <cmd> <arguments>`
- major changes to result storage
- results are now stored in an SQLite database, rather than reported to `.tab` files
- enables reuse of previously calculated results in new analyses
- enables generation of multiple output files from the same analysis, after the analysis is complete
- more output formats
- tabular output is available as HTML tables, as well as plain text
- new documentation
- documentation is now available at ReadTheDocs
- papers citing or referring to `pyani` are now listed
- `pyani download` (replacing `genbank_get_genomes_by_taxon.py`) allows use of an NCBI API key for faster/more stable downloads
- `genbank_get_genomes_by_taxon.py` label/class file output updated to include a hash, matching the `pyani download` output format
- `pyani plot` now produces distribution plots in addition to heatmaps
- A number of bug fixes were implemented, including:
- consistent handling of filenames as `Path`s
- alignment length calculation for ANIm was corrected
- major refactoring of code ported from v0.2
- static typing implemented
- testing converted to `pytest` conventions from `unittest`
- update legacy BLAST download location in TravisCI
- update concordance tests (issue #105)
- extend test suites (issue #104)
- modify ANIm concordance test to accommodate new command structure
- add `delta-filter` wrapper for compatibility with SGE/OGE schedulers
- Fix for issue #97 where numeric arguments to the GenBank download script were not recognised
- GenBank download script now insists on integer input for `--batchsize`, `--retries`, and `--timeout`
- Added `setup.cfg` that points to README.md
- Fix issue #97 where valid input arguments were not recognised in the download script
- Add Dockerfiles for making Docker images
- `ANIm` now uses `delta-filter` to remove alignments of repeat regions (issue #91)
- added `--filter_exe` option to specify location of `delta-filter` utility (issue #91)
- fixed `--format` option so that GenBank downloads work again (issue #89)
- add `--SGEargs` option to `average_nucleotide_identity.py` for custom qsub settings
- `README.md` badges now clickable
- `--version` switch added to `average_nucleotide_identity.py`
- FTP timeouts are now caught differently in `genbank_get_genomes_by_taxon.py`
- Additional characters in NCBI FTP URIs now escaped in `genbank_get_genomes_by_taxon.py` - should be fewer failed downloads
- Modified error messaging when `NUCmer` alignment fails
- `average_nucleotide_identity.py` argument documentation improvements
- Script now fails immediately if label or class files missing (issue #78)
- Changes to `--noclobber` log behaviour (issue #79)
- fixed `--rerender` code (issue #85)
- fixes a bug in the installed scripts where the shebang (`#!`) in wheel and egg packages pointed to a development Python
- fix for issue #53 (--maxmatch has no effect)
- fix to `genbank_get_genomes_by_taxon.py` to account for NCBI FTP location changes
- fixed issue #52 (local variable bug)
- fixed issues #49 (TETRA failure) and #51 (matplotlib bug)
- add several tests and support for `codecov.io`, `landscape.io` and `Travis-CI`
- removed requirement for `rpy2`
- moved scripts to `bin/` subdirectory
- `pyani` now requires `rpy2` v2.8.0 in order to satisfy running under Anaconda (see issue #26)
- `pyani` now checks for the presence of `rpy2`. When run from source, if `rpy2` is not available, `pyani` doesn't throw an error until R graphical output is requested. If installed via `pip`, `pyani` still raises `pkg_resources.DistributionNotFound` if `rpy2` is missing.
- Updated `genbank_get_genomes_by_taxon.py` script to use the new FTP locations at NCBI for each assembly.
- Fixed bug where `ANIb` would not go to completion if empty BLASTN files were generated (see issue #27)
- Fixed bug where `ANIm` would not finish under `multiprocessing` if input sequences were highly divergent.
- Added Hadamard product of percentage identity and alignment coverage as output.
- Fixed bug where label/classes are out of sync with new NCBI downloaded filenames
- Added --rerender option to draw (new) graphics from old output, without recalculation
- Corrected matplotlib row dendrogram orientation
- Seaborn output no longer dumps core on large (ca. 500 genome) datasets
- `genbank_get_genomes_by_taxon.py` attempts to identify cause for failed downloads and correct, where nomenclature/versions are at fault
- graceful replacement of classes that are not present in `classes.txt`
- add `pyani` version to log file
- Merged pull request from peterjc to make printing from tests Python3-friendly.
- Merged pull request from peterjc to use `open()` for opening files.
- Merged pull request from peterjc to cope with missing labels/classes more gracefully
- Fixed `-s`/`--fragsize` option in `average_nucleotide_identity.py` (thanks to Joseph Adelskov for the report).
- BLAST and `nucmer` results are now written to a subdirectory of the output folder. By default, these sequence search output files are compressed, but this behaviour can be suppressed using the `--nocompress` option.
- Added `genbank_get_genomes_by_taxon.py` as an aid to downloading publicly-available genome files from GenBank, for analysis.
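
One entry above mentions a Hadamard product of percentage identity and alignment coverage. As a rough illustration only, a Hadamard product is the element-wise product of two matrices of the same shape. The sketch below is not `pyani`'s own code: the function name `hadamard`, the use of nested lists, and the example values are all assumptions made for illustration.

```python
from typing import List

Matrix = List[List[float]]


def hadamard(identity: Matrix, coverage: Matrix) -> Matrix:
    """Return the element-wise (Hadamard) product of two same-shaped matrices.

    Hypothetical helper for illustration; not part of pyani's API.
    """
    same_shape = len(identity) == len(coverage) and all(
        len(id_row) == len(cov_row) for id_row, cov_row in zip(identity, coverage)
    )
    if not same_shape:
        raise ValueError("matrices must have the same shape")
    # Multiply each identity value by the coverage value in the same cell
    return [
        [ident * cov for ident, cov in zip(id_row, cov_row)]
        for id_row, cov_row in zip(identity, coverage)
    ]


# Made-up values for two genomes, expressed as fractions in [0, 1]
identity = [[1.0, 0.5], [0.5, 1.0]]
coverage = [[1.0, 0.5], [0.25, 1.0]]
product = hadamard(identity, coverage)
```

Each cell of `product` is low when either the identity or the coverage for that pair of genomes is low, so a single matrix reflects both measures.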