Releases · Ecogenomics/GTDBTk

23 Mar 03:48

pchaumeil

2.2.6

890835f

2.2.6

Bug Fixes:

(#493) Fix issue with --full-tree flag (related to skipping ANI steps)

Minor changes:

Change URL for documentation to 'https://ecogenomics.github.io/GTDBTk/installing/index.html'
Improve portability of the ANI_screen step by regenerating the paths of reference genomes in the current filesystem for mash_db.msh

16 Mar 04:11

pchaumeil

2.2.5

9ec7071

2.2.5

Bug Fixes:

gtdbtk.json is now reset when the pipeline is re-run and the status of ani_screen is not 'complete'

Minor changes:

When using --genes, ANI steps are skipped and a warning informs the user that classification is less accurate.
(#486) Environment variables can be used in GTDBTK_DATA_PATH
is_consistent function in mash.py compares only the filenames, not the full paths
Add cutoff arguments to PfamScan (thanks to @AroneyS for the contribution)
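
Two of the items above (environment variables in GTDBTK_DATA_PATH, and comparing filenames rather than full paths) can be illustrated with a minimal Python sketch. The helper names resolve_data_path and same_filenames are hypothetical and are not taken from the GTDB-Tk source; the real is_consistent function in mash.py may work differently.

```python
import os


def resolve_data_path(raw_path: str) -> str:
    """Expand $VARS and ~ in a reference-data path (hypothetical helper)."""
    return os.path.expanduser(os.path.expandvars(raw_path))


def same_filenames(paths_a, paths_b) -> bool:
    """Compare two path collections by file name only, ignoring directories.

    Assumption: this mirrors the idea behind the is_consistent change,
    not its actual implementation.
    """
    names_a = sorted(os.path.basename(p) for p in paths_a)
    names_b = sorted(os.path.basename(p) for p in paths_b)
    return names_a == names_b
```

Comparing by file name means a reference database that has been moved to another directory, or mounted at a different location, is still treated as the same set of genomes.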

Contributors

AroneyS

28 Feb 00:38

pchaumeil

2.2.4

8b5b95d

2.2.4

Bug Fixes:

(#475) If all genomes are classified using ANI, GTDB-Tk will skip the identify and align steps

Minor changes:

Add hidden '--skip_pplacer' flag to skip the pplacer step (useful for debugging)
Improve documentation
Convert stage_logger to a Singleton class
Use existing ANI results if available
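
For readers unfamiliar with the Singleton pattern mentioned above, here is a minimal Python sketch of the general idea. The class name StageLoggerSketch and its record method are hypothetical; this is not the stage_logger code from GTDB-Tk, only one common way to make every caller share a single instance.

```python
class StageLoggerSketch:
    """Illustrative singleton: every instantiation returns the same object."""

    _instance = None

    def __new__(cls):
        # Create the shared instance on first use, then keep returning it.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.steps = []
        return cls._instance

    def record(self, step_name: str) -> None:
        """Append a completed pipeline step to the shared list."""
        self.steps.append(step_name)
```

With this shape, a step recorded in one module is visible from any other module that instantiates the class, without passing the logger around explicitly.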

15 Feb 04:20

pchaumeil

2.2.3

476a4da

2.2.3

Bug Fixes:

Fix prodigal_fail_counter issue

14 Feb 21:00

pchaumeil

2.2.2

95b1358

2.2.2

Bug Fix:

(#471): Fix pplacer issue

14 Feb 11:37

aaronmussig

2.2.1

45bbba9

2.2.1

(#470) Add missing Pydantic dependency.

14 Feb 00:33

pchaumeil

2.2.0

3d7e936

2.2.0

Minor changes:

(#433) Added additional checks to ensure that the --outgroup_taxon cannot be set to a domain (root, de_novo_wf).
(#459/#462) Fix deprecated np.bool in prodigal_biolib.py. Special thanks to @neoformit for his contribution.
(#466) RED value is now rounded to 5 decimal places.
(#451) Extra checks have been added when Prodigal fails.
(#448) Warning has been added when all the genomes are filtered out and not classified.

Bug Fixes:

(#420) Fixed an issue where GTDB-Tk might hang when classifying TIGRFAM markers (identify, classify_wf, de_novo_wf). Special thanks to @lfenske-93 and @sjaenick for their contribution.
(#428) Fixed an issue where the --gtdbtk_classification_file would raise an error trying to read the classify summary (root, de_novo_wf).
(#439) Fix the pipeline when using protein files instead of nucleotide files. The symlink now uses an absolute path.

Contributors

sjaenick, neoformit, and lfenske-93

11 Jul 05:11

pchaumeil

2.1.1

3656d71

2.1.1

(#399) Fix --genes options
(#400) Modify config.py file to resolve this issue
Updated documentation (including #410, documentation for iTOL)

12 May 03:23

pchaumeil

2.1.0

185cebc

2.1.0

Major changes:

GTDB-Tk now uses a divide-and-conquer approach where the bacterial reference tree is split into multiple class-level subtrees. This reduces the memory requirements of GTDB-Tk from 320 GB of RAM when using the full GTDB R07-RS207 reference tree to approximately 55 GB. A manuscript describing this approach is in preparation. If you wish to continue using the full GTDB reference tree, use the --full-tree flag. This is the main change from v2.0.0: the split tree approach has been modified from order-level trees to class-level trees to resolve specific classification issues (see #383).
Genomes that cannot be assigned to a domain (e.g. genomes with no bacterial or archaeal markers or genomes with no genes called by Prodigal) are now reported in the gtdbtk.bac120.summary.tsv as 'Unclassified'
Genomes filtered out during the alignment step are now reported in the gtdbtk.bac120.summary.tsv or gtdbtk.ar53.summary.tsv as 'Unclassified Bacteria/Archaea'
--write_single_copy_genes flag is now available in the classify_wf and de_novo_wf workflows.
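
Because unassigned and filtered genomes now appear in the summary files, they can be listed with a few lines of Python. The sketch below assumes the summary is tab-separated with columns named user_genome and classification, and that affected rows have a classification beginning with 'Unclassified'; check these assumptions against your own output. The function name unclassified_genomes is hypothetical.

```python
import csv


def unclassified_genomes(tsv_handle):
    """Return genome IDs whose classification starts with 'Unclassified'.

    Assumes a tab-separated summary with 'user_genome' and
    'classification' columns (an assumption about the file layout).
    """
    reader = csv.DictReader(tsv_handle, delimiter="\t")
    return [
        row["user_genome"]
        for row in reader
        if row["classification"].startswith("Unclassified")
    ]
```

A typical call would open gtdbtk.bac120.summary.tsv in text mode and pass the file handle to the function.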

Features:

(#392) --write_single_copy_genes flag available in workflows.
(#387) specific memory requirements set in classify_wf depending on the classification approach.

Important

This version is not backwards compatible with GTDB package R207 v1.
This version requires a new reference package

08 Apr 01:54

aaronmussig

2.0.0

7863333

2.0.0

Major changes:

GTDB-Tk now uses a divide-and-conquer approach where the bacterial reference tree is split into multiple order-level subtrees. This reduces the memory requirements of GTDB-Tk from 320 GB of RAM when using the full GTDB R07-RS207 reference tree to approximately 35 GB. A manuscript describing this approach is in preparation. If you wish to continue using the full GTDB reference tree, use the --full-tree flag.
Archaeal classification now uses a refined set of 53 archaeal-specific marker genes based on the recent publication by Dombrowski et al., 2020. This set of archaeal marker genes is now used by GTDB for curating the archaeal taxonomy.
All directories containing intermediate results are now removed by default at the end of the classify_wf and de_novo_wf pipelines. If you wish to retain these intermediate files, use the --keep_intermediates flag.
All MSA files produced by the align step are now compressed with gzip.
The classification summary and failed genomes files are now the only files linked in the root directory of classify_wf.
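
Since MSA files are now gzip-compressed, downstream scripts that read them as plain text need a small change. A minimal sketch using only the Python standard library follows; the function name read_fasta_gz is hypothetical, and the parser assumes a simple FASTA layout with '>' header lines.

```python
import gzip


def read_fasta_gz(path):
    """Return {header: sequence} from a gzip-compressed FASTA file."""
    records = {}
    header = None
    # "rt" opens the compressed file in text mode, decompressing on the fly.
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                header = line[1:]
                records[header] = ""
            elif header is not None:
                # Sequences may be wrapped over several lines; join them.
                records[header] += line
    return records
```

Tools that accept only uncompressed input can still be used by decompressing the file first, for example with gunzip.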

Features:

convert_to_itol to convert trees into iTOL format (#373)
Output FASTA files are compressed by default (#369)
Intermediate files will be removed by default when using classify/de-novo workflows unless specified by --keep_intermediates (#369)
Add --genes flag for Error (#362)
A warning will be displayed if pplacer fails to place a genome (#360 / #356)

Important

This version is not backwards compatible with GTDB release 202.
This version requires a new reference package
