2.1.0

@pchaumeil pchaumeil released this 12 May 03:23
· 255 commits to master since this release

Major changes:

  • GTDB-Tk now uses a divide-and-conquer approach in which the bacterial reference tree is split into multiple class-level subtrees. This reduces the memory requirements of GTDB-Tk from 320 GB of RAM when using the full GTDB R07-RS207 reference tree to approximately 55 GB. A manuscript describing this approach is in preparation. If you wish to continue using the full GTDB reference tree, use the --full_tree flag. The main change from v2.0.0 is that the split-tree approach has been modified from order-level trees to class-level trees to resolve specific classification issues (see #383).
  • Genomes that cannot be assigned to a domain (e.g. genomes with no bacterial or archaeal markers, or genomes with no genes called by Prodigal) are now reported in gtdbtk.bac120.summary.tsv as 'Unclassified'.
  • Genomes filtered out during the alignment step are now reported in gtdbtk.bac120.summary.tsv or gtdbtk.ar53.summary.tsv as 'Unclassified Bacteria/Archaea'.
  • The --write_single_copy_genes flag is now available in the classify_wf and de_novo_wf workflows.
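
Because unassigned and filtered genomes now appear in the summary files instead of being dropped, downstream scripts may want to pick them out explicitly. The following is a minimal sketch, not part of GTDB-Tk itself: the helper name find_unclassified is hypothetical, the sample rows are made up, and it assumes the summary file is tab-separated with user_genome and classification columns and that the label strings match the ones quoted above.

```python
import csv
import io
from typing import Dict, TextIO


def find_unclassified(summary: TextIO) -> Dict[str, str]:
    """Map genome ID to label for rows whose classification starts with 'Unclassified'.

    Hypothetical helper; assumes 'user_genome' and 'classification' columns.
    """
    reader = csv.DictReader(summary, delimiter="\t")
    return {
        row["user_genome"]: row["classification"]
        for row in reader
        if row["classification"].startswith("Unclassified")
    }


# Made-up sample rows; a real summary file contains many more columns.
sample = io.StringIO(
    "user_genome\tclassification\n"
    "genome_a\td__Bacteria;p__Proteobacteria\n"
    "genome_b\tUnclassified\n"
    "genome_c\tUnclassified Bacteria\n"
)

unclassified = find_unclassified(sample)
for genome, label in unclassified.items():
    print(f"{genome}\t{label}")
```

To run this against real output, pass an open file handle, for example open("gtdbtk.bac120.summary.tsv", newline=""), in place of the in-memory sample.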

Features:

  • (#392) --write_single_copy_genes flag available in workflows.
  • (#387) specific memory requirements set in classify_wf depending on the classification approach.

Important

This version is not backwards compatible with GTDB package R207 v1.
This version requires a new reference package.