2.1.0
Major changes:
- GTDB-Tk now uses a divide-and-conquer approach where the bacterial reference tree is split into multiple class-level subtrees. This reduces the memory requirements of GTDB-Tk from 320 GB of RAM when using the full GTDB R07-RS207 reference tree to approximately 55 GB. A manuscript describing this approach is in preparation. If you wish to continue using the full GTDB reference tree, use the --full-tree flag. This is the main change from v2.0.0. The split tree approach has been modified from order-level trees to class-level trees to resolve specific classification issues (see #383).
- Genomes that cannot be assigned to a domain (e.g. genomes with no bacterial or archaeal markers, or genomes with no genes called by Prodigal) are now reported in gtdbtk.bac120.summary.tsv as 'Unclassified'.
- Genomes filtered out during the alignment step are now reported in gtdbtk.bac120.summary.tsv or gtdbtk.ar53.summary.tsv as 'Unclassified Bacteria/Archaea'.
- The --write_single_copy_genes flag is now available in the classify_wf and de_novo_wf workflows.
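Because unclassified genomes now appear in the summary files rather than being dropped, downstream scripts may want to separate them out. The following is a minimal sketch, assuming the summary file is tab-separated with `user_genome` and `classification` columns; the helper name `unclassified_genomes` and the example rows are made up for illustration and are not real GTDB-Tk output.

```python
import csv
import io


def unclassified_genomes(tsv_text):
    """Return (genome, classification) pairs whose classification
    starts with 'Unclassified'.

    Assumes a tab-separated table with 'user_genome' and
    'classification' columns (an assumption about the summary layout).
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        (row["user_genome"], row["classification"])
        for row in reader
        if row["classification"].startswith("Unclassified")
    ]


# Made-up rows for illustration only.
example = (
    "user_genome\tclassification\n"
    "genome_a\td__Bacteria;p__Proteobacteria\n"
    "genome_b\tUnclassified\n"
    "genome_c\tUnclassified Bacteria\n"
)

for genome, label in unclassified_genomes(example):
    print(genome, label, sep="\t")
```

To apply this to a real run, read the contents of gtdbtk.bac120.summary.tsv or gtdbtk.ar53.summary.tsv from the output directory and pass the text to the helper.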
Features:
- (#392) --write_single_copy_genes flag available in workflows.
- (#387) specific memory requirements set in classify_wf depending on the classification approach.
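As a rough pre-flight check, the memory figures quoted in these notes (about 55 GB for the split tree, 320 GB for the full tree) can be compared against the RAM available on a machine. This is a hypothetical sketch, not GTDB-Tk's internal logic; the function name, the labels "split" and "full", and the use of the quoted figures as hard thresholds are all assumptions.

```python
# Approximate figures taken from the release notes above; treating them
# as hard thresholds is an assumption made for this sketch.
MEMORY_GB = {
    "split": 55,   # class-level split tree
    "full": 320,   # full GTDB reference tree
}


def feasible_approaches(available_gb):
    """Return the classification approaches whose quoted memory
    requirement fits within the available RAM (in GB)."""
    return [
        name
        for name, required in MEMORY_GB.items()
        if available_gb >= required
    ]


print(feasible_approaches(64))
print(feasible_approaches(512))
print(feasible_approaches(32))
```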
Important
This version is not backwards compatible with GTDB package R207 v1.
This version requires a new reference package