Releases: Ecogenomics/GTDBTk
2.2.6
Bug Fixes:
- (#493) Fix issue with --full-tree flag (related to skipping ANI steps)
Minor changes:
- Change URL for documentation to 'https://ecogenomics.github.io/GTDBTk/installing/index.html'
- Improve portability of the ANI_screen step by regenerating the paths of reference genomes in the current filesystem for mash_db.msh
2.2.5
Bug Fixes:
- gtdbtk.json is now reset when the pipeline is re-run and the status of ani_screen is not 'complete'
Minor changes:
- When using --genes, ANI steps are skipped and warnings are raised to inform the user that classification is less accurate.
- (#486) Environment variables can be used in GTDBTK_DATA_PATH
- is_consistent function in mash.py compares only the filenames, not the full paths
- Add cutoff arguments to PfamScan (Thanks @AroneyS for the contribution)
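
The two path-related changes above (#486 and the filename-only comparison) can be illustrated with a minimal sketch. This is not GTDB-Tk's code: the function names expand_data_path and same_filenames are hypothetical, and the sketch only assumes that expansion behaves like Python's standard os.path.expandvars.

```python
import os


def expand_data_path(raw_value):
    """Expand $VAR / ${VAR} references and a leading ~ in a path string.

    Hypothetical helper: shows how a value such as
    "$MY_ROOT/release207" could be resolved before use.
    """
    return os.path.expanduser(os.path.expandvars(raw_value))


def same_filenames(paths_a, paths_b):
    """Compare two collections of paths by file name only.

    Directories are ignored, so the same reference genomes stored under
    a different mount point still compare as equal.
    """
    names_a = sorted(os.path.basename(p) for p in paths_a)
    names_b = sorted(os.path.basename(p) for p in paths_b)
    return names_a == names_b
```

Comparing by file name rather than full path means a database built on one filesystem is not treated as inconsistent merely because it was moved.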
2.2.4
Bug Fixes:
- (#475) If all genomes are classified using ANI, GTDB-Tk will skip the identify and align steps
Minor changes:
- Add hidden '--skip_pplacer' flag to skip pplacer step (useful for debugging)
- Improve documentation
- Convert stage_logger to a Singleton class
- Use existing ANI results if available
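
For readers unfamiliar with the Singleton pattern mentioned for stage_logger, here is a minimal sketch of the idea. The class name StageLoggerSketch and its record method are hypothetical; the actual GTDB-Tk implementation may use a different mechanism (for example a metaclass).

```python
class StageLoggerSketch:
    """Illustrative Singleton: every instantiation returns the same object."""

    _instance = None

    def __new__(cls):
        # Create the shared instance only on the first call.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.steps = []
        return cls._instance

    def record(self, step_name):
        """Append a completed step name to the shared log."""
        self.steps.append(step_name)
```

With this pattern, any module that instantiates the logger writes to the same shared state without the object being passed around explicitly.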
2.2.3
2.2.2
2.2.1
2.2.0
Minor changes:
- (#433) Added additional checks to ensure that the --outgroup_taxon cannot be set to a domain (root, de_novo_wf).
- (#459 / #462) Fix deprecated np.bool in prodigal_biolib.py. Special thanks to @neoformit for his contribution.
- (#466) RED value is now rounded to 5 decimal places.
- (#451) Extra checks have been added when Prodigal fails.
- (#448) Warning has been added when all the genomes are filtered out and not classified.
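
As a small illustration of the rounding change (#466), the sketch below rounds a value to 5 decimal places with Python's built-in round. The function name round_red is hypothetical and the input value is made up; how GTDB-Tk formats the value internally is not stated in these notes.

```python
def round_red(red_value, decimals=5):
    """Round a RED (relative evolutionary divergence) value for reporting."""
    return round(red_value, decimals)


print(round_red(0.123456789))  # prints 0.12346
```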
Bug Fixes:
- (#420) Fixed an issue where GTDB-Tk might hang when classifying TIGRFAM markers (identify, classify_wf, de_novo_wf). Special thanks to @lfenske-93 and @sjaenick for their contribution.
- (#428) Fixed an issue where the --gtdbtk_classification_file would raise an error trying to read the classify summary (root, de_novo_wf).
- (#439) Fix the pipeline when using protein files instead of nucleotide files. The symlink now uses an absolute path instead.
2.1.1
2.1.0
Major changes:
- GTDB-Tk now uses a divide-and-conquer approach where the bacterial reference tree is split into multiple class-level subtrees. This reduces the memory requirements of GTDB-Tk from 320 GB of RAM when using the full GTDB R07-RS207 reference tree to approximately 55 GB. A manuscript describing this approach is in preparation. If you wish to continue using the full GTDB reference tree use the --full-tree flag. This is the main change from v2.0.0. The split tree approach has been modified from order-level trees to class-level trees to resolve specific classification issues (see #383).
- Genomes that cannot be assigned to a domain (e.g. genomes with no bacterial or archaeal markers or genomes with no genes called by Prodigal) are now reported in the gtdbtk.bac120.summary.tsv as 'Unclassified'
- Genomes filtered out during the alignment step are now reported in the gtdbtk.bac120.summary.tsv or gtdbtk.ar53.summary.tsv as 'Unclassified Bacteria/Archaea'
- --write_single_copy_genes flag is now available in the classify_wf and de_novo_wf workflows.
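
Since unassigned and filtered genomes now appear in the summary files, they can be pulled out with a few lines of Python. The sketch below is an assumption-laden example: it assumes the summary is tab-separated with a header row and that the genome identifier and classification live in columns named user_genome and classification. Adjust the column names if your file differs. The function name unclassified_genomes is hypothetical.

```python
import csv


def unclassified_genomes(summary_path,
                         id_column="user_genome",
                         class_column="classification"):
    """Return the IDs of genomes whose classification starts with 'Unclassified'.

    The column names are assumptions about the summary layout, not taken
    from these release notes.
    """
    with open(summary_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return [row[id_column]
                for row in reader
                if row[class_column].startswith("Unclassified")]
```

Matching on the prefix catches both 'Unclassified' and 'Unclassified Bacteria/Archaea' entries.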
Features:
- (#392) --write_single_copy_genes flag available in workflows.
- (#387) Specific memory requirements set in classify_wf depending on the classification approach.
Important
This version is not backwards compatible with GTDB package R207 v1.
This version requires a new reference package
2.0.0
Major changes:
- GTDB-Tk now uses a divide-and-conquer approach where the bacterial reference tree is split into multiple order-level subtrees. This reduces the memory requirements of GTDB-Tk from 320 GB of RAM when using the full GTDB R07-RS207 reference tree to approximately 35 GB. A manuscript describing this approach is in preparation. If you wish to continue using the full GTDB reference tree use the --full-tree flag.
- Archaeal classification now uses a refined set of 53 archaeal-specific marker genes based on the recent publication by Dombrowski et al., 2020. This set of archaeal marker genes is now used by GTDB for curating the archaeal taxonomy.
- All directories containing intermediate results are now removed by default at the end of the classify_wf and de_novo_wf pipelines. If you wish to retain these intermediate files use the --keep-intermediates flag.
- All MSA files produced by the align step are now compressed with gzip.
- The classification summary and failed genomes files are now the only files linked in the root directory of classify_wf.
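
Because the MSA files are now gzip-compressed, downstream scripts that read them as plain text need to open them through a gzip-aware reader. A minimal sketch using only the Python standard library follows. The function name read_gzipped_fasta is hypothetical, and the sketch assumes the MSA is in FASTA format.

```python
import gzip


def read_gzipped_fasta(path):
    """Return {sequence_id: sequence} from a gzip-compressed FASTA file."""
    records = {}
    current_id = None
    # "rt" opens the compressed stream in text mode.
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Use the first whitespace-delimited token as the ID.
                header = line[1:].split()
                current_id = header[0] if header else ""
                records[current_id] = []
            elif current_id is not None:
                records[current_id].append(line)
    return {seq_id: "".join(chunks) for seq_id, chunks in records.items()}
```

Gap characters are kept as-is, so the returned sequences preserve the alignment columns.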
Features:
- convert_to_itol to convert trees into iTOL format (#373)
- Output FASTA files are compressed by default (#369)
- Intermediate files will be removed by default when using classify/de-novo workflows unless specified by --keep_intermediates (#369)
- Add --genes flag for Error (#362)
- A warning will be displayed if pplacer fails to place a genome (#360 / #356)
Important
- This version is not backwards compatible with GTDB release 202.
- This version requires a new reference package