Home
Flexible Taxonomy Databases - A cross-platform tool for customizing and merging various taxonomic classification sources.
New to FlexTaxD? Here is a Walkthrough to get started!
Supported sources in versions later than v0.2.0:
- QIIME (GTDB)
- NCBI
- CanSNPer
- TSV
Supported database builds for downstream classification tools:
- Kraken2
- Krakenuniq
- Ganon (from v 3.0)
The flextaxd script allows customization of databases from NCBI, QIIME or CanSNPer sources and supports export into NCBI-formatted names.dmp and nodes.dmp files as well as a standard tab-separated file (or a file with another chosen separator). The script was initially written to allow the use of GTDB with some custom modifications that increase the resolution of selected subgroups. GTDB was created by an Australian group with the aim of restructuring the taxonomic relationships of the NCBI taxonomy tree to strictly follow a phylogenetic structure (http://gtdb.ecogenomic.org/). The script can use the bac120_taxonomy_r89.tsv file from the GTDB downloads page as input (with --taxonomy_type set to QIIME). By default, the script reads a tab-separated file containing parent and child columns (identified by the column headers). The script also allows customization of the database using multiple sources: databases can be merged at one or more selected nodes, and resolution can be added to certain subgroups (i.e. by combining the different database types) using a tab-separated file (format described below).
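To illustrate what reading a QIIME-style taxonomy file involves, here is a minimal Python sketch that turns lineage strings into parent/child pairs. It is not FlexTaxD's own implementation. It assumes each input line holds an accession, a tab, and a semicolon-separated lineage (the layout of the GTDB taxonomy files). The function names, the example accessions and the artificial "root" parent are all hypothetical.

```python
def lineage_to_edges(lineage, root="root"):
    """Split a semicolon-separated lineage into (parent, child) pairs.

    The top-level name is attached to an artificial root node; the name
    "root" is an assumption made for this sketch.
    """
    names = [name.strip() for name in lineage.split(";") if name.strip()]
    path = [root] + names
    return list(zip(path[:-1], path[1:]))


def read_qiime_taxonomy(lines):
    """Collect the unique (parent, child) pairs from QIIME-style lines."""
    edges = []
    seen = set()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # Column 1 is the genome accession, column 2 the lineage string.
        _accession, lineage = line.split("\t", 1)
        for edge in lineage_to_edges(lineage):
            if edge not in seen:
                seen.add(edge)
                edges.append(edge)
    return edges


# Hypothetical example records (accessions are made up for illustration).
example = [
    "RS_GCF_000001.1\td__Bacteria;p__Proteobacteria;c__Gammaproteobacteria",
    "RS_GCF_000002.1\td__Bacteria;p__Proteobacteria;c__Alphaproteobacteria",
]

for parent, child in read_qiime_taxonomy(example):
    print(f"{parent}\t{child}")
```

Shared ancestors such as d__Bacteria are emitted only once, so the result is a set of tree edges rather than one row per genome.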
All data is kept in a sqlite3 database (.ftd by default) and can be dumped to NCBI-formatted names.dmp and nodes.dmp files. Supported export formats are NCBI and TSV. The TSV dump format is similar to the NCBI dump except that it contains a header (parent<tab>child), has the parent in the left column, and uses only a tab to separate columns (not <tab>|<tab>).
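The difference between the two dump layouts can be sketched in a few lines of Python. This is an illustration only, not FlexTaxD's export code. The function names are hypothetical, and the NCBI-style output is reduced to the child and parent columns, whereas a real nodes.dmp carries additional fields such as rank.

```python
def to_tsv(edges):
    """TSV layout: header row, parent on the left, plain tab separator."""
    lines = ["parent\tchild"]
    lines += [f"{parent}\t{child}" for parent, child in edges]
    return "\n".join(lines) + "\n"


def to_ncbi_style(edges):
    """NCBI layout: no header, child on the left, '<tab>|<tab>' separator.

    Each row ends with '<tab>|' as in the NCBI taxdump files. Only the
    first two columns are written here (a simplification for this sketch).
    """
    return "".join(f"{child}\t|\t{parent}\t|\n" for parent, child in edges)


# (parent, child) pairs with illustrative numeric node ids.
edges = [(1, 2), (2, 3)]

print(to_tsv(edges), end="")
print(to_ncbi_style(edges), end="")
```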