2. Workflow concepts and structures
FlexTaxD is a multipurpose tool for managing taxonomic databases. It integrates taxonomies from sources such as NCBI and GTDB, incorporates custom taxonomies, and prepares data for downstream applications such as metagenomic read classification. With FlexTaxD, you can consolidate taxonomic data into a FlexTaxD database (Fdb), visualize it, and compile or format it for various classification tools.
Detailed pages for FlexTaxD operations are listed in the menu to the right.
Using FlexTaxD involves two primary commands:
- flextaxd: For importing taxonomies, modifying the database, and exporting data.
- flextaxd-create: For creating and compiling the Fdb.
These commands are modular and may be used repeatedly at different stages of your workflow to manage taxonomy data, including:
- Importing external taxonomy.
- Modifying taxonomy within your Fdb.
- Adding external or custom taxonomy.
- Visualizing taxonomic trees.
- Downloading genome files.
- Compiling data for metagenomic read classification tools.
A standard FlexTaxD directory includes:
- Taxonomic input file(s) for creating the Fdb.
- Additional taxonomic file(s) for modifying the Fdb.
- The FlexTaxD database (Fdb) itself.
- An optional directory containing genome files.
- A temporary directory for intermediate files that can be retained post-execution.
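As a rough illustration of the layout listed above, the following sketch creates an empty working directory. Every folder name and the `taxonomy.fdb` file name are hypothetical; FlexTaxD does not prescribe them, and the sketch assumes the Fdb is stored as a single file.

```python
from pathlib import Path


def scaffold_workdir(root):
    """Create an empty working directory mirroring the layout listed above.

    All names below are hypothetical placeholders, not FlexTaxD requirements.
    """
    root = Path(root)
    layout = {
        "taxonomy_input": root / "taxonomy_input",  # taxonomic input file(s)
        "taxonomy_mods": root / "taxonomy_mods",    # additional file(s) for modifying the Fdb
        "genomes": root / "genomes",                # optional genome files
        "tmp": root / "tmp",                        # intermediate files
    }
    for path in layout.values():
        path.mkdir(parents=True, exist_ok=True)
    # Assumed to be a single file, so only record where it would be written.
    layout["fdb"] = root / "taxonomy.fdb"
    return layout
```

Calling `scaffold_workdir("my_project")` returns a dictionary of paths that can be passed on to later steps.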
FlexTaxD supports a range of taxonomy file formats:
- NCBI Format:
  - names.dmp
  - nodes.dmp
  - Optional: *_accession2taxid files.
- GTDB Format:
  - *_taxonomy.tsv
- FlexTaxD Format:
  - tree2tax.tsv
  - genome2tax.tsv
- CanSNPer Format:
  - tree2tax.txt
  - genome2tax.tsv
For detailed file format specifications, please refer to the "File Formats" page.
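The filename patterns above can be used to guess which taxonomy format a set of input files belongs to. The sketch below is not part of FlexTaxD. It treats every non-optional file in the list as required, which is an assumption, and the function name `detect_formats` is hypothetical.

```python
from fnmatch import fnmatchcase

# Filename patterns copied from the format list above. The optional NCBI
# *_accession2taxid files are deliberately left out of the check.
FORMAT_PATTERNS = {
    "NCBI": ["names.dmp", "nodes.dmp"],
    "GTDB": ["*_taxonomy.tsv"],
    "FlexTaxD": ["tree2tax.tsv", "genome2tax.tsv"],
    "CanSNPer": ["tree2tax.txt", "genome2tax.tsv"],
}


def detect_formats(filenames):
    """Return the formats whose listed files are all present in `filenames`."""
    names = list(filenames)
    found = []
    for fmt, patterns in FORMAT_PATTERNS.items():
        # A format matches only if each pattern matches at least one file name.
        if all(any(fnmatchcase(n, p) for n in names) for p in patterns):
            found.append(fmt)
    return found
```

Because the FlexTaxD and CanSNPer formats share genome2tax.tsv, the tree file extension (.tsv versus .txt) is what tells them apart in this check.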
Genome sequences are essential for creating or formatting outputs for metagenomic classification databases. These sequences should be organized in a single folder and can be sourced as follows:
- Manually or via external software, which includes:
  - NCBI genome sets.
  - GTDB representative genomes.
  - Custom genome files.
- Through FlexTaxD's genome download functionality:
  Note: Some older versions may use the ncbi-genome-download tool.
To compile the Fdb into a metagenomic read classification database or to create structured files for other software, use:
- flextaxd-create with --create_db and --dbprogram arguments for compiling databases.
- flextaxd with --outdir and --dump, and optionally --dbprogram arguments for exporting files.
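The two invocations above can be assembled programmatically, for example when scripting a pipeline. This sketch uses only the flags named on this page. It assumes that --create_db and --dump are switches without a value and that --dbprogram takes a program name; the value `kraken2` is an assumed example. The argument that selects which Fdb to use is not named here, so it is intentionally left out; check `flextaxd --help` before running anything built this way.

```python
import shlex


def compile_command(dbprogram):
    """Build argv for compiling a classification database with flextaxd-create."""
    # Assumption: --create_db is a switch and --dbprogram takes one value.
    return ["flextaxd-create", "--create_db", "--dbprogram", dbprogram]


def export_command(outdir, dbprogram=None):
    """Build argv for exporting files with flextaxd."""
    # Assumption: --dump is a switch and --outdir takes a directory path.
    argv = ["flextaxd", "--outdir", outdir, "--dump"]
    if dbprogram is not None:
        argv += ["--dbprogram", dbprogram]
    return argv


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    print(shlex.join(compile_command("kraken2")))
    print(shlex.join(export_command("export_dir")))
```

Returning argument lists rather than a single string keeps the commands safe to pass to `subprocess.run` without shell quoting issues.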
For additional formatting options and further help, run flextaxd --help.