7. Output
Outputting your FlexTaxD databases correctly is essential for their use in downstream applications like metagenomic classifiers. Below you will find instructions tailored for different tools.
Before exporting your FlexTaxD database, ensure it's clean:
flextaxd --db yourdatabase.fdb --clean # For NCBI-based databases
flextaxd --db yourdatabase.fdb --purge_database # For GTDB-based databases
This step is crucial for removing unnecessary nodes that don't contribute valuable information for your specific analysis.
After ensuring your FlexTaxD database is ready, follow these commands to create databases for different classifiers.
flextaxd --database gtdb.fdb --genomes_path genomes --dbprogram kraken2 --create_db --db_name kraken2_gtdb --processes 20
flextaxd --database gtdb.fdb --genomes_path genomes --dbprogram krakenuniq --create_db --db_name krakenuniq_gtdb --processes 20
flextaxd --database gtdb.fdb --genomes_path genomes --dbprogram ganon --create_db --db_name ganon_gtdb --processes 20
Remember to replace gtdb.fdb and genomes with the actual paths to your FlexTaxD database and genome directory, respectively.
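The three build commands above differ only in the value passed to --dbprogram and --db_name, so they can be driven from a small loop. The following is a minimal sketch, not part of FlexTaxD itself: the variable names FDB, GENOMES, PROCESSES and DRY_RUN and the function build_classifier_dbs are illustrative assumptions, and the "<program>_gtdb" naming simply mirrors the examples above. It only prints the commands unless DRY_RUN is set to 0.

```shell
#!/bin/sh
# Illustrative settings (assumed names, not FlexTaxD options).
FDB="${FDB:-gtdb.fdb}"            # path to the FlexTaxD database
GENOMES="${GENOMES:-genomes}"     # directory containing the genome files
PROCESSES="${PROCESSES:-20}"      # value passed to --processes
DRY_RUN="${DRY_RUN:-1}"           # 1 = print the commands only, 0 = run them

# Build one classifier database per program name given as an argument.
build_classifier_dbs() {
    for program in "$@"; do
        db_name="${program}_gtdb"
        if [ "$DRY_RUN" = "1" ]; then
            # Show the command that would be executed.
            echo "flextaxd --database $FDB --genomes_path $GENOMES --dbprogram $program --create_db --db_name $db_name --processes $PROCESSES"
        else
            # Same flags as in the commands above; stop at the first failure.
            flextaxd --database "$FDB" --genomes_path "$GENOMES" \
                --dbprogram "$program" --create_db \
                --db_name "$db_name" --processes "$PROCESSES" || return 1
        fi
    done
}

build_classifier_dbs kraken2 krakenuniq ganon
```

Review the printed commands first, then rerun with DRY_RUN=0 once the paths are correct.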
FlexTaxD allows for database exporting in different formats, suitable for various downstream applications:
flextaxd --database gtdb.fdb --genomes_path genomes --dbprogram centrifuge --dump
flextaxd --database gtdb.fdb --genomes_path genomes --dbprogram bracken --dump
You can customize the output format with a variety of flags:
- --dump_sep: Define a custom separator for the output file (default mimics NCBI format).
- --dump_descriptions: Output the textual descriptions rather than the numeric identifiers.
- --dump_genomes: Produce a list of genomes with their sources in a separate file.
- --dump_genome_annotations: Include taxonomic annotations alongside genome listings.
Example command with custom formatting:
flextaxd --database gtdb.fdb --dump_sep "\t" --dump_descriptions --dump_genomes --dump_genome_annotations
The above commands are structured to work with a FlexTaxD database named gtdb.fdb and a directory genomes that contains the genomic files. Adjust the paths and names as needed for your specific environment and database.
Lastly, ensure that the programs you're exporting to (Kraken2, KrakenUniq, ganon, etc.) are installed and properly configured in your environment, so that they can build and read the databases you are creating. If you use Conda, you can install them in your FlexTaxD environment so that they are available on your $PATH.
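A quick way to confirm that the required programs are visible on your $PATH before starting a long build is sketched below. The helper missing_tools is a hypothetical name, and the executable names passed to it (kraken2-build, krakenuniq-build, ganon) are assumptions about what a typical installation provides; adjust them to match your environment.

```shell
#!/bin/sh
# Print the names of the given executables that cannot be found on $PATH,
# separated by spaces (prints an empty line if all of them were found).
missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Drop the leading space added by the loop.
    echo "${missing# }"
}

# Executable names are assumptions; edit this list for your setup.
not_found="$(missing_tools flextaxd kraken2-build krakenuniq-build ganon)"
if [ -n "$not_found" ]; then
    echo "Not found on PATH: $not_found" >&2
fi
```

If anything is reported as missing, activate the correct Conda environment or install the tool before running the export commands.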