Merged
Changes from 6 commits
12 changes: 12 additions & 0 deletions workflows/microbiome/metagenomic-genes-catalogue/CHANGELOG.md
@@ -1,8 +1,20 @@
# Changelog

## [1.2] - 2025-01-13

### Automatic update

* `toolshed.g2.bx.psu.edu/repos/recetox/table_pandas_rename_column/table_pandas_rename_column/2.2.3+galaxy0` should be updated to `toolshed.g2.bx.psu.edu/repos/recetox/table_pandas_rename_column/table_pandas_rename_column/2.2.3+galaxy1`

### Added

* Added a Boolean input that lets users choose between a full genes catalogue analysis and one specific to antibiotic resistance.
* Added a workflow to retrieve the contig IDs and CDSs of antibiotic resistance genes, and to filter genes present on the same contigs as antibiotic resistance genes (if the Boolean value is False).

## [1.1] - 2025-12-08

### Automatic update

- `toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_cat/9.5+galaxy0` was updated to `toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_cat/9.5+galaxy2`
- `toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_awk_tool/9.5+galaxy0` was updated to `toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_awk_tool/9.5+galaxy2`
- `toolshed.g2.bx.psu.edu/repos/iuc/seqkit_translate/seqkit_translate/2.10.0+galaxy0` was updated to `toolshed.g2.bx.psu.edu/repos/iuc/seqkit_translate/seqkit_translate/2.12.0+galaxy0`
21 changes: 17 additions & 4 deletions workflows/microbiome/metagenomic-genes-catalogue/README.md
@@ -2,20 +2,33 @@

This workflow generates a genes catalogue from paired short reads.

The workflow supports assembly using **MEGAHIT**.
The workflow supports assembly using **MEGAHIT** and assembly quality control via **QUAST**.

## Genes catalogue Annotation and Quality Control

After assembly, CDS are detected from resulting contigs with **Prodigal** and the potential genes are clustered with **MMseqs2linclust**
After assembly, CDSs are detected from the resulting contigs with **Prodigal**.

The following processing steps are then performed:
Then, depending on the value of the Boolean, one of two analyses is run:

If the **Boolean is True** (complete genes catalogue):

All the potential genes are clustered with **MMseqs2linclust**.

The following processing steps are then performed on the full clustered genes catalogue:

- **Genes annotation** with Eggnog-mapper
- **Taxonomic Assignment** using MMseqs2taxonomy
- **Assembly Quality Control** via QUAST
- **Abundance Estimation** per sample with CoverM
- **AMR detection** with ABRicate, AMRFinderPlus and starAMR

If the **Boolean is False**, the analysis focuses on the functions and taxonomies associated with contigs on which an antibiotic resistance gene is detected.

This initiates the construction of a genes catalogue specific to **antibiotic resistance genes (ARGs)**:

1. **AMR detection** with ABRicate, AMRFinderPlus and starAMR is performed on the CDSs predicted with **Prodigal**.
2. **Taxonomic and functional annotation** is performed on the CDSs located on the same contigs as the detected ARGs.
3. **Clustering** is then performed on the detected ARGs, followed by **coverage** estimation of the clustered ARG catalogue from the sample reads.
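
The contig-based selection in step 2 can be illustrated with a small Python sketch. This is not how the workflow implements it (the workflow uses Galaxy tools); it only shows the idea, assuming Prodigal's usual CDS naming of `<contig>_<index>`. The function names and the example IDs are hypothetical.

```python
def contig_of(cds_id: str) -> str:
    """Return the contig part of a Prodigal-style CDS ID ('<contig>_<index>')."""
    return cds_id.rsplit("_", 1)[0]


def cds_on_arg_contigs(all_cds_ids, arg_cds_ids):
    """Keep every CDS located on a contig that carries at least one ARG hit."""
    arg_contigs = {contig_of(cds_id) for cds_id in arg_cds_ids}
    return [cds_id for cds_id in all_cds_ids if contig_of(cds_id) in arg_contigs]


# Made-up IDs: three contigs, two of which carry an ARG hit.
all_cds = ["k141_10_1", "k141_10_2", "k141_25_1", "k141_31_1", "k141_31_2"]
arg_hits = ["k141_10_2", "k141_31_1"]

selected = cds_on_arg_contigs(all_cds, arg_hits)
print(selected)
```

Here the CDSs on `k141_25` are dropped because no ARG was detected on that contig, while the neighbours of each ARG hit are kept for annotation.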

## Input Requirements

Input reads must be quality-filtered, with host reads removed.
@@ -17,19 +17,20 @@
    eggNOG database: 5.0.2
    mmseqs2 taxonomy DB: UniRef50-17-b804f-07112025
    starAMR database: staramr_downloaded_07042025_resfinder_d1e607b_pointfinder_694919f_plasmidfinder_3e77502
    Virulence genes detection database: vfdb
    Virulence genes detection database: resfinder
    AMR genes detection database: amrfinderplus_V3.12_2024-05-02.2
    Full genes catalogue: false
  outputs:
    MMseqs2 Taxonomy Filtered:
    MMseqs2 Taxonomy Tabular:
      asserts:
        has_text:
          text: "domain"
          text: "Bacteria"
        has_n_columns:
          n: 4
    Eggnog Annotation Filtered:
    Eggnog Annotations:
      asserts:
        has_text:
          text: "Contig id"
          text: "Beta-lactamase class A"
        has_n_columns:
          n: 21
    MMseqs2 Taxonomy Kraken:
@@ -60,37 +61,17 @@
          text: "#FILE"
        has_n_columns:
          n: 15
    Amrfinderplus Report:
      asserts:
        has_text:
          text: "Protein identifier"
        has_n_columns:
          n: 22
    Tooldistillator Summarize Collection:
      element_tests:
        genes_catalogue_test:
          asserts:
            - that: has_text
              text: "megahit_report"
            - that: has_text
              text: "quast_report"
            - that: has_text
              text: "prodigal_report"
            - that: has_text
              text: "coverm_report"
    Tooldistillator Summarize Catalogue:
    MultiQC Report:
      asserts:
        - that: has_text
          text: "eggnogmapper_report"
        - that: has_text
          text: "mmseqs2linclust_report"
        - that: has_text
          text: "staramr_report"
        - that: has_text
          text: "mmseqs2taxonomy_report"
        - that: has_text
          text: "abricate_report"
        - that: has_text
          text: "amrfinderplus_report"
        - that: has_text
          text: "argnorm_report"
        - that: has_text
          text: "AMRFinderPlus"
        - that: has_text
          text: "ABRicate"
        - that: has_text
          text: "starAMR"
        - that: has_text
          text: "QUAST"
        - that: has_text
          text: "MMseqs2 taxonomy"
        - that: has_text
          text: "EggnogMapper"
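
For readers unfamiliar with these test assertions, the following Python sketch approximates what `has_text` and `has_n_columns` check. It is a simplified illustration, not Galaxy's implementation: it assumes tab-separated output, inspects only the first line for the column count, and ignores options such as comment handling. The sample row is made up.

```python
def has_text(output: str, text: str) -> bool:
    """Approximation of `has_text`: the expected string occurs in the output."""
    return text in output


def has_n_columns(output: str, n: int, sep: str = "\t") -> bool:
    """Approximation of `has_n_columns`: the first line has exactly n fields."""
    lines = output.splitlines()
    if not lines:
        return False
    return len(lines[0].split(sep)) == n


# Hypothetical 4-column taxonomy row, for illustration only.
sample_output = "k141_10_1\t2\tdomain\tBacteria\n"

taxonomy_ok = has_text(sample_output, "Bacteria") and has_n_columns(sample_output, 4)
print(taxonomy_ok)
```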