@xens25 xens25 commented Dec 1, 2025

FOR CONTRIBUTOR:

  • I have read the CONTRIBUTING.md document and this tool is appropriate for the tools-iuc repo.
  • License permits unrestricted use (educational + commercial)
  • This PR adds a new tool or tool collection
  • This PR updates an existing tool or tool collection
  • This PR does something else (explain below)

categories:
- Sequence Analysis
- Metagenomics
- Quality Control
Contributor

Suggested change
- - Quality Control

@@ -0,0 +1,272 @@
<tool id="kneaddata" name="KneadData" version="0.12.1+galaxy0" python_template_version="3.5" profile="21.05">
Contributor

Suggested change
- <tool id="kneaddata" name="KneadData" version="0.12.1+galaxy0" python_template_version="3.5" profile="21.05">
+ <tool id="kneaddata" name="KneadData" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="@PROFILE@">

Please introduce the above tokens in the macros.xml file
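
For reference, a minimal sketch of what the tokens could look like in macros.xml. The values are assumptions: `21.05` is copied from the `profile` attribute above, and `0.12.3` is taken from the pinned `kneaddata` requirement, so adjust it to whichever upstream version the wrapper actually targets.

```xml
<macros>
    <!-- Assumed value: should match the kneaddata package version pinned in requirements -->
    <token name="@TOOL_VERSION@">0.12.3</token>
    <!-- Wrapper revision, bumped when only the wrapper changes -->
    <token name="@VERSION_SUFFIX@">0</token>
    <token name="@PROFILE@">21.05</token>
</macros>
```

The tool XML would then pull these in with `<macros><import>macros.xml</import></macros>`.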

Comment on lines +3 to +9
<requirements>
<requirement type="package" version="0.12.3">kneaddata</requirement>
<requirement type="package" version="0.40">trimmomatic</requirement>
<requirement type="package" version="2.5.4">bowtie2</requirement>
<requirement type="package" version="4.09.1">trf</requirement>
<requirement type="package" version="0.12.1">fastqc</requirement>
</requirements>
Contributor

This can go in macros.xml. Are these dependencies already installed as part of the kneaddata package, or do they need to be listed explicitly?
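
If the requirements do move to macros.xml, one possible shape is an `<xml>` macro that the tool expands. This sketch keeps only the `kneaddata` package and marks the others as an open question, since whether they must be listed separately is exactly what is being asked here. The macro name `requirements` is just a common convention, and the version value is an assumption copied from the pinned requirement.

```xml
<macros>
    <token name="@TOOL_VERSION@">0.12.3</token>
    <xml name="requirements">
        <requirements>
            <requirement type="package" version="@TOOL_VERSION@">kneaddata</requirement>
            <!-- Assumption: add trimmomatic, bowtie2, trf and fastqc back here
                 only if the kneaddata package does not already install them -->
        </requirements>
    </xml>
</macros>
```

In the tool file this would be used as `<expand macro="requirements"/>` in place of the inline `<requirements>` block.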

Comment on lines +54 to +55
<option value="s">Single read</option>
<option value="p">Paired reads</option>
Contributor

Suggested change
- <option value="s">Single read</option>
- <option value="p">Paired reads</option>
+ <option value="single">Single read</option>
+ <option value="paired">Paired reads</option>

</param>

<when value="s">
<param name="single_read" type="data" format="fastq" label="Single Read"/>
Contributor

Suggested change
- <param name="single_read" type="data" format="fastq" label="Single Read"/>
+ <param name="single_read" type="data" format="fastqsanger" label="Single Read"/>

Does it also support fastq.gz?

<when value="s">
<param name="single_read" type="data" format="fastq" label="Single Read"/>
</when>
<when value="p">
Contributor

Suggested change
- <when value="p">
+ <when value="paired">
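
Putting the renamed values together, the conditional might end up looking like the sketch below. Parameter names are taken from the snippets quoted in this PR. The label on the select and the labels for the paired inputs are placeholders, and `fastqsanger.gz` is left out until the question about compressed input is answered.

```xml
<conditional name="read_type">
    <!-- Label below is assumed, not taken from the PR -->
    <param name="select_read_type" type="select" label="Single or paired reads">
        <option value="single">Single read</option>
        <option value="paired">Paired reads</option>
    </param>
    <when value="single">
        <param name="single_read" type="data" format="fastqsanger" label="Single Read"/>
    </when>
    <when value="paired">
        <!-- Labels below are assumed, not taken from the PR -->
        <param name="forward_read" type="data" format="fastqsanger" label="Forward reads"/>
        <param name="backward_read" type="data" format="fastqsanger" label="Reverse reads"/>
    </when>
</conditional>
```

Any `#if` test in the command block and the `select_read_type` values in the tests would need the same rename.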

Comment on lines +71 to +73
<param name="number_threads" type="integer" value="1" label="Number of threads"/>

<param name="number_processes" type="integer" value="1" label="Number of processes"/>
Contributor

Suggested change
- <param name="number_threads" type="integer" value="1" label="Number of threads"/>
- <param name="number_processes" type="integer" value="1" label="Number of processes"/>

This is not required


<section name="trimmomatic" title="Trimmomatic arguments" >

<param name="max_memory" type="text" value="500m" label="Maximum memory for Trimmomatic"/>
Contributor

Suggested change
- <param name="max_memory" type="text" value="500m" label="Maximum memory for Trimmomatic"/>


<conditional name="trimmomatic_options">
<param name="select_option" type="select" label="Trimmomatic settings">
<option value="d">Default settings</option>
Contributor

Better names for option values?

quality filtering, and removal of host contamination using
Bowtie2/TRIMMOMATIC/TRF.
homepage_url: https://github.com/biobakery/kneaddata
remote_repository_url: https://github.com/galaxyproject/tools-iuc/tree/master/tools/kneaddata
Contributor

Suggested change
- remote_repository_url: https://github.com/galaxyproject/tools-iuc/tree/master/tools/kneaddata
+ remote_repository_url: https://github.com/galaxyproject/tools-iuc/tree/main/tools/kneaddata

owner: iuc
type: unrestricted
description: Quality control and contaminant removal for metagenomic data
long_description: >
Contributor

Suggested change
- long_description: >
+ long_description: |

remote_repository_url: https://github.com/galaxyproject/tools-iuc/tree/master/tools/kneaddata
categories:
- Metagenomics
- Statistics
Contributor

Suggested change
- - Statistics
+ - Sequence Analysis

@SaimMomin12 changed the title from "Add KneadData tool wrapper" to "New tool addition: KneadData" on Dec 1, 2025
<requirement type="package" version="0.12.1">fastqc</requirement>
</requirements>
<command detect_errors="exit_code"><![CDATA[
kneaddata
Contributor

Suggested change
- kneaddata
+ mkdir -p results/ &&
+ kneaddata

-i1 "$read_type.forward_read"
-i2 "$read_type.backward_read"
#end if
-o "output_dir"
Contributor

Suggested change
- -o "output_dir"
+ -o results/
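
A rough sketch of how the two command suggestions fit together, showing only the paired-end branch. The `&&` is there because Galaxy collapses the command template onto a single line, so the `mkdir` call has to be chained explicitly. The quoting style is an assumption and the remaining options are omitted.

```xml
<command detect_errors="exit_code"><![CDATA[
    ## Paired-end branch only; other options omitted
    mkdir -p results/ &&
    kneaddata
        -i1 '$read_type.forward_read'
        -i2 '$read_type.backward_read'
        -o results/
]]></command>
```

Outputs could then be collected from `results/`, for example with `from_work_dir` on each `<data>` element.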

metagenomic and metatranscriptomic sequencing data, especially
data from microbiome experiments. It performs adapter trimming,
quality filtering, and removal of host contamination using
Bowtie2/TRIMMOMATIC/TRF.
Contributor

Suggested change
- Bowtie2/TRIMMOMATIC/TRF.
+ Bowtie2, TRIMMOMATIC and TRF.

Comment on lines +19 to +21
#if "$output_prefix"
--output-prefix "$output_prefix"
#end if
Contributor

I would be in favor of removing this parameter

Comment on lines +22 to +23
--threads "$number_threads"
--processes "$number_processes"
Contributor

Suggested change
- --threads "$number_threads"
- --processes "$number_processes"
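
If thread control is still wanted after dropping the user-facing parameters, a common pattern in Galaxy wrappers is to pass the job's allocated slots instead. This is a sketch of that pattern, not necessarily what the suggestion above intends. `--threads` is the flag listed in the tool's own usage output, and the output directory follows the `results/` suggestion made elsewhere in this review.

```xml
<command detect_errors="exit_code"><![CDATA[
    kneaddata
        ## Inputs and other options omitted from this fragment
        ## GALAXY_SLOTS is set by Galaxy at runtime; fall back to 1 if unset
        --threads "\${GALAXY_SLOTS:-1}"
        -o results/
]]></command>
```

The backslash keeps Cheetah from treating `${GALAXY_SLOTS:-1}` as a template variable, so the shell expands it at job time.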

Comment on lines +163 to +164
<param name="number_threads" value="1"/>
<param name="number_processes" value="1"/>
Contributor

Suggested change
- <param name="number_threads" value="1"/>
- <param name="number_processes" value="1"/>

</section>
<section name="read_type">
<param name="select_read_type" value="s"/>
<param name="single_read" value="28C.single.fastq"/>
Contributor

Suggested change
- <param name="single_read" value="28C.single.fastq"/>
+ <param name="single_read" value="test_single.fastq"/>

<param name="forward_read" value="28C.R1.fastq"/>
<param name="backward_read" value="28C.R2.fastq"/>
</section>
<output name="paired_forward" file="paired_forward.fastq"/>
Contributor

Suggested change
- <output name="paired_forward" file="paired_forward.fastq"/>
+ <output name="paired_forward" file="test_paired_1.fastq"/>

<param name="backward_read" value="28C.R2.fastq"/>
</section>
<output name="paired_forward" file="paired_forward.fastq"/>
<output name="paired_backward" file="paired_backward.fastq"/>
Contributor

Suggested change
- <output name="paired_backward" file="paired_backward.fastq"/>
+ <output name="paired_backward" file="test_paired_2.fastq"/>
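
With the renamed expected files, the paired-end test could read roughly as follows. It assumes `read_type` is declared as a conditional in the inputs and that `select_read_type` uses the value `paired` suggested earlier in this review. The input file names are unchanged from the PR.

```xml
<test>
    <!-- Assumption: read_type is a conditional, matching $read_type.forward_read in the command -->
    <conditional name="read_type">
        <param name="select_read_type" value="paired"/>
        <param name="forward_read" value="28C.R1.fastq"/>
        <param name="backward_read" value="28C.R2.fastq"/>
    </conditional>
    <output name="paired_forward" file="test_paired_1.fastq"/>
    <output name="paired_backward" file="test_paired_2.fastq"/>
</test>
```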

Comment on lines +201 to +259
usage: kneaddata [-h] [--version] [-v] [-i1 INPUT1] [-i2 INPUT2]
[-un UNPAIRED] -o OUTPUT_DIR
[-db REFERENCE_DB] [--bypass-trim] [--run-trim-repetitive]
[--output-prefix OUTPUT_PREFIX] [-t &lt;1&gt;] [-p &lt;1&gt;]
[-q {phred33,phred64}] [--run-bmtagger]
[--run-fastqc-start] [--run-fastqc-end] [--store-temp-output]
[--cat-final-output]
[--log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL}] [--log LOG]
[--trimmomatic TRIMMOMATIC_PATH] [--max-memory MAX_MEMORY]
[--trimmomatic-options TRIMMOMATIC_OPTIONS]
[--bowtie2 BOWTIE2_PATH] [--bowtie2-options BOWTIE2_OPTIONS]
[--bmtagger BMTAGGER_PATH] [--trf TRF_PATH] [--match MATCH]
[--mismatch MISMATCH] [--delta DELTA] [--pm PM] [--pi PI]
[--minscore MINSCORE] [--maxperiod MAXPERIOD]
[--fastqc FASTQC_PATH]

KneadData

options:


-h, --help show this help message and exit
-v, --verbose additional output is printed
--version show program's version number and exit
-i INPUT, --input INPUT input FASTQ file (add a second argument instance to run with paired input files)
-o OUTPUT_DIR, --output OUTPUT_DIR directory to write output files
--db REFERENCE_DB, --reference-db REFERENCE_DB location of reference database
--run-trim-repetitive Option to trim repetitive/overrepresented sequences generated by FASTQC reports
--bypass-trim bypass the trim step
--output-prefix OUTPUT_PREFIX prefix for all output files [ DEFAULT : $SAMPLE_kneaddata ]
-t <1>, --threads <1> number of threads [ Default : 1 ]
-p <1>, --processes <1> number of processes [ Default : 1 ]
-q <quality>, --quality-scores <quality> quality scores [phred33|phred64] [DEFAULT: phred33]
--run-bmtagger run BMTagger instead of Bowtie2 to identify contaminant reads
--bypass-trf option to bypass the removal of tandem repeats
--run-fastqc-start run fastqc at the beginning of the workflow
--run-fastqc-end run fastqc at the end of the workflow
--store-temp-output store temp output files [ DEFAULT : temp output files are removed ]
--cat-final-output concatenate all final output files [ DEFAULT : final output is not concatenated ]
--log-level <DEBUG|INFO|WARNING|ERROR|CRITICAL> level of log messages [DEFAULT: DEBUG]
--log LOG log file [ DEFAULT : $OUTPUT_DIR/$SAMPLE_kneaddata.log ]
--trimmomatic TRIMMOMATIC_PATH path to trimmomatic [ DEFAULT : $PATH ]
--max-memory MAX_MEMORY max amount of memory [ DEFAULT : 500m ]
--trimmomatic-options TRIMMOMATIC_OPTIONS options for trimmomatic [ DEFAULT : SLIDINGWINDOW:4:20 MINLEN:50 ]
MINLEN is set to 50 percent of total input read length. The user can alternatively specify a length (in bases) for MINLEN.
--sequencer-source options for sequencer-source [ DEFAULT: NexteraPE] Available sequencers: ["NexteraPE","TruSeq2","TruSeq3"]
--bowtie2 BOWTIE2_PATH path to bowtie2 [ DEFAULT : $PATH ]
--bowtie2-options BOWTIE2_OPTIONS options for bowtie2 [ DEFAULT : --very-sensitive ]
--bmtagger BMTAGGER_PATH path to BMTagger [ DEFAULT : $PATH ]
--bypass-trf bypass the TRF step
--trf TRF_PATH path to TRF [ DEFAULT : $PATH ]
--mismatch MISMATCH mismatching penalty [ DEFAULT : 7 ]
--delta DELTA indel penalty [ DEFAULT : 7 ]
--pm PM match probability [ DEFAULT : 80 ]
--pi PI indel probability [ DEFAULT : 10 ]
--minscore MINSCORE minimum alignment score to report [ DEFAULT : 50 ]
--maxperiod MAXPERIOD maximum period size to report [ DEFAULT : 500 ]
--fastqc FASTQC_PATH path to fastqc [ DEFAULT : $PATH ]

Contributor

A better help section would explain what KneadData is, what it does, which inputs it requires and which outputs it produces, perhaps with an explanation of a few important parameters.
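
As a starting point, the help section could be restructured along these lines. The description is adapted from the .shed.yml text in this PR and the defaults are copied from the usage output quoted above. The input and output lists are assumptions to be checked against what the wrapper actually exposes.

```xml
<help><![CDATA[
**What it does**

KneadData performs quality control on metagenomic and metatranscriptomic
sequencing data, especially data from microbiome experiments. It performs
adapter trimming, quality filtering, and removal of host contamination
using Bowtie2, Trimmomatic and TRF.

**Inputs**

- Single-end or paired-end reads in FASTQ format
- A reference database of contaminant sequences (optional)

**Outputs**

- Cleaned FASTQ reads
- A log file

**Key parameters**

- *Trimmomatic options*: defaults to ``SLIDINGWINDOW:4:20 MINLEN:50``
- *Bowtie2 options*: defaults to ``--very-sensitive``
]]></help>
```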
