- `--adapter_sequence`: Adapter for read 1. It disables auto-detection for SE reads.
- `--adapter_sequence_r2`: Adapter for read 2 (for PE data). For PE data, the specified adapter sequences are used only when auto-detection fails.
- `--adapter_fasta`: FASTA file of adapter sequences. They are used after trimming adapters that are either auto-detected or specified with `--adapter_sequence` or `--adapter_sequence_r2`.

### Low complexity filtering (*Disabled* by default)
You can provide additional parameters in a text file. The file should contain one parameter per line, and each line should start with the parameter name followed by its value. Parameters in this file override the GUI settings.
Check the [fastp](https://github.com/OpenGene/fastp) and [fastplong](https://github.com/OpenGene/fastplong) GitHub repositories for the parameter list.
Please use long options (e.g., `--disable_quality_filtering`) instead of short options (e.g., `-Q`).
For example:
```
--disable_quality_filtering
--qualified_quality_phred 20
--unqualified_percent_limit 30
```
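
A parameter file in this layout could be generated with a short script. The following is a minimal sketch, not part of Metabuli App: the helper names `format_param_lines` and `write_param_file` are hypothetical, and the only rules it encodes are the ones stated above (one parameter per line, name first, long options only).

```python
from pathlib import Path


def format_param_lines(params):
    """Return one line per parameter: the name, then its value if it has one.

    `params` maps a long option name to a value; use None for flags
    that take no value. (Hypothetical helper, for illustration only.)
    """
    lines = []
    for name, value in params.items():
        if not name.startswith("--"):
            # The text above asks for long options rather than short ones.
            raise ValueError(f"use a long option, got {name!r}")
        lines.append(name if value is None else f"{name} {value}")
    return lines


def write_param_file(path, params):
    """Write the formatted parameters to `path`, one per line."""
    Path(path).write_text("\n".join(format_param_lines(params)) + "\n")


# Same parameters as the example above.
example = {
    "--disable_quality_filtering": None,
    "--qualified_quality_phred": 20,
    "--unqualified_percent_limit": 30,
}
print("\n".join(format_param_lines(example)))
```

Calling `write_param_file("qc_params.txt", example)` would save the same three lines to a file (the file name here is arbitrary).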
# Classification
Metabuli App provides two taxonomic profiling modes in the **Search Settings** panel: **New Search** and **Upload Report**.
#### You can perform taxonomic classification on one or more samples using a specified database.
### Required Fields:
1. **Mode:** Select the analysis mode among single-end, paired-end, or long-read.
2. **Enable Quality Control:** Check it to enable quality control for the input reads.
   - `fastp` and `fastplong` are used for short and long reads, respectively.
   - Please see QC documentation for more details.
3. **Job ID:** Enter a unique identifier for the job.
4. **Select Files:** Upload the necessary files and directories.
   - Read 1 File (and Read 2 File if Paired-end is selected)
     - FASTA/FASTQ and their gzipped versions are supported.
     - `ADD ENTRY` to upload **multiple samples** to process using the same settings.
   - Database Directory
   - Output Directory
     - Result files are saved in `Job ID` directory under the specified output directory.
     - When **multiple samples** are processed, results are saved in `Job ID/sample_name` directories.
5. **Max RAM:** Specify the maximum RAM (in GiB) to allocate for the job.
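
The output layout described above can be sketched in a few lines, which may help when locating results from a script. This is an illustration only: `result_dir` is a hypothetical helper, and how the app derives `sample_name` from the input files is not specified here, so it is passed in explicitly.

```python
from pathlib import PurePosixPath


def result_dir(output_dir, job_id, sample_name=None):
    """Return the directory where results are expected for a job.

    Single sample:    <output_dir>/<job_id>
    Multiple samples: <output_dir>/<job_id>/<sample_name>
    (Hypothetical helper; sample_name must be supplied by the caller.)
    """
    base = PurePosixPath(output_dir) / job_id
    return base if sample_name is None else base / sample_name


print(result_dir("/data/out", "job1"))
print(result_dir("/data/out", "job1", "sampleA"))
```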

### Advanced Settings (Optional):
- **Threads:** Specify thread count for the job.
@@ -111,6 +228,27 @@ Metabuli App provides two taxonomic profiling modes in **Search Settings** panel
- **Sankey Diagram**: A flow diagram representing the lineage information of the displayed taxa.
- **Krona Chart**: A hierarchical interactive chart that visualizes classification results.

### Generated Result Files:
#### 1. JobID_classifications.tsv: It contains the classification results for each read. The columns are as follows.
1. `is_classified`: Classified or not
2. `name`: Read ID
3. `taxID`: Taxonomy ID in the taxonomy dump files used in database creation
4. `query_length`: Effective read length
5. `score`: DNA level identity score
6. `rank`: Taxonomic rank of the taxon
7. `taxID:match_count`: List of "taxID : k-mer match count"
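
The seven columns above could be read into typed records as follows. This is a sketch under assumptions not confirmed by the text: the file is tab-separated with no header line, `is_classified` is encoded as `1` or `0`, and the last column holds whitespace-separated `taxID:count` pairs. The sample rows are made up for illustration.

```python
import csv
import io
from typing import NamedTuple


class Classification(NamedTuple):
    is_classified: bool
    name: str
    tax_id: int
    query_length: int
    score: float
    rank: str
    match_counts: dict  # taxID -> k-mer match count


def parse_match_counts(field):
    """Parse 'taxID:count' pairs (assumed to be whitespace-separated)."""
    counts = {}
    for item in field.split():
        tax_id, _, count = item.partition(":")
        counts[int(tax_id)] = int(count)
    return counts


def parse_classifications(text):
    """Parse tab-separated rows with the seven columns listed above."""
    rows = []
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        if len(fields) < 7:
            continue  # skip blank or incomplete lines
        rows.append(Classification(
            is_classified=fields[0] == "1",  # assumed 1/0 encoding
            name=fields[1],
            tax_id=int(fields[2]),
            query_length=int(fields[3]),
            score=float(fields[4]),
            rank=fields[5],
            match_counts=parse_match_counts(fields[6]),
        ))
    return rows


# Made-up rows, only to exercise the parser.
sample = (
    "1\tread_1\t562\t150\t0.93\tspecies\t562:12 561:3\n"
    "0\tread_2\t0\t150\t0\tno rank\t\n"
)
for row in parse_classifications(sample):
    print(row.name, row.is_classified, row.match_counts)
```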

#### 2. JobID_report.tsv: It follows Kraken2's report format. The first line is a header, and the rest of the lines are tab-separated values. The columns are as follows.

1. `clade_proportion`: Percentage of reads classified to the clade rooted at this taxon
2. `clade_count`: Number of reads classified to the clade rooted at this taxon
3. `taxon_count`: Number of reads classified directly to this taxon
4. `rank`: Taxonomic rank of the taxon
5. `taxID`: Tax ID according to the taxonomy dump files used in the database creation
6. `name`: Taxonomic name of the taxon
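
As an illustration of working with this report, the sketch below skips the header line, converts the six columns to typed values, and lists taxa at a chosen rank by clade count. The helper names, the exact header text, and the sample rows are assumptions made for the example. Names are stripped because Kraken2-style reports indent them to show depth.

```python
import csv
import io

COLUMNS = ["clade_proportion", "clade_count", "taxon_count", "rank", "taxID", "name"]


def read_report(text):
    """Parse a report: one header line, then six tab-separated columns per row."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    next(reader, None)  # skip the header line
    records = []
    for fields in reader:
        if len(fields) != len(COLUMNS):
            continue  # skip blank or malformed lines
        rec = dict(zip(COLUMNS, fields))
        records.append({
            "clade_proportion": float(rec["clade_proportion"]),
            "clade_count": int(rec["clade_count"]),
            "taxon_count": int(rec["taxon_count"]),
            "rank": rec["rank"],
            "taxID": int(rec["taxID"]),
            "name": rec["name"].strip(),  # drop indentation used to show depth
        })
    return records


def taxa_at_rank(records, rank):
    """Return records at the given rank, largest clade first."""
    matching = [r for r in records if r["rank"] == rank]
    return sorted(matching, key=lambda r: r["clade_count"], reverse=True)


# Made-up report; the header text is an assumption.
sample = (
    "#clade_proportion\tclade_count\ttaxon_count\trank\ttaxID\tname\n"
    "100.0000\t10\t0\tno rank\t1\troot\n"
    "60.0000\t6\t6\tspecies\t562\t      Escherichia coli\n"
    "40.0000\t4\t4\tspecies\t1280\t      Staphylococcus aureus\n"
)
for rec in taxa_at_rank(read_report(sample), "species"):
    print(rec["name"], rec["clade_count"])
```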

#### 3. JobID_krona.html: It contains an interactive Krona plot. You can open `JobID_krona.html` in any modern web browser.

## Upload Report
To visualize results from a previously completed job:
@@ -123,6 +261,11 @@ To visualize results from a previously completed job:

---

# Database Curation

## Download Database
You can download pre-built databases [here](https://metabuli.steineggerlab.workers.dev/).

## Create New Database
You can create a new database in the "NEW DATABASE" tab by providing these three files:
1. **FASTA files**: Each sequence must have a unique `>accession.version` or `>accession` header (e.g., `>CP001849.1` or `>CP001849`).