Skip to content

Commit c0099d4

Browse files
committed
Update usage
1 parent fdd85ad commit c0099d4

File tree

1 file changed

+4
-0
lines changed

1 file changed

+4
-0
lines changed

docs/usage.md

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -296,6 +296,10 @@ Notes:
296296

297297
By default, the input GTF file will be filtered to ensure that sequence names correspond to those in the genome fasta file, and to remove rows with empty transcript identifiers. Filtering can be bypassed completely where you are confident it is not necessary, using the `--skip_gtf_filter` parameter. If you just want to skip the 'transcript_id' checking component of the GTF filtering script used in the pipeline this can be disabled specifically using the `--skip_gtf_transcript_filter` parameter.
298298

299+
## Contamination screening options
300+
301+
The pipeline provides the option to scan unaligned reads for contamination from other species by using [Kraken2](https://ccb.jhu.edu/software/kraken2/) with or without corrections from [Bracken](https://ccb.jhu.edu/software/bracken/). As Bracken is not a particularly expensive algorithm, we recommend using it to correct the abundance estimations from Kraken. An important note is that Kraken2 is [sensitive to the database](https://doi.org/10.1099/mgen.0.000949) that it is used with. It is [particularly important](https://doi.org/10.1128/mbio.01607-23) to ensure that the host genome is part of the database, and if you are particularly concerned about specific contaminants, it may be worthwhile to use a smaller database that contains primarily those contaminants rather than the full standard database. Various pre-built databases can be found [here](https://benlangmead.github.io/aws-indexes/k2) and instructions for building a custom database can be found at the [Kraken2 documentation](https://github.com/DerrickWood/kraken2/blob/master/docs/MANUAL.markdown). Genomes of contaminants detected in previous sequencing experiments can be found on the [OpenContami website](https://openlooper.hgc.jp/opencontami/help/help_oct.php). Additionally, note that while one of the primary strengths of Kraken2 is that it can detect loaw abundance contaminants in a sample, false positives can also occur. If a very low number of reads of some contaminating species is detected, these results should be treated with caution.
302+
299303
## Running the pipeline
300304

301305
The typical command for running the pipeline is as follows:

0 commit comments

Comments
 (0)