quadram-institute-bioscience/hostzap

hostzap

A Nextflow pipeline for host read removal from paired-end sequencing data. Given a samplesheet of FASTQ files, hostzap runs one or more host depletion tools and outputs the cleaned reads alongside summary statistics.

Tools

Tool      Method
Kraken2   k-mer classification
BBMap     Sequence alignment
Hostile   Targeted host removal
Deacon    Host sequence depletion

Each tool can be skipped individually with --skip_kraken2, --skip_bbmap, --skip_hostile, or --skip_deacon.

Usage

nextflow run main.nf \
  --input samplesheet.csv \
  --outdir results \
  --kraken2_db /path/to/kraken2_db \
  --hostile_index human-t2t-hla
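
As a rough sketch of how the skip options combine with the command above, the snippet below builds the flags for a run that uses only Kraken2 and Hostile. It assumes the --skip_* options are plain switches that take no value, and that the remaining parameters are passed as in the basic invocation; neither assumption is confirmed by the pipeline's documentation here. The command is printed rather than run, so it can be checked first.

```shell
# Tools to leave out of this run (any of: kraken2 bbmap hostile deacon).
skip_flags=""
for tool in bbmap deacon; do
  # Assumption: each skip option is a bare switch named --skip_<tool>.
  skip_flags="$skip_flags --skip_$tool"
done

# Print the command for review; remove "echo" to launch the pipeline.
# $skip_flags is left unquoted on purpose so it splits into separate flags.
echo nextflow run main.nf \
  --input samplesheet.csv \
  --outdir results \
  --kraken2_db /path/to/kraken2_db \
  --hostile_index human-t2t-hla \
  $skip_flags
```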

The samplesheet must be a CSV with three columns:

sample-id,forward-absolute-filepath,reverse-absolute-filepath
sample1,/data/sample1_R1.fastq.gz,/data/sample1_R2.fastq.gz
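
For a directory holding many samples, the samplesheet could be generated rather than written by hand. The function below is a minimal sketch, not part of hostzap: make_samplesheet is a hypothetical helper, and it assumes paired files are named <sample>_R1.fastq.gz and <sample>_R2.fastq.gz, as in the example row above. Files without a matching R2 mate are skipped with a warning.

```shell
# Hypothetical helper (not shipped with hostzap): print a samplesheet for
# every *_R1.fastq.gz file in a directory that has a matching *_R2.fastq.gz.
make_samplesheet() {
  # Resolve the directory to an absolute path, as the column names require.
  reads_dir=$(cd "$1" && pwd) || return 1
  echo "sample-id,forward-absolute-filepath,reverse-absolute-filepath"
  for r1 in "$reads_dir"/*_R1.fastq.gz; do
    [ -e "$r1" ] || continue                 # the glob matched nothing
    r2="${r1%_R1.fastq.gz}_R2.fastq.gz"      # assumed mate naming scheme
    if [ ! -e "$r2" ]; then
      echo "warning: no R2 mate for $r1, skipping" >&2
      continue
    fi
    sample=$(basename "$r1" _R1.fastq.gz)    # sample ID = name before _R1
    echo "$sample,$r1,$r2"
  done
}
```

It would be called as make_samplesheet /data > samplesheet.csv, with the resulting file passed to --input.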

Requirements

  • Nextflow (DSL2)
  • Docker or Singularity (Conda is untested and not recommended)
