TRON-Bioinformatics/DEEPctMUT

DEEPctMUT

DEEPctMUT is a Nextflow pipeline for highly accurate mutation detection in circulating tumor DNA (ctDNA) from plasma and, optionally, matched PBMC samples.

DEEPctMUT takes as input FASTQ files from paired-end sequencing with unique molecular barcodes (UMIs) of plasma samples and, optionally, matched PBMC samples. After candidate calling and multiple steps of error polishing, DEEPctMUT outputs a VCF file with somatic mutations in ctDNA.

Dependencies

  • nextflow >= 24.10.4
  • miniconda >= 24.3.0

How to run

Test case

To test the pipeline on the provided toy dataset, download or git clone the repository and run the following command:

nextflow main.nf -profile conda,test --output_dir OUTPUT_DIR

General case

To run DEEPctMUT on your own data, you need the hg19 reference genome indexed for BWA, as well as its dictionary file. You also need to prepare a single input table (passed as --input_table) as a CSV file with a header row, formatted as follows:

| patient | sample | Fastq1 | Fastq2 | type | replicate |
| --- | --- | --- | --- | --- | --- |
| Patient_1 | Patient_1_plasma | /path/to/read1.fastq | /path/to/read2.fastq | Plasma | 1 |
| Patient_1 | Patient_1_pbmc | /path/to/read1.fastq | /path/to/read2.fastq | PBMC | 1 |
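The input table can also be generated programmatically. The following is a minimal sketch, assuming the file is comma-separated and uses exactly the column names shown in the example header; the helper `write_input_table` and the example paths are hypothetical and not part of DEEPctMUT.

```python
import csv
import io

# Column names copied from the example header above.
COLUMNS = ["patient", "sample", "Fastq1", "Fastq2", "type", "replicate"]


def write_input_table(rows, handle):
    """Write sample rows (dicts keyed by COLUMNS) as comma-separated text with a header.

    Assumption: the pipeline expects a comma delimiter; adjust if it does not.
    """
    writer = csv.DictWriter(handle, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        missing = [column for column in COLUMNS if column not in row]
        if missing:
            raise ValueError(f"row is missing columns: {missing}")
        writer.writerow(row)


# Placeholder rows mirroring the example table; replace the paths with real FASTQ files.
rows = [
    {"patient": "Patient_1", "sample": "Patient_1_plasma",
     "Fastq1": "/path/to/read1.fastq", "Fastq2": "/path/to/read2.fastq",
     "type": "Plasma", "replicate": 1},
    {"patient": "Patient_1", "sample": "Patient_1_pbmc",
     "Fastq1": "/path/to/read1.fastq", "Fastq2": "/path/to/read2.fastq",
     "type": "PBMC", "replicate": 1},
]

buffer = io.StringIO()
write_input_table(rows, buffer)
table_text = buffer.getvalue()
print(table_text)
```

To write the table to disk instead, pass a file opened with `open("input_table.csv", "w", newline="")` as the handle.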

Then run the following command:

nextflow main.nf -profile conda --input_table INPUT_TABLE --reference REFERENCE --output_dir OUTPUT_DIR

Arguments

The pipeline accepts the following command-line arguments:

| Argument | Description | Required/Default |
| --- | --- | --- |
| --input_table | CSV file with sample information. | Required unless -profile test |
| --output_dir | Output directory for results. | Required |
| --reference | hg19 reference genome FASTA file. The BWA index and dictionary files must be in the same directory. | Required |
| --bed_file | BED file with target regions. | Default: test_data/CRC_panel.bed |
| --with_pbmc | Whether to process PBMC samples for background filtering. | Default: true |
| --RF_threshold | Random Forest model threshold for variant calling. | Default: 0.21 |
| --DeepES_threshold | DeepES model threshold to call a mutation significant. | Default: 0.01 |
| --non_hotspot_reads | Minimum number of reads supporting a mutation in non-hotspot regions. | Default: 30 |
| --min_vaf | Minimum variant allele frequency (VAF) to call a mutation. | Default: 0.0003 |
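When several optional arguments are overridden, it can help to assemble the command line in a script. The sketch below only builds and prints the command string from the arguments listed above and does not launch Nextflow; the helper `build_command` and the example file names are hypothetical.

```python
import shlex


def build_command(input_table, reference, output_dir, **options):
    """Return the DEEPctMUT command line as a shell-quoted string.

    `options` maps optional argument names (without the leading "--"),
    such as min_vaf or with_pbmc, to the values that should override the defaults.
    """
    args = [
        "nextflow", "main.nf", "-profile", "conda",
        "--input_table", input_table,
        "--reference", reference,
        "--output_dir", output_dir,
    ]
    for name, value in options.items():
        args += [f"--{name}", str(value)]
    return shlex.join(args)


# Example values are placeholders, not recommended settings.
command = build_command(
    "samples.csv", "/path/to/hg19.fa", "results",
    min_vaf=0.001, with_pbmc="false",
)
print(command)
```

Passing boolean options as the strings "true" or "false" avoids Python's capitalized `True`/`False` ending up on the command line.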

Output

For each patient, DEEPctMUT writes a VCF file with mutation calls to the directory given by --output_dir.
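As a quick check after a run, the number of calls per patient can be counted from the VCF files. This is a minimal sketch that relies only on the VCF convention that header lines start with "#"; it assumes the files are uncompressed, end in `.vcf`, and sit directly in the output directory, none of which is confirmed here. The demo VCF content is made up for illustration.

```python
import tempfile
from pathlib import Path


def count_vcf_records(vcf_path):
    """Count data lines in a VCF file, skipping header lines that start with '#'."""
    count = 0
    with open(vcf_path) as handle:
        for line in handle:
            if line.strip() and not line.startswith("#"):
                count += 1
    return count


def summarize_output(output_dir):
    """Map each *.vcf file name in output_dir (assumed layout) to its record count."""
    return {
        path.name: count_vcf_records(path)
        for path in sorted(Path(output_dir).glob("*.vcf"))
    }


# Demo on a tiny, made-up VCF written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "Patient_1.vcf"
    demo.write_text(
        "##fileformat=VCFv4.2\n"
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
        "chr1\t100\t.\tA\tT\t.\tPASS\t.\n"
    )
    summary = summarize_output(tmp)

print(summary)
```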

About

Tumor-naïve ctDNA mutation detection pipeline
