@@ -26,8 +26,9 @@ nextflow run main.nf -profile (standard/esrum/ngc),test -stub
 nextflow run main.nf -profile (standard/esrum),test
 ```
 
-### 5. Run pipeline with input data (see test data for file contents):
-
+### 5. Run pipeline with input data:
+The pipeline has multiple optional parameters, defined in ```nextflow.config```.
+Parameters can be supplied in a ```config.json``` file and passed with ```nextflow run main.nf -profile (standard/esrum) -params-file config.json```, or given directly on the command line:
 ```Bash
 nextflow run main.nf \
     -profile (standard/esrum/ngc) \
@@ -37,6 +38,71 @@ nextflow run main.nf \
     --bedfile <path to bedfile with target regions> \
     --ploidy <integer>
 ```
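As a rough illustration of the ```-params-file``` route, the sketch below writes a ```config.json``` and builds the matching command. It assumes that JSON keys map one-to-one onto the long-form flags, as Nextflow params files normally do, and it only uses the two parameters visible in the command above (```bedfile```, ```ploidy```). The helper names and the values are hypothetical, and a real run will likely need further parameters that are not shown here.

```python
import json
import tempfile
from pathlib import Path


def write_params_file(params, path):
    """Write pipeline parameters as JSON for use with `-params-file`."""
    path = Path(path)
    path.write_text(json.dumps(params, indent=2) + "\n")
    return path


def nextflow_command(profile, params_file):
    """Build the argument list for a run that reads its parameters from a file."""
    return [
        "nextflow", "run", "main.nf",
        "-profile", profile,
        "-params-file", str(params_file),
    ]


# Only the two parameters visible in the command above; the values are illustrative.
params = {
    "bedfile": "assets/data/test_data/target_calling.bed",
    "ploidy": 2,
}

workdir = Path(tempfile.mkdtemp())  # scratch directory so nothing is overwritten
params_file = write_params_file(params, workdir / "config.json")
command = nextflow_command("standard", params_file)
print(" ".join(command))
```

Keeping the parameters in a file makes a run easier to repeat than a long command line, at the cost of one more file to track.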
+
+The ```pooltable.tsv``` should map (user-assigned) pool IDs to input FASTQ files, with one entry per pool.
+The ```decodetable.tsv``` should map (user-assigned) individual IDs in the matrix to the corresponding row and column pool IDs, with one entry per element in the matrix.
+```Bash
+individual1 pool_row_1 pool_column_1
+```
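A decode table in this shape can be sanity-checked before a run. The sketch below is not part of the pipeline: it assumes a tab-separated file without a header and with exactly three columns (individual, row pool, column pool), as the example line suggests, and the function name ```read_decodetable``` is made up for illustration.

```python
import csv
import io


def read_decodetable(handle):
    """Return {individual_id: (row_pool_id, column_pool_id)} from a decode table."""
    decode = {}
    seen_cells = {}
    rows = csv.reader(handle, delimiter="\t")
    for line_number, fields in enumerate(rows, start=1):
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3:
            raise ValueError(f"line {line_number}: expected 3 columns, got {len(fields)}")
        individual, row_pool, column_pool = fields
        if individual in decode:
            raise ValueError(f"line {line_number}: duplicate individual {individual!r}")
        cell = (row_pool, column_pool)
        if cell in seen_cells:
            # One entry per matrix element: two individuals cannot share a cell.
            raise ValueError(f"line {line_number}: cell {cell} already used by {seen_cells[cell]!r}")
        decode[individual] = cell
        seen_cells[cell] = individual
    return decode


# Assumed layout: individual, row pool, column pool (tab-separated, no header).
example = (
    "individual1\tpool_row_1\tpool_column_1\n"
    "individual2\tpool_row_1\tpool_column_2\n"
)
decode = read_decodetable(io.StringIO(example))
print(decode["individual2"])
```

Rejecting duplicate individuals and duplicate cells up front follows from the "one entry per element" rule; whether the pipeline itself enforces this is not stated in the README.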
+### 6. Pipeline output
+The workflow will output a results folder containing multiple config-dependent output files:
+```Bash
+results
+├── pinpointables.vcf          # Merged VCF file containing all assigned variants
+├── cram/                      # CRAM files for each pool
+├── logs/                      # Log files for each process
+├── variants/                  # VCF files for each pool
+├── variant_tables/            # TSV files converted from pool VCFs
+└── pinpoint_variants/
+    ├── all_pins/              # All pinpointables for each sample in individual VCFs (*note)
+    ├── unique_pins/           # All unique pinpointables for each sample in individual VCFs (*note)
+    ├── *_merged.vcf.gz        # All pinpointables for all samples in a single VCF without sample information
+    ├── summary.tsv            # Variant counts for each sample
+    └── lookup.tsv             # Variant to sample lookup table
+```
+A central file is ```pinpointables.vcf```. This file contains all individually assigned variants. Since each variant carries information from two pools, these are presented as the sample columns ROW and COLUMN.
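To make the ROW and COLUMN layout concrete, here is a minimal sketch that reads the two sample columns from VCF text using only the standard column order (eight fixed columns, then FORMAT, then samples). The toy record, its ```GT:AD``` FORMAT keys, and the function name are assumptions for illustration; the fields actually present in ```pinpointables.vcf``` may differ.

```python
def read_row_column_calls(lines):
    """Parse VCF text lines into (sample_names, records) with per-sample fields."""
    samples = None
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        if samples is None:
            raise ValueError("data line found before the #CHROM header")
        keys = fields[8].split(":")  # FORMAT keys, e.g. GT:AD
        per_sample = {
            name: dict(zip(keys, value.split(":")))
            for name, value in zip(samples, fields[9:])
        }
        records.append((fields[0], int(fields[1]), fields[3], fields[4], per_sample))
    return samples, records


# Toy record; the FORMAT keys and values are assumed, not taken from real output.
toy_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tROW\tCOLUMN",
    "chr1\t100\t.\tA\tG\t.\tPASS\t.\tGT:AD\t0/1:30,5\t0/1:28,6",
]
samples, records = read_row_column_calls(toy_vcf)
chrom, pos, ref, alt, calls = records[0]
print(samples, calls["ROW"]["GT"], calls["COLUMN"]["GT"])
```

For real data, an established VCF parser is a safer choice than hand-splitting lines; the point here is only that each record carries one call from the row pool and one from the column pool.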
+
+# Workflow repository contents:
+
+```Bash
+DoBSeqWF
+├── LICENSE
+├── VERSION
+├── README.md
+├── assets
+│   ├── data
+│   │   ├── reference_genomes
+│   │   │   └── small
+│   │   │       └── small_reference.*
+│   │   └── test_data
+│   │       ├── coordtable.tsv
+│   │       ├── decodetable.tsv
+│   │       ├── pools
+│   │       │   └── *.fq.gz
+│   │       ├── pooltable.tsv
+│   │       ├── snvlist.tsv
+│   │       └── target_calling.bed
+│   └── helper_scripts
+│       └── simulator.py        # Script for simulating minimal pipeline data
+├── bin                         # Executable pipeline scripts
+│   └── <script>.*
+├── conf
+│   └── profiles.config         # Configuration profiles for compute environments
+├── next.pbs                    # Helper script for running on NGC-HPC
+└── nextflow.config             # Workflow parameters
+```
+
 
 ## Usage on NGC-HPC
 
@@ -165,42 +231,3 @@ tail nextflow.log
 
 If the pipeline fails, it is likely due to resource constraints. Adjust the resources as needed in the conf/profiles.config file under NGC, and rerun the PBS script. Be aware that any direct edits to the workflow scripts, i.e. modules and subworkflows, can lead to a complete re-run of the pipeline.
 
-
-# Workflow repository contents:
-
-```Bash
-DoBSeqWF
-├── LICENSE
-├── VERSION
-├── README.md
-├── assets
-│   ├── data
-│   │   ├── reference_genomes
-│   │   │   └── small
-│   │   │       └── small_reference.*
-│   │   └── test_data
-│   │       ├── coordtable.tsv
-│   │       ├── decodetable.tsv
-│   │       ├── pools
-│   │       │   └── *.fq.gz
-│   │       ├── pooltable.tsv
-│   │       ├── snvlist.tsv
-│   │       └── target_calling.bed
-│   └── helper_scripts
-│       └── simulator.py        # Script for simulating minimal pipeline data
-├── bin                         # Executable pipeline scripts
-│   └── <script>.*
-├── conf
-│   └── profiles.config         # Configuration profiles for compute environments