|
10 | 10 |
|
11 | 11 | ## Samplesheet input |
12 | 12 |
|
13 | | -You will need to create a samplesheet with information about the samples you would like to analyse before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row as shown in the examples below. |
| 13 | +You will need to create a samplesheet with information about the samples you would like to analyse before running the pipeline. Use this parameter to specify its location. |
14 | 14 |
|
15 | 15 | ```bash |
16 | 16 | --input '[path to samplesheet file]' |
17 | 17 | ``` |
18 | 18 |
|
19 | 19 | ### Full samplesheet |
20 | 20 |
|
| 21 | +The following simple run dir structure... |
| 22 | + |
| 23 | +``` |
| 24 | +run_dir |
| 25 | +├── sample1_lane1_group1_r1.fq.gz |
| 26 | +├── sample2_lane1_group1_r1.fq.gz |
| 27 | +├── sample3_lane2_group2_r1.fq.gz |
| 28 | +└── sample4_lane2_group3_r1.fq.gz |
| 29 | +``` |
| 30 | + |
| 31 | +...would be represented in the following samplesheet (shown tab-separated for readability; the file itself is a `.csv`):
| 32 | + |
21 | 33 | ```csv title="samplesheet.csv" |
22 | | -sample,lane,group,fastq_1,fastq_2,rundir |
23 | | -CONTROL_REP1,1,,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz,200624_A00834_0183_BHMTFYDRXX |
24 | | -CONTROL_REP2,1,,AEG588A2_S2_L002_R1_001.fastq.gz,AEG588A2_S2_L002_R2_001.fastq.gz,200624_A00834_0183_BHMTFYDRXX |
25 | | -CONTROL_REP3,1,,AEG588A3_S3_L002_R1_001.fastq.gz,AEG588A3_S3_L002_R2_001.fastq.gz,200624_A00834_0183_BHMTFYDRXX |
26 | | -TREATMENT_REP1,2,GROUP1,AEG588A4_S4_L003_R1_001.fastq.gz,,200624_A00834_0183_BHMTFYDRXX |
27 | | -TREATMENT_REP2,2,GROUP1,AEG588A5_S5_L003_R1_001.fastq.gz,,200624_A00834_0183_BHMTFYDRXX |
28 | | -TREATMENT_REP3,2,GROUP2,AEG588A6_S6_L003_R1_001.fastq.gz,,200624_A00834_0183_BHMTFYDRXX |
29 | | -TREATMENT_REP3,2,GROUP2,AEG588A6_S6_L004_R1_001.fastq.gz,,200624_A00834_0183_BHMTFYDRXX |
| 34 | +sample lane group fastq_1 fastq_2 rundir |
| 35 | +sample1 1 group1 path/to/run_dir/sample1_lane1_group1_r1.fq.gz path/to/run_dir |
| 36 | +sample2 1 group1 path/to/run_dir/sample2_lane1_group1_r1.fq.gz path/to/run_dir |
| 37 | +sample3 2 group2 path/to/run_dir/sample3_lane2_group2_r1.fq.gz path/to/run_dir |
| 38 | +sample4 2 group3 path/to/run_dir/sample4_lane2_group3_r1.fq.gz path/to/run_dir |
| 39 | +
|
30 | 40 | ``` |
31 | 41 |
|
32 | 42 | | Column | Description | |
33 | 43 | | --------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | |
34 | 44 | | `sample` | Custom sample name. This entry will be identical for multiple sequencing libraries/runs from the same sample. Spaces in sample names are automatically converted to underscores (`_`). | |
35 | 45 | | `lane` | Lane where the sample was processed on an Illumina instrument (optional). | |
36 | 46 | | `group` | Group the sample belongs to, useful when several groups are pooled together (optional). | |
37 | | -| `rundir` | Path to the runfolder containing extra information about the sequencing run (optional) . | |
38 | 47 | | `fastq_1` | Full path to FastQ file for Illumina short reads 1. File has to be gzipped and have the extension ".fastq.gz" or ".fq.gz". | |
39 | 48 | | `fastq_2` | Full path to FastQ file for Illumina short reads 2. File has to be gzipped and have the extension ".fastq.gz" or ".fq.gz" (optional). | |
| 49 | +| `rundir` | Path to the runfolder containing extra information about the sequencing run (optional). | |
40 | 50 |
|
41 | | -An [example samplesheet](../assets/samplesheet.csv) has been provided with the pipeline. |
| 51 | +Another [example samplesheet](../assets/samplesheet.csv) has been provided with the pipeline. |
42 | 52 |
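As a rough illustration of the column layout described above, the sketch below builds a comma-separated samplesheet from the file names in the example run dir. It is not part of the pipeline. The helper name `build_samplesheet` is hypothetical, and the filename pattern `<sample>_lane<N>_<group>_r1.fq.gz` is an assumption taken only from the example listing. It handles single-end reads only and leaves `fastq_2` empty.

```python
import csv
import io
import re

# Assumed naming scheme, inferred only from the example run dir:
#   <sample>_lane<N>_<group>_r1.fq.gz
PATTERN = re.compile(
    r"^(?P<sample>[^_]+)_lane(?P<lane>\d+)_(?P<group>[^_]+)_r1\.fq\.gz$"
)

# Column order as shown in the samplesheet header.
COLUMNS = ["sample", "lane", "group", "fastq_1", "fastq_2", "rundir"]


def build_samplesheet(filenames, rundir):
    """Return samplesheet CSV text for single-end FastQ files in rundir.

    Hypothetical helper: files that do not match the assumed pattern
    are skipped rather than reported.
    """
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for name in sorted(filenames):
        match = PATTERN.match(name)
        if match is None:
            continue  # not a read-1 file in the assumed naming scheme
        writer.writerow(
            {
                "sample": match["sample"],
                "lane": match["lane"],
                "group": match["group"],
                "fastq_1": f"{rundir}/{name}",
                "fastq_2": "",  # single-end: optional column left empty
                "rundir": rundir,
            }
        )
    return out.getvalue()


if __name__ == "__main__":
    names = [
        "sample1_lane1_group1_r1.fq.gz",
        "sample2_lane1_group1_r1.fq.gz",
        "sample3_lane2_group2_r1.fq.gz",
        "sample4_lane2_group3_r1.fq.gz",
    ]
    print(build_samplesheet(names, "path/to/run_dir"), end="")
```

Redirecting the script's output to a file gives something that could be passed to `--input`, provided your file names really follow the assumed pattern. For paired-end data the `fastq_2` column would need to be filled in as well.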
|
43 | 53 | ## Running the pipeline |
44 | 54 |
|
|