Commit 9cb1d68

Merge pull request #13 from Aratz/multiqc_multireport
Generate reports per run, per project and per lane
2 parents e93baf9 + 02affeb commit 9cb1d68

22 files changed: 532 additions, 71 deletions

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -6,3 +6,5 @@ results/
 testing/
 testing*
 *.pyc
+.nf-test
+.nf-test.log

CHANGELOG.md

Lines changed: 2 additions & 0 deletions
@@ -9,6 +9,8 @@ Initial release of nf-core/seqinspector, created with the [nf-core](https://nf-c
 
 ### `Added`
 
+- [#13](https://github.com/nf-core/seqinspector/pull/13) Generate reports per run, per project and per lane.
+
 ### `Fixed`
 
 ### `Dependencies`

README.md

Lines changed: 4 additions & 11 deletions
@@ -39,26 +39,19 @@
 > [!NOTE]
 > If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/usage/installation) on how to set-up Nextflow. Make sure to [test your setup](https://nf-co.re/docs/usage/introduction#how-to-run-a-pipeline) with `-profile test` before running the workflow on actual data.
 
-<!-- TODO nf-core: Describe the minimum required steps to execute the pipeline, e.g. how to prepare samplesheets.
-     Explain what rows and columns represent. For instance (please edit as appropriate):
-
 First, prepare a samplesheet with your input data that looks as follows:
 
 `samplesheet.csv`:
 
 ```csv
-sample,fastq_1,fastq_2
-CONTROL_REP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz
+sample,lane,group,fastq_1,fastq_2,rundir
+CONTROL_REP1,1,GROUP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz,200624_A00834_0183_BHMTFYDRXX
 ```
 
 Each row represents a fastq file (single-end) or a pair of fastq files (paired end).
 
--->
-
 Now, you can run the pipeline using:
 
-<!-- TODO nf-core: update the following command to include all required parameters for a minimal example -->
-
 ```bash
 nextflow run nf-core/seqinspector \
    -profile <docker/singularity/.../institute> \
@@ -80,11 +73,11 @@ For more details about the output files and reports, please refer to the
 
 ## Credits
 
-nf-core/seqinspector was originally written by Adrien Coulier.
+nf-core/seqinspector was originally written by the Swedish [@NationalGenomicsInfrastructure](https://github.com/NationalGenomicsInfrastructure/).
 
 We thank the following people for their extensive assistance in the development of this pipeline:
 
-<!-- TODO nf-core: If applicable, make list of people who have also contributed -->
+- [@mahesh-panchal](https://github.com/mahesh-panchal)
 
 ## Contributions and Support
 

assets/samplesheet.csv

Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
-sample,lane,project,fastq_1,fastq_2,rundir
+sample,lane,group,fastq_1,fastq_2,rundir
 SAMPLE_PAIRED_END,1,P001,/path/to/fastq/files/AEG588A1_S1_L002_R1_001.fastq.gz,/path/to/fastq/files/AEG588A1_S1_L002_R2_001.fastq.gz,/path/to/rundir
 SAMPLE_SINGLE_END,2,P002,/path/to/fastq/files/AEG588A4_S4_L003_R1_001.fastq.gz,,/path/to/rundir

assets/schema_input.json

Lines changed: 4 additions & 4 deletions
@@ -19,11 +19,11 @@
                 "errorMessage": "Lane ID must be a number",
                 "meta": ["lane"]
             },
-            "project": {
+            "group": {
                 "type": "string",
                 "pattern": "^\\S+$",
-                "errorMessage": "Project ID cannot contain spaces",
-                "meta": ["project"]
+                "errorMessage": "Group ID cannot contain spaces",
+                "meta": ["group"]
             },
             "fastq_1": {
                 "type": "string",
@@ -47,7 +47,7 @@
                 "meta": ["rundir"]
             }
         },
-        "required": ["sample", "lane", "fastq_1"],
+        "required": ["sample", "fastq_1"],
         "dependentRequired": {
             "fastq_2": ["fastq_1"]
         }
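
The schema change above can be read as a small set of per-row rules: only `sample` and `fastq_1` are required, `group` must not contain whitespace, and `fastq_2` is only valid alongside `fastq_1`. The following is a minimal Python sketch of those rules, not the validation the pipeline actually runs (that is driven by the JSON schema). The function name `validate_row` and the "Missing required column" and "fastq_2 requires fastq_1" messages are hypothetical, and the digit check for `lane` is an assumption inferred from the "Lane ID must be a number" message.

```python
import re

# Columns listed under "required" after this change.
REQUIRED = ("sample", "fastq_1")
# Same intent as the schema pattern "^\\S+$": no whitespace anywhere.
NO_WHITESPACE = re.compile(r"\S+")


def validate_row(row):
    """Return a list of error messages for one samplesheet row (a dict)."""
    errors = []
    for column in REQUIRED:
        if not row.get(column):
            errors.append(f"Missing required column: {column}")

    # Assumption: "a number" means a plain run of digits.
    lane = row.get("lane")
    if lane and not str(lane).isdigit():
        errors.append("Lane ID must be a number")

    group = row.get("group")
    if group and not NO_WHITESPACE.fullmatch(group):
        errors.append("Group ID cannot contain spaces")

    # dependentRequired: fastq_2 may only be given together with fastq_1.
    if row.get("fastq_2") and not row.get("fastq_1"):
        errors.append("fastq_2 requires fastq_1")
    return errors


if __name__ == "__main__":
    print(validate_row({"sample": "s1", "fastq_1": "a.fq.gz", "group": "my group"}))
```

A row without `lane`, `group` or `rundir` passes, which matches the intent of dropping `lane` from the required list.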

conf/modules.config

Lines changed: 75 additions & 1 deletion
@@ -22,7 +22,7 @@ process {
         ext.args = '--quiet'
     }
 
-    withName: 'MULTIQC' {
+    withName: 'MULTIQC_GLOBAL' {
         ext.args = { params.multiqc_title ? "--title \"$params.multiqc_title\"" : '' }
         publishDir = [
             path: { "${params.outdir}/multiqc" },
@@ -31,4 +31,78 @@ process {
         ]
     }
 
+    withName: 'MULTIQC_PER_LANE' {
+        ext.args = { params.multiqc_title ? "--title \"$params.multiqc_title\"" : '' }
+        publishDir = [
+            path: { "${params.outdir}/multiqc/lanes" },
+            mode: params.publish_dir_mode,
+            saveAs: {
+                filename ->
+                switch (filename) {
+                    case 'versions.yml':
+                        null
+                        break
+                    case ~/\[LANE:\d+\]_multiqc_(report\.html|plots|data)/:
+                        def lane = (filename =~ /\[LANE:(\d+)\]_multiqc_(report\.html|plots|data)/)[0][1]
+                        def new_filename = filename.replaceFirst(
+                            "(?<prefix>.*)\\[LANE:${lane}\\]_(?<suffix>multiqc_(report\\.html|plots|data).*)",
+                            '${prefix}${suffix}')
+                        "L${lane}/${new_filename}"
+                        break
+                    default:
+                        filename
+                }
+            }
+        ]
+    }
+
+    withName: 'MULTIQC_PER_GROUP' {
+        ext.args = { params.multiqc_title ? "--title \"$params.multiqc_title\"" : '' }
+        publishDir = [
+            path: { "${params.outdir}/multiqc/groups" },
+            mode: params.publish_dir_mode,
+            saveAs: {
+                filename ->
+                switch (filename) {
+                    case 'versions.yml':
+                        null
+                        break
+                    case ~/\[GROUP:.+\]_multiqc_(report\.html|plots|data)/:
+                        def group = (filename =~ /\[GROUP:(.+)\]_multiqc_(report\.html|plots|data)/)[0][1]
+                        def new_filename = filename.replaceFirst(
+                            "(?<prefix>.*)\\[GROUP:${group}\\]_(?<suffix>multiqc_(report\\.html|plots|data).*)",
+                            '${prefix}${suffix}')
+                        "${group}/${new_filename}"
+                        break
+                    default:
+                        filename
+                }
+            }
+        ]
+    }
+
+    withName: 'MULTIQC_PER_RUNDIR' {
+        ext.args = { params.multiqc_title ? "--title \"$params.multiqc_title\"" : '' }
+        publishDir = [
+            path: { "${params.outdir}/multiqc/rundirs" },
+            mode: params.publish_dir_mode,
+            saveAs: {
+                filename ->
+                switch (filename) {
+                    case 'versions.yml':
+                        null
+                        break
+                    case ~/\[RUNDIR:.+\]_multiqc_(report\.html|plots|data)/:
+                        def rundir = (filename =~ /\[RUNDIR:(.+)\]_multiqc_(report\.html|plots|data)/)[0][1]
+                        def new_filename = filename.replaceFirst(
+                            "(?<prefix>.*)\\[RUNDIR:${rundir}\\]_(?<suffix>multiqc_(report\\.html|plots|data).*)",
+                            '${prefix}${suffix}')
+                        "${rundir}/${new_filename}"
+                        break
+                    default:
+                        filename
+                }
+            }
+        ]
+    }
 }
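
The three `saveAs` closures in this file share one idea: drop `versions.yml`, strip the `[LANE:…]`, `[GROUP:…]` or `[RUNDIR:…]` tag from a MultiQC output name, and publish the result under a subdirectory named after the tag value. The following is a rough Python sketch of that renaming, not the Groovy code the pipeline runs. It assumes the whole file name is the tagged name (for example `[LANE:1]_multiqc_report.html`), and the function name `publish_name` is hypothetical. Note that Python writes named groups as `(?P<name>...)`, whereas the Java regex in the config uses `(?<name>...)`.

```python
import re

# Tagged output names, e.g. "[LANE:1]_multiqc_report.html" or "[GROUP:P001]_multiqc_data".
TAGGED = re.compile(
    r"\[(?P<kind>LANE|GROUP|RUNDIR):(?P<value>.+)\]_"
    r"(?P<rest>multiqc_(?:report\.html|plots|data))"
)


def publish_name(filename):
    """Map a process output name to its published relative path, or None to skip it."""
    if filename == "versions.yml":
        return None  # not published, as in the config
    match = TAGGED.fullmatch(filename)
    if match is None:
        return filename  # untagged names are published unchanged
    kind, value, rest = match["kind"], match["value"], match["rest"]
    if kind == "LANE":
        if not value.isdigit():
            return filename  # the config only rewrites numeric lane tags
        return f"L{value}/{rest}"
    return f"{value}/{rest}"


if __name__ == "__main__":
    for name in ("versions.yml", "[LANE:1]_multiqc_report.html", "[GROUP:P001]_multiqc_data"):
        print(name, "->", publish_name(name))
```

Because the three closures differ only in the tag name and the `L` prefix for lanes, a single parameterised helper like this is one way the duplication could be reduced.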

docs/output.md

Lines changed: 23 additions & 2 deletions
@@ -6,8 +6,6 @@ This document describes the output produced by the pipeline. Most of the plots a
 
 The directories listed below will be created in the results directory after the pipeline has finished. All paths are relative to the top-level results directory.
 
-<!-- TODO nf-core: Write this documentation describing your workflow's output -->
-
 ## Pipeline overview
 
 The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes data using the following steps:
@@ -48,6 +46,29 @@ The FastQC plots displayed in the MultiQC report shows _untrimmed_ reads. They m
   - `multiqc_report.html`: a standalone HTML file that can be viewed in your web browser.
   - `multiqc_data/`: directory containing parsed statistics from the different tools used in the pipeline.
   - `multiqc_plots/`: directory containing static images from the report in various formats.
+  - `lanes/` [1]
+    - `L1/`
+      - `multiqc_report.html`: a standalone HTML file that can be viewed in your web browser.
+      - `multiqc_data/`: directory containing parsed statistics from the different tools used in the pipeline.
+      - `multiqc_plots/`: directory containing static images from the report in various formats.
+    - `L2/`
+      - ...
+  - `groups/` [1]
+    - `GROUPNAME1/`
+      - `multiqc_report.html`: a standalone HTML file that can be viewed in your web browser.
+      - `multiqc_data/`: directory containing parsed statistics from the different tools used in the pipeline.
+      - `multiqc_plots/`: directory containing static images from the report in various formats.
+    - `GROUPNAME2/`
+      - ...
+  - `rundirs/` [1]
+    - `RUNDIR1/`
+      - `multiqc_report.html`: a standalone HTML file that can be viewed in your web browser.
+      - `multiqc_data/`: directory containing parsed statistics from the different tools used in the pipeline.
+      - `multiqc_plots/`: directory containing static images from the report in various formats.
+    - `RUNDIR2/`
+      - ...
+
+[1] These directories are only generated if `lane`, `group` or `rundir` is specified for at least one sample.
 
 </details>
 

docs/usage.md

Lines changed: 21 additions & 23 deletions
@@ -10,47 +10,45 @@
 
 ## Samplesheet input
 
-You will need to create a samplesheet with information about the samples you would like to analyse before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row as shown in the examples below.
+You will need to create a samplesheet with information about the samples you would like to analyse before running the pipeline. Use this parameter to specify its location.
 
 ```bash
 --input '[path to samplesheet file]'
 ```
 
-### Multiple runs of the same sample
+### Full samplesheet
 
-The `sample` identifiers have to be the same when you have re-sequenced the same sample more than once e.g. to increase sequencing depth. The pipeline will concatenate the raw reads before performing any downstream analysis. Below is an example for the same sample sequenced across 3 lanes:
+The following simple run dir structure...
 
-```csv title="samplesheet.csv"
-sample,fastq_1,fastq_2
-CONTROL_REP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz
-CONTROL_REP1,AEG588A1_S1_L003_R1_001.fastq.gz,AEG588A1_S1_L003_R2_001.fastq.gz
-CONTROL_REP1,AEG588A1_S1_L004_R1_001.fastq.gz,AEG588A1_S1_L004_R2_001.fastq.gz
+```
+run_dir
+├── sample1_lane1_group1_r1.fq.gz
+├── sample2_lane1_group1_r1.fq.gz
+├── sample3_lane2_group2_r1.fq.gz
+└── sample4_lane2_group3_r1.fq.gz
 ```
 
-### Full samplesheet
-
-The pipeline will auto-detect whether a sample is single- or paired-end using the information provided in the samplesheet. The samplesheet can have as many columns as you desire, however, there is a strict requirement for the first 3 columns to match those defined in the table below.
-
-A final samplesheet file consisting of both single- and paired-end data may look something like the one below. This is for 6 samples, where `TREATMENT_REP3` has been sequenced twice.
+...would be represented in the following samplesheet (shown as .tsv for readability)
 
 ```csv title="samplesheet.csv"
-sample,fastq_1,fastq_2
-CONTROL_REP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz
-CONTROL_REP2,AEG588A2_S2_L002_R1_001.fastq.gz,AEG588A2_S2_L002_R2_001.fastq.gz
-CONTROL_REP3,AEG588A3_S3_L002_R1_001.fastq.gz,AEG588A3_S3_L002_R2_001.fastq.gz
-TREATMENT_REP1,AEG588A4_S4_L003_R1_001.fastq.gz,
-TREATMENT_REP2,AEG588A5_S5_L003_R1_001.fastq.gz,
-TREATMENT_REP3,AEG588A6_S6_L003_R1_001.fastq.gz,
-TREATMENT_REP3,AEG588A6_S6_L004_R1_001.fastq.gz,
+sample lane group fastq_1 fastq_2 rundir
+sample1 1 group1 path/to/run_dir/sample1_lane1_group1_r1.fq.gz path/to/run_dir
+sample2 1 group1 path/to/run_dir/sample2_lane1_group1_r1.fq.gz path/to/run_dir
+sample3 2 group2 path/to/run_dir/sample3_lane2_group2_r1.fq.gz path/to/run_dir
+sample4 2 group3 path/to/run_dir/sample4_lane2_group3_r1.fq.gz path/to/run_dir
+
 ```
 
 | Column | Description |
 | --------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
 | `sample` | Custom sample name. This entry will be identical for multiple sequencing libraries/runs from the same sample. Spaces in sample names are automatically converted to underscores (`_`). |
+| `lane` | Lane where the sample was processed on an Illumina instrument (optional). |
+| `group` | Group the sample belongs to, useful when several groups are pooled together (optional). |
 | `fastq_1` | Full path to FastQ file for Illumina short reads 1. File has to be gzipped and have the extension ".fastq.gz" or ".fq.gz". |
-| `fastq_2` | Full path to FastQ file for Illumina short reads 2. File has to be gzipped and have the extension ".fastq.gz" or ".fq.gz". |
+| `fastq_2` | Full path to FastQ file for Illumina short reads 2. File has to be gzipped and have the extension ".fastq.gz" or ".fq.gz" (optional). |
+| `rundir` | Path to the runfolder containing extra information about the sequencing run (optional). |
 
-An [example samplesheet](../assets/samplesheet.csv) has been provided with the pipeline.
+Another [example samplesheet](../assets/samplesheet.csv) has been provided with the pipeline.
 
 ## Running the pipeline
 
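
The mapping from the example run directory to the samplesheet above can be sketched in a few lines of Python. This is an illustration only: the `<sample>_lane<N>_<group>_r1.fq.gz` naming is taken from the example tree and is not a convention the pipeline enforces, and the helper names `samplesheet_rows` and `to_csv` are hypothetical. The sketch writes comma-separated output and leaves `fastq_2` empty, since the example contains single-end files only.

```python
import csv
import io
import re

COLUMNS = ["sample", "lane", "group", "fastq_1", "fastq_2", "rundir"]
# Assumed naming, taken from the example tree: <sample>_lane<N>_<group>_r1.fq.gz
NAME = re.compile(r"(?P<sample>[^_]+)_lane(?P<lane>\d+)_(?P<group>[^_]+)_r1\.fq\.gz")


def samplesheet_rows(run_dir, filenames):
    """Build one samplesheet row per file name that matches the assumed pattern."""
    rows = []
    for name in sorted(filenames):
        match = NAME.fullmatch(name)
        if match is None:
            continue  # ignore files that do not follow the assumed naming
        rows.append({
            "sample": match["sample"],
            "lane": match["lane"],
            "group": match["group"],
            "fastq_1": f"{run_dir}/{name}",
            "fastq_2": "",  # single-end in this example
            "rundir": run_dir,
        })
    return rows


def to_csv(rows):
    """Serialise rows as comma-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    files = [
        "sample1_lane1_group1_r1.fq.gz",
        "sample2_lane1_group1_r1.fq.gz",
        "sample3_lane2_group2_r1.fq.gz",
        "sample4_lane2_group3_r1.fq.gz",
    ]
    print(to_csv(samplesheet_rows("path/to/run_dir", files)), end="")
```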

main.nf

Lines changed: 5 additions & 2 deletions
@@ -58,7 +58,10 @@ workflow NFCORE_SEQINSPECTOR {
     )
 
     emit:
-    multiqc_report = SEQINSPECTOR.out.multiqc_report // channel: /path/to/multiqc_report.html
+    global_report = SEQINSPECTOR.out.global_report // channel: /path/to/multiqc_report.html
+    lane_reports = SEQINSPECTOR.out.lane_reports // channel: /path/to/multiqc_report.html
+    group_reports = SEQINSPECTOR.out.group_reports // channel: /path/to/multiqc_report.html
+    rundir_report = SEQINSPECTOR.out.rundir_reports // channel: /path/to/multiqc_report.html
 
 }
 /*
@@ -101,7 +104,7 @@ workflow {
         params.outdir,
         params.monochrome_logs,
         params.hook_url,
-        NFCORE_SEQINSPECTOR.out.multiqc_report
+        NFCORE_SEQINSPECTOR.out.global_report,
     )
 }
 

nf-test.config

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+config {
+
+    testsDir "tests"
+    workDir ".nf-test"
+    configFile "tests/nextflow.config"
+    profile "test,docker"
+
+}
