Commit 5bea1e5

Merge pull request #20 from Aratz/dev
Implement tagging system
2 parents 9cb1d68 + 1f33928 commit 5bea1e5

24 files changed: +232 -330 lines changed

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -9,6 +9,7 @@ Initial release of nf-core/seqinspector, created with the [nf-core](https://nf-c
 
 ### `Added`
 
+- [#20](https://github.com/nf-core/seqinspector/pull/20) Use tags to generate group reports
 - [#13](https://github.com/nf-core/seqinspector/pull/13) Generate reports per run, per project and per lane.
 
 ### `Fixed`

README.md

Lines changed: 2 additions & 2 deletions
@@ -44,8 +44,8 @@ First, prepare a samplesheet with your input data that looks as follows:
 `samplesheet.csv`:
 
 ```csv
-sample,lane,group,fastq_1,fastq_2,rundir
-CONTROL_REP1,1,GROUP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz,200624_A00834_0183_BHMTFYDRXX
+sample,fastq_1,fastq_2,rundir,tags
+CONTROL_REP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz,200624_A00834_0183_BHMTFYDRXX,lane1:project5:group2
 ```
 
 Each row represents a fastq file (single-end) or a pair of fastq files (paired end).

assets/samplesheet.csv

Lines changed: 7 additions & 3 deletions
@@ -1,3 +1,7 @@
-sample,lane,group,fastq_1,fastq_2,rundir
-SAMPLE_PAIRED_END,1,P001,/path/to/fastq/files/AEG588A1_S1_L002_R1_001.fastq.gz,/path/to/fastq/files/AEG588A1_S1_L002_R2_001.fastq.gz,/path/to/rundir
-SAMPLE_SINGLE_END,2,P002,/path/to/fastq/files/AEG588A4_S4_L003_R1_001.fastq.gz,,/path/to/rundir
+sample,fastq_1,fastq_2,rundir,tags
+SAMPLE_PAIRED_END,/path/to/fastq/files/AEG588A1_S1_L002_R1_001.fastq.gz,/path/to/fastq/files/AEG588A1_S1_L002_R2_001.fastq.gz,/path/to/rundir,paired_sample:lane1
+SAMPLE_PAIRED_END,/path/to/fastq/files/AEG588A2_S2_L002_R1_001.fastq.gz,/path/to/fastq/files/AEG588A2_S2_L002_R2_001.fastq.gz,/path/to/rundir,paired_sample:lane1
+SAMPLE_PAIRED_END,/path/to/fastq/files/AEG588A3_S3_L002_R1_001.fastq.gz,/path/to/fastq/files/AEG588A3_S3_L002_R2_001.fastq.gz,/path/to/rundir,paired_sample:lane2
+SAMPLE_SINGLE_END,/path/to/fastq/files/AEG588A4_S4_L003_R1_001.fastq.gz,,/path/to/rundir,group1
+SAMPLE_SINGLE_END,/path/to/fastq/files/AEG588A4_S4_L003_R1_001.fastq.gz,,/path/to/rundir,group2
+SAMPLE_SINGLE_END,/path/to/fastq/files/AEG588A4_S4_L003_R1_001.fastq.gz,,/path/to/rundir,group3

assets/schema_input.json

Lines changed: 6 additions & 12 deletions
@@ -13,18 +13,6 @@
                 "errorMessage": "Sample name must be provided and cannot contain spaces",
                 "meta": ["sample"]
             },
-            "lane": {
-                "type": "integer",
-                "pattern": "^\\d+$",
-                "errorMessage": "Lane ID must be a number",
-                "meta": ["lane"]
-            },
-            "group": {
-                "type": "string",
-                "pattern": "^\\S+$",
-                "errorMessage": "Group ID cannot contain spaces",
-                "meta": ["group"]
-            },
             "fastq_1": {
                 "type": "string",
                 "format": "file-path",
@@ -45,6 +33,12 @@
                 "exists": true,
                 "errorMessage": "Run directory must be a path",
                 "meta": ["rundir"]
+            },
+            "tags": {
+                "type": "string",
+                "pattern": "^([A-Za-z0-9_-]+:)*([A-Za-z0-9_-]+)$",
+                "errorMessage": "Tags must be separated by colons and only consist of lowercase letters, numbers, underscores and hyphens.",
+                "meta": ["tags"]
             }
         },
         "required": ["sample", "fastq_1"],
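
The `tags` pattern added above can be tried out in isolation. The following is a minimal Python sketch, not part of the pipeline, which validates samplesheets through its JSON schema. The function name `is_valid_tags` is hypothetical; only the regular expression is copied from the diff.

```python
import re

# Regular expression copied from the "tags" property in assets/schema_input.json.
TAGS_PATTERN = re.compile(r"^([A-Za-z0-9_-]+:)*([A-Za-z0-9_-]+)$")


def is_valid_tags(value: str) -> bool:
    """Return True if value is a colon-separated list of non-empty tags.

    Hypothetical helper for illustration only. fullmatch() is used so that a
    trailing newline is rejected as well.
    """
    return TAGS_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["lane1:project5:group2", "group1", "Lane1", "a::b", "lane 1", ""]:
        print(repr(candidate), is_valid_tags(candidate))
```

Note that the pattern also accepts uppercase letters (`A-Z`), whereas the error message mentions only lowercase letters. The diff does not indicate which of the two reflects the intended rule.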

conf/modules.config

Lines changed: 7 additions & 57 deletions
@@ -25,79 +25,29 @@ process {
     withName: 'MULTIQC_GLOBAL' {
         ext.args = { params.multiqc_title ? "--title \"$params.multiqc_title\"" : '' }
         publishDir = [
-            path: { "${params.outdir}/multiqc" },
+            path: { "${params.outdir}/multiqc/global_report" },
             mode: params.publish_dir_mode,
             saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
         ]
     }
 
-    withName: 'MULTIQC_PER_LANE' {
+    withName: 'MULTIQC_PER_TAG' {
         ext.args = { params.multiqc_title ? "--title \"$params.multiqc_title\"" : '' }
         publishDir = [
-            path: { "${params.outdir}/multiqc/lanes" },
+            path: { "${params.outdir}/multiqc/group_reports" },
             mode: params.publish_dir_mode,
             saveAs: {
                 filename ->
                 switch (filename) {
                     case 'versions.yml':
                         null
                         break
-                    case ~/\[LANE:\d+\]_multiqc_(report\.html|plots|data)/:
-                        def lane = (filename =~ /\[LANE:(\d+)\]_multiqc_(report\.html|plots|data)/)[0][1]
+                    case ~/\[TAG:.+\]_multiqc_(report\.html|plots|data)/:
+                        def tag = (filename =~ /\[TAG:(.+)\]_multiqc_(report\.html|plots|data)/)[0][1]
                         def new_filename = filename.replaceFirst(
-                            "(?<prefix>.*)\\[LANE:${lane}\\]_(?<suffix>multiqc_(report\\.html|plots|data).*)",
+                            "(?<prefix>.*)\\[TAG:${tag}\\]_(?<suffix>multiqc_(report\\.html|plots|data).*)",
                             '${prefix}${suffix}')
-                        "L${lane}/${new_filename}"
-                        break
-                    default:
-                        filename
-                }
-            }
-        ]
-    }
-
-    withName: 'MULTIQC_PER_GROUP' {
-        ext.args = { params.multiqc_title ? "--title \"$params.multiqc_title\"" : '' }
-        publishDir = [
-            path: { "${params.outdir}/multiqc/groups" },
-            mode: params.publish_dir_mode,
-            saveAs: {
-                filename ->
-                switch (filename) {
-                    case 'versions.yml':
-                        null
-                        break
-                    case ~/\[GROUP:.+\]_multiqc_(report\.html|plots|data)/:
-                        def group = (filename =~ /\[GROUP:(.+)\]_multiqc_(report\.html|plots|data)/)[0][1]
-                        def new_filename = filename.replaceFirst(
-                            "(?<prefix>.*)\\[GROUP:${group}\\]_(?<suffix>multiqc_(report\\.html|plots|data).*)",
-                            '${prefix}${suffix}')
-                        "${group}/${new_filename}"
-                        break
-                    default:
-                        filename
-                }
-            }
-        ]
-    }
-
-    withName: 'MULTIQC_PER_RUNDIR' {
-        ext.args = { params.multiqc_title ? "--title \"$params.multiqc_title\"" : '' }
-        publishDir = [
-            path: { "${params.outdir}/multiqc/rundirss" },
-            mode: params.publish_dir_mode,
-            saveAs: {
-                filename ->
-                switch (filename) {
-                    case 'versions.yml':
-                        null
-                        break
-                    case ~/\[RUNDIR:.+\]_multiqc_(report\.html|plots|data)/:
-                        def rundir = (filename =~ /\[RUNDIR:(.+)\]_multiqc_(report\.html|plots|data)/)[0][1]
-                        def new_filename = filename.replaceFirst(
-                            "(?<prefix>.*)\\[RUNDIR:${rundir}\\]_(?<suffix>multiqc_(report\\.html|plots|data).*)",
-                            '${prefix}${suffix}')
-                        "${rundir}/${new_filename}"
+                        "${tag}/${new_filename}"
                         break
                     default:
                         filename
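
The renaming performed by the `saveAs` closure of `MULTIQC_PER_TAG` can be illustrated outside Nextflow. The sketch below is a simplified Python transcription, not the pipeline's code: the function name `publish_path` is hypothetical, and it is only intended to mirror the Groovy logic for file names of the form `[TAG:<tag>]_multiqc_report.html`, `[TAG:<tag>]_multiqc_plots` and `[TAG:<tag>]_multiqc_data`.

```python
import re
from typing import Optional

# Same shape as the Groovy case pattern: a "[TAG:<tag>]_" prefix followed by
# one of the three MultiQC outputs. Group 1 is the tag, group 2 the output name.
TAG_OUTPUT = re.compile(r"\[TAG:(.+)\]_(multiqc_(?:report\.html|plots|data))")


def publish_path(filename: str) -> Optional[str]:
    """Return the path relative to the publish directory, or None to skip the file.

    Hypothetical helper mirroring the three branches of the saveAs switch.
    """
    if filename == "versions.yml":
        return None  # versions.yml is not published
    match = TAG_OUTPUT.fullmatch(filename)
    if match:
        tag, output_name = match.groups()
        # Drop the "[TAG:<tag>]_" prefix and file the output under <tag>/.
        return f"{tag}/{output_name}"
    return filename  # anything else is published unchanged


if __name__ == "__main__":
    for name in ["versions.yml", "[TAG:lane1]_multiqc_report.html", "[TAG:group2]_multiqc_data"]:
        print(name, "->", publish_path(name))
```

The Groovy version interpolates the captured tag into a second regular expression without quoting it. Given the `tags` pattern in `assets/schema_input.json`, tags contain only letters, digits, underscores and hyphens, so no regex metacharacters are expected there.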

conf/test.config

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ params {
     // Input data
     // TODO nf-core: Specify the paths to your test data on nf-core/test-datasets
     // TODO nf-core: Give any required params for the test so that command line flags are not needed
-    input = params.pipelines_testdata_base_path + 'seqinspector/testdata/MiSeq/samplesheet.csv'
+    input = params.pipelines_testdata_base_path + 'seqinspector/testdata/NovaSeq6000/samplesheet.csv'
 
     // Genome references
     genome = 'R64-1-1'

docs/output.md

Lines changed: 19 additions & 26 deletions
@@ -39,36 +39,29 @@ The FastQC plots displayed in the MultiQC report shows _untrimmed_ reads. They m
 
 ### MultiQC
 
+nf-core/seqinspector will generate the following MultiQC reports:
+
+- one global report including all the samples listed in the samplesheet
+- one group report per unique tag. These reports compile samples that share the same tag.
+
 <details markdown="1">
 <summary>Output files</summary>
 
 - `multiqc/`
-  - `multiqc_report.html`: a standalone HTML file that can be viewed in your web browser.
-  - `multiqc_data/`: directory containing parsed statistics from the different tools used in the pipeline.
-  - `multiqc_plots/`: directory containing static images from the report in various formats.
-  - `lanes/` [1]
-    - `L1/`
-      - `multiqc_report.html`: a standalone HTML file that can be viewed in your web browser.
-      - `multiqc_data/`: directory containing parsed statistics from the different tools used in the pipeline.
-      - `multiqc_plots/`: directory containing static images from the report in various formats.
-    - `L2/`
-      - ...
-  - `groups/` [1]
-    - `GROUPNAME1/`
-      - `multiqc_report.html`: a standalone HTML file that can be viewed in your web browser.
-      - `multiqc_data/`: directory containing parsed statistics from the different tools used in the pipeline.
-      - `multiqc_plots/`: directory containing static images from the report in various formats.
-    - `GROUPNAME2/`
-      - ...
-  - `rundir/` [1]
-    - `RUNDIR1/`
-      - `multiqc_report.html`: a standalone HTML file that can be viewed in your web browser.
-      - `multiqc_data/`: directory containing parsed statistics from the different tools used in the pipeline.
-      - `multiqc_plots/`: directory containing static images from the report in various formats.
-    - `RUNDIR2/`
-      - ...
-
-[1] These files will only be generated if `lane`, `group` or `rundir` were specified for some samples.
+  - `global_report`
+    - `multiqc_report.html`: a standalone HTML file that can be viewed in your web browser.
+    - `multiqc_data/`: directory containing parsed statistics from the different tools used in the pipeline.
+    - `multiqc_plots/`: directory containing static images from the report in various formats.
+  - `group_reports`
+    - `tag1/`
+      - `multiqc_report.html`
+      - `multiqc_data/`
+      - `multiqc_plots/`
+    - `tag2/`
+      - `multiqc_report.html`
+      - `multiqc_data/`
+      - `multiqc_plots/`
+    - ...
 
 </details>
 

docs/usage.md

Lines changed: 7 additions & 8 deletions
@@ -31,22 +31,21 @@ run_dir
 ...would be represented in the following samplesheet (shown as .tsv for readability)
 
 ```csv title="samplesheet.csv"
-sample lane group fastq_1 fastq_2 rundir
-sample1 1 group1 path/to/run_dir/sample1_lane1_group1_r1.fq.gz path/to/run_dir
-sample2 1 group1 path/to/run_dir/sample2_lane1_group1_r1.fq.gz path/to/run_dir
-sample3 2 group2 path/to/run_dir/sample3_lane2_group2_r1.fq.gz path/to/run_dir
-sample4 2 group3 path/to/run_dir/sample4_lane2_group3_r1.fq.gz path/to/run_dir
+sample fastq_1 fastq_2 rundir tags
+sample1 path/to/run_dir/sample1_lane1_group1_r1.fq.gz path/to/run_dir project1:group1
+sample2 path/to/run_dir/sample2_lane1_group1_r1.fq.gz path/to/run_dir project1:group1
+sample3 path/to/run_dir/sample3_lane2_group2_r1.fq.gz path/to/run_dir project1:group2
+sample4 path/to/run_dir/sample4_lane2_group3_r1.fq.gz path/to/run_dir control
 
 ```
 
 | Column | Description |
 | --------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
 | `sample` | Custom sample name. This entry will be identical for multiple sequencing libraries/runs from the same sample. Spaces in sample names are automatically converted to underscores (`_`). |
-| `lane` | Lane where the sample was processed on an Illumina instrument (optional). |
-| `group` | Group the sample belongs too, useful when several groups are pooled together (optional). |
 | `fastq_1` | Full path to FastQ file for Illumina short reads 1. File has to be gzipped and have the extension ".fastq.gz" or ".fq.gz". |
 | `fastq_2` | Full path to FastQ file for Illumina short reads 2. File has to be gzipped and have the extension ".fastq.gz" or ".fq.gz" (optional). |
-| `rundir` | Path to the runfolder containing extra information about the sequencing run (optional) . |
+| `rundir` | Path to the runfolder containing extra information about the sequencing run (optional). |
+| `tags` | Colon-separated list of tags to group samples in special reports. |
 
 Another [example samplesheet](../assets/samplesheet.csv) has been provided with the pipeline.
 
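
To make the effect of the `tags` column concrete, the sketch below groups the samples of the samplesheet above by tag, which is the grouping the per-tag reports are described as following. It is an illustrative Python example under stated assumptions: the diff does not show how the pipeline builds these groups internally, the function name `group_samples_by_tag` is hypothetical, and the samplesheet is written in comma-separated form with an empty `fastq_2` column.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List

# The usage example rewritten as CSV (fastq_2 left empty for single-end reads).
EXAMPLE_SAMPLESHEET = """\
sample,fastq_1,fastq_2,rundir,tags
sample1,path/to/run_dir/sample1_lane1_group1_r1.fq.gz,,path/to/run_dir,project1:group1
sample2,path/to/run_dir/sample2_lane1_group1_r1.fq.gz,,path/to/run_dir,project1:group1
sample3,path/to/run_dir/sample3_lane2_group2_r1.fq.gz,,path/to/run_dir,project1:group2
sample4,path/to/run_dir/sample4_lane2_group3_r1.fq.gz,,path/to/run_dir,control
"""


def group_samples_by_tag(samplesheet_text: str) -> Dict[str, List[str]]:
    """Map each tag to the names of the samples that carry it.

    Hypothetical helper: splits the colon-separated tags column and skips
    rows without tags.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for row in csv.DictReader(io.StringIO(samplesheet_text)):
        tags = row.get("tags") or ""
        for tag in tags.split(":"):
            if tag:
                groups[tag].append(row["sample"])
    return dict(groups)


groups = group_samples_by_tag(EXAMPLE_SAMPLESHEET)
for tag, samples in groups.items():
    print(f"group_reports/{tag}/: {', '.join(samples)}")
```

A sample with several tags appears in several groups: here `project1` gathers `sample1` to `sample3`, while `control` contains only `sample4`.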

main.nf

Lines changed: 2 additions & 4 deletions
@@ -58,10 +58,8 @@ workflow NFCORE_SEQINSPECTOR {
     )
 
     emit:
-    global_report = SEQINSPECTOR.out.global_report // channel: /path/to/multiqc_report.html
-    lane_reports = SEQINSPECTOR.out.lane_reports // channel: /path/to/multiqc_report.html
-    group_reports = SEQINSPECTOR.out.group_reports // channel: /path/to/multiqc_report.html
-    rundir_report = SEQINSPECTOR.out.rundir_reports // channel: /path/to/multiqc_report.html
+    global_report = SEQINSPECTOR.out.global_report // channel: /path/to/multiqc_report.html
+    grouped_reports = SEQINSPECTOR.out.grouped_reports // channel: /path/to/multiqc_report.html
 
 }
 /*

modules.json

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@
                     },
                     "multiqc": {
                         "branch": "master",
-                        "git_sha": "b7ebe95761cd389603f9cc0e0dc384c0f663815a",
+                        "git_sha": "19ca321db5d8bd48923262c2eca6422359633491",
                         "installed_by": ["modules"]
                     }
                 }
