Merge pull request #134 from nf-core/fixes

drpatelh · web-flow · commit 3bfe31c8829c · 2022-12-20T21:59:55.000Z
Add '--nf_core_rnaseq_strandedness' parameter and support for nf-core/atacseq
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -3,11 +3,14 @@
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
-## [Unpublished Version / DEV]
+## [[1.9](https://github.com/nf-core/rnaseq/releases/tag/1.9)] - 2022-12-21
 
 ### Enhancements & fixes
 
+- Bumped minimum Nextflow version from `21.10.3` -> `22.10.1`
 - Updated pipeline template to [nf-core/tools 2.7.2](https://github.com/nf-core/tools/releases/tag/2.7.2)
+- Added support for generating nf-core/atacseq compatible samplesheets
+- Added `--nf_core_rnaseq_strandedness` parameter to specify value for `strandedness` entry added to samplesheet created when using `--nf_core_pipeline rnaseq`. The default is `auto` which can be used with nf-core/rnaseq v3.10 onwards to auto-detect strandedness during the pipeline execution.
 
 ## [[1.8](https://github.com/nf-core/fetchngs/releases/tag/1.8)] - 2022-11-08
 
diff --git a/README.md b/README.md
@@ -55,10 +55,11 @@ This downloads a text file called `SRR_Acc_List.txt` that can be directly provid
 The columns in the auto-created samplesheet can be tailored to be accepted out-of-the-box by selected nf-core pipelines, these currently include:
 
 - [nf-core/rnaseq](https://nf-co.re/rnaseq/usage#samplesheet-input)
+- [nf-core/atacseq](https://nf-co.re/atacseq/usage#samplesheet-input)
 - Illumina processing mode of [nf-core/viralrecon](https://nf-co.re/viralrecon/usage#illumina-samplesheet-format)
 - [nf-core/taxprofiler](https://nf-co.re/nf-core/taxprofiler)
 
-You can use the `--nf_core_pipeline` parameter to customise this behaviour e.g. `--nf_core_pipeline rnaseq`. More pipelines will be supported in due course as we adopt and standardise samplesheet input across nf-core.
+See [usage docs](https://nf-co.re/fetchngs/1.8/usage#samplesheet-format) for more details.
 
 ## Quick Start
 
diff --git a/docs/usage.md b/docs/usage.md
@@ -58,10 +58,13 @@ The final sample information for the FastQ files used for samplesheet generation
 As a bonus, the columns in the auto-created samplesheet can be tailored to be accepted out-of-the-box by selected nf-core pipelines, these currently include:
 
 - [nf-core/rnaseq](https://nf-co.re/rnaseq/usage#samplesheet-input)
+- [nf-core/atacseq](https://nf-co.re/atacseq/usage#samplesheet-input)
 - Illumina processing mode of [nf-core/viralrecon](https://nf-co.re/viralrecon/usage#illumina-samplesheet-format)
 - [nf-core/taxprofiler](https://nf-co.re/nf-core/taxprofiler)
 
-You can use the `--nf_core_pipeline` parameter to customise this behaviour e.g. `--nf_core_pipeline rnaseq`. More pipelines will be supported in due course as we adopt and standardise samplesheet input across nf-core. It is highly recommended that you double-check that all of the identifiers you defined using `--input` are represented in the samplesheet. Also, public databases don't reliably hold information such as strandedness information so you may need to amend these entries too if for example your samplesheet was created by providing `--nf_core_pipeline rnaseq`.
+You can use the `--nf_core_pipeline` parameter to customise this behaviour e.g. `--nf_core_pipeline rnaseq`. More pipelines will be supported in due course as we adopt and standardise samplesheet input across nf-core. It is highly recommended that you double-check that all of the identifiers required by the downstream nf-core pipeline are accurately represented in the samplesheet. For example, the nf-core/atacseq pipeline requires a `replicate` column to be provided in its input samplesheet. However, public databases don't reliably hold information regarding replicates, so you may need to amend these entries if your samplesheet was created by providing `--nf_core_pipeline atacseq`.
+
+From v1.9 of this pipeline the default `strandedness` in the output samplesheet will be set to `auto` when using `--nf_core_pipeline rnaseq`. This will only work with v3.10 onwards of nf-core/rnaseq which permits the auto-detection of strandedness during the pipeline execution. You can change this behaviour with the `--nf_core_rnaseq_strandedness` parameter which is set to `auto` by default.
 
 ### Bypass `FTP` data download
 
diff --git a/modules/local/sra_to_samplesheet.nf b/modules/local/sra_to_samplesheet.nf
@@ -8,6 +8,7 @@ process SRA_TO_SAMPLESHEET {
     input:
     val meta
     val pipeline
+    val strandedness
     val mapping_fields
 
     output:
@@ -38,7 +39,9 @@ process SRA_TO_SAMPLESHEET {
     // Add nf-core pipeline specific entries
     if (pipeline) {
         if (pipeline == 'rnaseq') {
-            pipeline_map << [ strandedness: 'unstranded' ]
+            pipeline_map << [ strandedness: strandedness ]
+        } else if (pipeline == 'atacseq') {
+            pipeline_map << [ replicate: 1 ]
         } else if (pipeline == 'taxprofiler') {
             pipeline_map << [ fasta: '' ]
         }
diff --git a/modules/local/synapse_to_samplesheet.nf b/modules/local/synapse_to_samplesheet.nf
@@ -8,6 +8,7 @@ process SYNAPSE_TO_SAMPLESHEET {
     input:
     tuple val(meta), path(fastq)
     val pipeline
+    val strandedness
 
     output:
     tuple val(meta), path("*.csv"), emit: samplesheet
@@ -35,7 +36,11 @@ process SYNAPSE_TO_SAMPLESHEET {
     // Add nf-core pipeline specific entries
     if (pipeline) {
         if (pipeline == 'rnaseq') {
-            pipeline_map << [ strandedness: 'unstranded' ]
+            pipeline_map << [ strandedness: strandedness ]
+        } else if (pipeline == 'atacseq') {
+            pipeline_map << [ replicate: 1 ]
+        } else if (pipeline == 'taxprofiler') {
+            pipeline_map << [ fasta: '' ]
         }
     }
     pipeline_map << meta_map
diff --git a/nextflow.config b/nextflow.config
@@ -10,43 +10,44 @@
 params {
 
     // Input options
-    input                      = null
-    input_type                 = 'sra'
-    nf_core_pipeline           = null
-    ena_metadata_fields        = null
-    sample_mapping_fields      = 'experiment_accession,run_accession,sample_accession,experiment_alias,run_alias,sample_alias,experiment_title,sample_title,sample_description,description'
-    synapse_config             = null
-    force_sratools_download    = false
-    skip_fastq_download        = false
+    input                       = null
+    input_type                  = 'sra'
+    nf_core_pipeline            = null
+    nf_core_rnaseq_strandedness = 'auto'
+    ena_metadata_fields         = null
+    sample_mapping_fields       = 'experiment_accession,run_accession,sample_accession,experiment_alias,run_alias,sample_alias,experiment_title,sample_title,sample_description,description'
+    synapse_config              = null
+    force_sratools_download     = false
+    skip_fastq_download         = false
 
     // Boilerplate options
-    outdir                     = null
-    tracedir                   = "${params.outdir}/pipeline_info"
-    publish_dir_mode           = 'copy'
-    email                      = null
-    email_on_fail              = null
-    plaintext_email            = false
-    monochrome_logs            = false
-    hook_url                   = null
-    help                       = false
-    version                    = false
-    validate_params            = true
-    show_hidden_params         = false
-    schema_ignore_params       = 'genomes,igenomes_base'
+    outdir                      = null
+    tracedir                    = "${params.outdir}/pipeline_info"
+    publish_dir_mode            = 'copy'
+    email                       = null
+    email_on_fail               = null
+    plaintext_email             = false
+    monochrome_logs             = false
+    hook_url                    = null
+    help                        = false
+    version                     = false
+    validate_params             = true
+    show_hidden_params          = false
+    schema_ignore_params        = 'genomes,igenomes_base'
 
     // Config options
-    custom_config_version      = 'master'
-    custom_config_base         = "https://raw.githubusercontent.com/nf-core/configs/${params.custom_config_version}"
-    config_profile_description = null
-    config_profile_contact     = null
-    config_profile_url         = null
-    config_profile_name        = null
+    custom_config_version       = 'master'
+    custom_config_base          = "https://raw.githubusercontent.com/nf-core/configs/${params.custom_config_version}"
+    config_profile_description  = null
+    config_profile_contact      = null
+    config_profile_url          = null
+    config_profile_name         = null
 
     // Max resource options
     // Defaults only, expecting to be overwritten
-    max_memory                 = '128.GB'
-    max_cpus                   = 16
-    max_time                   = '240.h'
+    max_memory                  = '128.GB'
+    max_cpus                    = 16
+    max_time                    = '240.h'
 
 }
 
@@ -176,7 +177,7 @@ manifest {
     description     = """Pipeline to fetch metadata and raw FastQ files from public databases"""
     mainScript      = 'main.nf'
     nextflowVersion = '!>=22.10.1'
-    version         = '1.9dev'
+    version         = '1.9'
     doi             = 'https://doi.org/10.5281/zenodo.5070524'
 }
 
diff --git a/nextflow_schema.json b/nextflow_schema.json
@@ -43,8 +43,15 @@
                 "nf_core_pipeline": {
                     "type": "string",
                     "fa_icon": "fab fa-apple",
-                    "description": "Name of supported nf-core pipeline e.g.  'rnaseq'. A samplesheet for direct use with the pipeline will be created with the appropriate columns.",
-                    "enum": ["rnaseq", "viralrecon", "taxprofiler"]
+                    "description": "Name of supported nf-core pipeline e.g. 'rnaseq'. A samplesheet for direct use with the pipeline will be created with the appropriate columns.",
+                    "enum": ["rnaseq", "atacseq", "viralrecon", "taxprofiler"]
+                },
+                "nf_core_rnaseq_strandedness": {
+                    "type": "string",
+                    "fa_icon": "fas fa-car",
+                    "description": "Value for 'strandedness' entry added to samplesheet created when using '--nf_core_pipeline rnaseq'.",
+                    "help_text": "The default is 'auto' which can be used with nf-core/rnaseq v3.10 onwards to auto-detect strandedness during the pipeline execution.",
+                    "default": "auto"
                 },
                 "force_sratools_download": {
                     "type": "boolean",
diff --git a/workflows/sra.nf b/workflows/sra.nf
@@ -138,6 +138,7 @@ workflow SRA {
     SRA_TO_SAMPLESHEET (
         ch_sra_metadata,
         params.nf_core_pipeline ?: '',
+        params.nf_core_rnaseq_strandedness ?: 'auto',
         params.sample_mapping_fields
     )
 
diff --git a/workflows/synapse.nf b/workflows/synapse.nf
@@ -115,7 +115,8 @@ workflow SYNAPSE {
     //
     SYNAPSE_TO_SAMPLESHEET (
         ch_fastq,
-        params.nf_core_pipeline ?: ''
+        params.nf_core_pipeline ?: '',
+        params.nf_core_rnaseq_strandedness ?: 'auto'
     )
 
     //
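
For readers who want to check the new behaviour without running the pipeline, the branching added to `SRA_TO_SAMPLESHEET` and `SYNAPSE_TO_SAMPLESHEET` can be approximated in a few lines of Python. This is only an illustrative sketch of the hunks above, not the pipeline's code: the function names `pipeline_columns` and `resolve_strandedness` are hypothetical, and the sketch assumes that any pipeline name without a branch in the hunks shown (for example `viralrecon`) adds no extra column at this point.

```python
def resolve_strandedness(param_value):
    """Approximate the Groovy expression `params.nf_core_rnaseq_strandedness ?: 'auto'`.

    The Elvis operator falls back to 'auto' when the parameter is null or empty.
    """
    return param_value or "auto"


def pipeline_columns(pipeline, strandedness="auto"):
    """Return the pipeline-specific samplesheet entries (hypothetical helper).

    Mirrors the if/else chain in the diff: rnaseq gets the requested
    strandedness, atacseq gets a placeholder replicate of 1, and
    taxprofiler gets an empty fasta entry.
    """
    columns = {}
    if pipeline == "rnaseq":
        columns["strandedness"] = strandedness
    elif pipeline == "atacseq":
        columns["replicate"] = 1
    elif pipeline == "taxprofiler":
        columns["fasta"] = ""
    # Assumption: other values (or an empty string) add nothing here.
    return columns


if __name__ == "__main__":
    print(pipeline_columns("rnaseq", resolve_strandedness(None)))
    print(pipeline_columns("atacseq"))
```

Note that the `replicate` value of `1` is a placeholder written for every sample, which is why the updated usage docs recommend amending it by hand before running nf-core/atacseq.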
