diff --git a/docs/hello_nf-core/00_orientation.md b/docs/hello_nf-core/00_orientation.md
index 80339368f3..3a720aeb6d 100644
--- a/docs/hello_nf-core/00_orientation.md
+++ b/docs/hello_nf-core/00_orientation.md
@@ -1,21 +1,34 @@
-# Orientation
+# Getting started

-## GitHub Codespaces
+To start the course, launch the training environment by clicking the "Open in GitHub Codespaces" button below.
+We recommend opening the training environment in a new browser tab (use right-click, ctrl-click or cmd-click depending on your equipment) so that you can read on while the environment loads.
+You will need to keep these instructions open in parallel.

-The GitHub Codespaces environment contains all the software, code and data necessary to work through this training course, so you don't need to install anything yourself.
-However, you do need a (free) GitHub account to log in, and you should take a few minutes to familiarize yourself with the interface.
+[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/nextflow-io/training?quickstart=1&ref=master)

-If you have not yet done so, please go through the [Environment Setup](../../envsetup/) mini-course before going any further.
+## Training environment
+
+Our training environment runs on GitHub Codespaces (free GitHub account required) and contains all the software, code and data necessary to work through this training course, so you don't need to install anything yourself.
+
+The codespace is set up with a VSCode interface, which includes a filesystem explorer, a code editor and a terminal shell.
+All instructions given during the course (e.g. 'open the file', 'edit the code' or 'run this command') refer to those three parts of the VSCode interface unless otherwise specified.
+
+If you are working through this course by yourself, please go through the [Environment Setup](../../envsetup/) mini-course for further details before going any further.

!!! warning

-    This training is designed for nf-core tools version 3.4.1, which should be the version installed in the codespace. If you use a different version of nf-core tooling you may have difficulty following along.
+    This training is designed for nf-core tools version 3.4.1, which should be the version installed in the codespace we provide.
+    If you use a different version of nf-core tooling, you may have difficulty following along. You can check what version is installed using the command `nf-core --version`.

-## Working directory
+## Get ready to work
+
+Once your codespace is running, there are two things you need to do before diving into the training: set your working directory for this specific course, and take a look at the materials provided.

-Throughout this training course, we'll be working in the `hello-nf-core/` directory.
+### Set the working directory
+
+By default, the codespace opens with the working directory set to the root of all training courses, but for this course, we'll be working in the `hello-nf-core/` directory.

Change directory now by running this command in the terminal:

@@ -25,7 +38,7 @@ cd hello-nf-core/

!!! tip

-    If for whatever reason you move out of this directory, you can always use the full path to return to it, assuming you're running this within the Github Codespaces training environment:
+    If for whatever reason you move out of this directory (e.g. 
your codespace goes to sleep), you can always use the full path to return to it, assuming you're running this within the Github Codespaces training environment: ```bash cd /workspaces/training/hello-nf-core @@ -33,7 +46,7 @@ cd hello-nf-core/ Now let's have a look at the contents of this directory. -## Materials provided +### Check out the materials provided You can explore the contents of this directory by using the file explorer on the left-hand side of the training workspace. Alternatively, you can use the `tree` command. @@ -46,26 +59,33 @@ Here we generate a table of contents to the second level down: tree . -L 2 ``` -If you run this inside `hello-nf-core`, you should see the following output: - -```console title="Directory contents" -. -├── greetings.csv -├── original-hello -│ ├── hello.nf -│ ├── modules -│ └── nextflow.config -└── solutions - ├── composable-hello - ├── core-hello-part2 - ├── core-hello-part3 - ├── core-hello-part4 - └── core-hello-start - -8 directories, 3 files -``` +If you run this inside `hello-nf-core`, you should see the following output. + +??? example "Directory contents" + + ```console + . + ├── greetings.csv + ├── original-hello + │ ├── hello.nf + │ ├── modules + │ └── nextflow.config + └── solutions + ├── composable-hello + ├── core-hello-part2 + ├── core-hello-part3 + ├── core-hello-part4 + └── core-hello-start + + 8 directories, 3 files + ``` -**Here's a summary of what you should know to get started:** +!!! note + + We use collapsible sections like this to include expected command output in a concise way. + Click on the colored box to expand the section and view its contents. + +**Content guide:** - **The `greetings.csv` file** is a CSV containing some minimal columnar data we use for testing purposes. @@ -74,4 +94,14 @@ If you run this inside `hello-nf-core`, you should see the following output: - **The `solutions` directory** contains the completed workflow scripts that result from each step of the course. They are intended to be used as a reference to check your work and troubleshoot any issues. -**Now, to begin the course, click on the arrow in the bottom right corner of this page.** +## Readiness checklist + +Think you're ready to dive in? + +- [ ] I understand the goal of this course and its prerequisites +- [ ] My codespace is up and running +- [ ] I've set my working directory appropriately + +If you can check all the boxes, you're good to go. + +**To continue to Part 1, click on the arrow in the bottom right corner of this page.** diff --git a/docs/hello_nf-core/01_run_demo.md b/docs/hello_nf-core/01_run_demo.md index e6e15048de..980fce4d15 100644 --- a/docs/hello_nf-core/01_run_demo.md +++ b/docs/hello_nf-core/01_run_demo.md @@ -4,7 +4,7 @@ In this first part of the Hello nf-core training course, we show you how to find We are going to use a pipeline called nf-core/demo that is maintained by the nf-core project as part of its inventory of pipelines for demonstrating code structure and tool operations. -Make sure you are in the `hello-nf-core/` directory as instructed in the [Orientation](./00_orientation.md). +Make sure your working directory is set to `hello-nf-core/` as instructed on the [Getting started](./00_orientation.md) page. --- @@ -18,7 +18,7 @@ In your web browser, go to https://nf-co.re/pipelines/ and type `demo` in the se ![search results](./img/search-results.png) -Click on the pipeline name, `demo`, to access the pipeline details page. +Click on the pipeline name, `demo`, to access the pipeline documentation page. 
Each released pipeline has a dedicated page that includes the following documentation sections:

@@ -33,26 +33,30 @@ Whenever you are considering adopting a new pipeline, you should read the pipeli

Have a look now and see if you can find out:

-- which tools the pipeline will run (Check the tab: `Introduction`)
-- which inputs and parameters the pipeline accepts or requires (Check the tab: `Parameters`)
-- what are the outputs produced by the pipeline (Check the tab: `Output`)
+- Which tools the pipeline will run (Check the tab: `Introduction`)
+- Which inputs and parameters the pipeline accepts or requires (Check the tab: `Parameters`)
+- What outputs are produced by the pipeline (Check the tab: `Output`)

-    The `Introduction` tab provides an overview of the pipeline, including a visual representation (called a subway map) and a list of tools that are run as part of the pipeline.
+#### 1.1.1. Pipeline overview

-    ![pipeline subway map](./img/nf-core-demo-subway-cropped.png)
+The `Introduction` tab provides an overview of the pipeline, including a visual representation (called a subway map) and a list of tools that are run as part of the pipeline.

-    1. Read QC (FASTQC)
-    2. Adapter and quality trimming (SEQTK_TRIM)
-    3. Present QC for raw reads (MULTIQC)
+![pipeline subway map](./img/nf-core-demo-subway-cropped.png)

-    The documentation also provides an example input file (see below) and an example command line.
+1. Read QC (FASTQC)
+2. Adapter and quality trimming (SEQTK_TRIM)
+3. Present QC for raw reads (MULTIQC)

-    ```bash
-    nextflow run nf-core/demo \
-    -profile <docker/singularity/.../institute> \
-    --input samplesheet.csv \
-    --outdir <OUTDIR>
-    ```
+#### 1.1.2. Example command line
+
+The documentation also provides an example input file (discussed further below) and an example command line.
+
+```bash
+nextflow run nf-core/demo \
+   -profile <docker/singularity/.../institute> \
+   --input samplesheet.csv \
+   --outdir <OUTDIR>
+```

You'll notice that the example command does NOT specify a workflow file, just the reference to the pipeline repository, `nf-core/demo`.

@@ -61,10 +65,10 @@ Let's retrieve the code so we can examine this structure.

### 1.2. Retrieve the pipeline code

-Once we've determined the pipeline appears to be suitable for our purposes, we're going to want to try it out.
-Fortunately Nextflow makes it easy to retrieve pipeline from correctly-formatted repositories without having to download anything manually.
+Once we've determined that the pipeline appears to be suitable for our purposes, let's try it out.
+Fortunately, Nextflow makes it easy to retrieve pipelines from correctly-formatted repositories without having to download anything manually.

-Return to your terminal and run the following:
+Let's return to the terminal and run the following:

```bash
nextflow pull nf-core/demo
```

@@ -91,25 +95,27 @@ nf-core/demo
```

You'll notice that the files are not in your current work directory.
-By default, they are saved to `$NXF_HOME/assets`.
+By default, Nextflow saves them to `$NXF_HOME/assets`.

```bash
tree -L 2 $NXF_HOME/assets/
```

-```console title="Output"
-/workspaces/.nextflow/assets/
-└── nf-core
-    └── demo
-```
+??? example "Directory contents"
+
+    ```console
+    /workspaces/.nextflow/assets/
+    └── nf-core
+        └── demo
+    ```

!!! note

    The full path may differ on your system if you're not using our training environment.

-The location of the downloaded source code is intentionally 'out of the way' on the principle that these pipelines should be used more like libraries than code that you would directly interact with.
+Nextflow keeps the downloaded source code intentionally 'out of the way' on the principle that these pipelines should be used more like libraries than code that you would directly interact with.

-However, for the purposes of this training, we'd like to be able to poke around and see what's in there.
+However, for the purposes of this training, we want to be able to poke around and see what's in there.
So to make that easier, let's create a symbolic link to that location from our current working directory.

```bash
@@ -122,11 +128,13 @@ This creates a shortcut that makes it easier to explore the code we just downloa
tree -L 2 pipelines
```

-```console title="Output"
-pipelines
-└── nf-core
-    └── demo
-```
+??? example "Directory contents"
+
+    ```console
+    pipelines
+    └── nf-core
+        └── demo
+    ```

Now we can more easily peek into the source code as needed.

@@ -145,14 +153,20 @@ Learn how to try out an nf-core pipeline with minimal effort.

## 2. Try out the pipeline with its test profile

Conveniently, every nf-core pipeline comes with a test profile.
-This is a minimal set of configuration settings for the pipeline to run using a small test dataset hosted in the [nf-core/test-datasets](https://github.com/nf-core/test-datasets) repository. It's a great way to quickly try out a pipeline at small scale.
+This is a minimal set of configuration settings for the pipeline to run using a small test dataset hosted in the [nf-core/test-datasets](https://github.com/nf-core/test-datasets) repository.
+It's a great way to quickly try out a pipeline at small scale.
+
+!!! note
+
+    Nextflow's configuration profile system allows you to easily switch between different container engines or execution environments.
+    For more details, see [Hello Nextflow Part 6: Configuration](../hello_nextflow/06_hello_configuration.md).

### 2.1. Examine the test profile

It's good practice to check what a pipeline's test profile specifies before running it.
-The `test` profile for `nf-core/demo` is shown below:
+The `test` profile for `nf-core/demo` lives in the configuration file `conf/test.config` and is shown below.

-```groovy title="conf/test.config" linenums="1" hl_lines="26"
+```groovy title="conf/test.config" linenums="1" hl_lines="8 26"
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Nextflow config file for running minimal tests
@@ -168,7 +182,7 @@ The `test` profile for `nf-core/demo` is shown below:
process {
    resourceLimits = [
        cpus: 4,
-        memory: '15.GB',
+        memory: '4.GB',
        time: '1.h'
    ]
}
@@ -183,16 +197,44 @@ params {
}

-This tells us that the `nf-core/demo` test profile already specifies the input parameter, so you don't have to provide any input yourself.
-However, the `outdir` parameter is not included in the test profile, so we will have to add it to the execution command using the `--outdir` flag.
+You'll notice right away that the comment block at the top includes a usage example showing how to run the pipeline with this test profile.

-### 2.2. Run the pipeline
+```groovy title="conf/test.config" linenums="7"
+Use as follows:
+    nextflow run nf-core/demo -profile test,<docker/singularity> --outdir <OUTDIR>
+```
+
+The only things we need to supply are what's shown between angle brackets in the example command: `<docker/singularity>` and `<OUTDIR>`.
+
+As a reminder, `<docker/singularity>` refers to the choice of container system. All nf-core pipelines are designed to be usable with containers (Docker, Singularity, etc.) to ensure reproducibility and eliminate software installation issues.
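+
+Under the hood, each of those container options corresponds to a configuration profile declared in the pipeline's `nextflow.config`.
+As a simplified sketch (the real nf-core configuration sets quite a few more options per profile; check the actual file for the authoritative definitions), those declarations look something like this:
+
+```groovy
+// Simplified sketch of container profiles, for illustration only;
+// the real nf-core nextflow.config sets additional options here.
+profiles {
+    docker {
+        docker.enabled      = true
+        singularity.enabled = false
+    }
+    singularity {
+        singularity.enabled = true
+        docker.enabled      = false
+    }
+}
+```
+
+Selecting `-profile docker` (or `-profile singularity`) at the command line simply switches the corresponding settings on for that run.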
+So we'll need to specify whether we want to use Docker or Singularity to test the pipeline.

-Our examination of the test profile above told us what pipeline argument(s) we need to specify: just `--outdir`.
+The `--outdir <OUTDIR>` part refers to the directory where Nextflow will write the pipeline's outputs.
+We need to provide a name for it, which we can just make up.
+If it does not exist already, Nextflow will create it for us at runtime.

-We're also going to specify `-profile docker,test`, which by nf-core convention enables the use of Docker containers, and of course, invokes the test profile.
+Moving on to the section after the comment block, the test profile shows us what has been pre-configured for testing: most notably, the `input` parameter is already set to point to a test dataset, so we don't need to provide our own data.
+If you follow the link to the pre-configured input, you'll see it is a CSV file containing sample identifiers and file paths for several experimental samples.
+
+```csv title="samplesheet_test_illumina_amplicon.csv"
+sample,fastq_1,fastq_2
+SAMPLE1_PE,https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample1_R1.fastq.gz,https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample1_R2.fastq.gz
+SAMPLE2_PE,https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample2_R1.fastq.gz,https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample2_R2.fastq.gz
+SAMPLE3_SE,https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample1_R1.fastq.gz,
+SAMPLE3_SE,https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample2_R1.fastq.gz,
+```

-Let's try it!
+This is called a samplesheet, and is the most common form of input to nf-core pipelines.
+
+!!! note
+
+    Don't worry if you're not familiar with the data formats and types; they're not important for what follows.
+
+So this confirms that we have everything we need to try out the pipeline.
+
+### 2.2. Run the pipeline
+
+Let's use Docker as the container system and `demo-results` as the output directory; with that, we're ready to run the test command:

```bash
nextflow run nf-core/demo -profile docker,test --outdir demo-results
@@ -200,78 +242,85 @@ nextflow run nf-core/demo -profile docker,test --outdir demo-results
```

Here's the console output from the pipeline:

-```console title="Output"
-    N E X T F L O W   ~  version 24.10.0
-
-Launching `https://github.com/nf-core/demo` [maniac_jones] DSL2 - revision: 04060b4644 [master]
-
-
-------------------------------------------------------
-                                        ,--./,-.
-        ___     __   __   __   ___     /,-._.--~'
-  |\ | |__  __ /  ` /  \ |__) |__         }  {
-  | \| |       \__, \__/ |  \ |___     \`-._,-`-,
-                                        `._,._,'
-  nf-core/demo 1.0.1
-------------------------------------------------------
-Input/output options
-  input                     : https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/samplesheet/samplesheet_test_illumina_amplicon.csv
-  outdir                    : results
-
-Institutional config options
-  config_profile_name       : Test profile
-  config_profile_description: Minimal test dataset to check pipeline function
-
-Core Nextflow options
-  revision                  : master
-  runName                   : maniac_jones
-  containerEngine           : docker
-  launchDir                 : /workspaces/training/side-quests/nf-core/nf-core-demo
-  workDir                   : /workspaces/training/side-quests/nf-core/nf-core-demo/work
-  projectDir                : /workspaces/.nextflow/assets/nf-core/demo
-  userName                  : gitpod
-  profile                   : docker,test
-  configFiles               :
-
-!! Only displaying parameters that differ from the pipeline defaults !!
-------------------------------------------------------
-* The pipeline
-    https://doi.org/10.5281/zenodo.12192442
-
-* The nf-core framework
-    https://doi.org/10.1038/s41587-020-0439-x
-
-* Software dependencies
-    https://github.com/nf-core/demo/blob/master/CITATIONS.md
-
-executor >  local (7)
-[3c/a00024] NFC…_DEMO:DEMO:FASTQC (SAMPLE2_PE) | 3 of 3 ✔
-[94/d1d602] NFC…O:DEMO:SEQTK_TRIM (SAMPLE2_PE) | 3 of 3 ✔
-[ab/460670] NFCORE_DEMO:DEMO:MULTIQC           | 1 of 1 ✔
--[nf-core/demo] Pipeline completed successfully-
-Completed at: 05-Mar-2025 09:46:21
-Duration    : 1m 54s
-CPU hours   : (a few seconds)
-Succeeded   : 7
-```
+??? example "Output"
+
+    ```console
+     N E X T F L O W   ~  version 25.04.3
+
+    Launching `https://github.com/nf-core/demo` [happy_varahamihira] DSL2 - revision: db7f526ce1 [master]
+
+
+    ------------------------------------------------------
+                                            ,--./,-.
+            ___     __   __   __   ___     /,-._.--~'
+      |\ | |__  __ /  ` /  \ |__) |__         }  {
+      | \| |       \__, \__/ |  \ |___     \`-._,-`-,
+                                            `._,._,'
+      nf-core/demo 1.0.2
+    ------------------------------------------------------
+    Input/output options
+      input                     : https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/samplesheet/samplesheet_test_illumina_amplicon.csv
+      outdir                    : demo-results
+
+    Institutional config options
+      config_profile_name       : Test profile
+      config_profile_description: Minimal test dataset to check pipeline function
+
+    Generic options
+      trace_report_suffix       : 2025-10-30_13-22-01
+
+    Core Nextflow options
+      revision                  : master
+      runName                   : happy_varahamihira
+      containerEngine           : docker
+      launchDir                 : /workspaces/training/hello-nf-core
+      workDir                   : /workspaces/training/hello-nf-core/work
+      projectDir                : /workspaces/.nextflow/assets/nf-core/demo
+      userName                  : root
+      profile                   : docker,test
+      configFiles               : /workspaces/.nextflow/assets/nf-core/demo/nextflow.config
+
+    !! Only displaying parameters that differ from the pipeline defaults !!
+    ------------------------------------------------------
+    * The pipeline
+        https://doi.org/10.5281/zenodo.12192442
+
+    * The nf-core framework
+        https://doi.org/10.1038/s41587-020-0439-x
+
+    * Software dependencies
+        https://github.com/nf-core/demo/blob/master/CITATIONS.md

-You see that there is more console output than when you run a basic Netxflow pipeline.
+
+    executor >  local (7)
+    [db/fae3ff] NFCORE_DEMO:DEMO:FASTQC (SAMPLE3_SE)     [100%] 3 of 3 ✔
+    [d0/f6ea55] NFCORE_DEMO:DEMO:SEQTK_TRIM (SAMPLE1_PE) [100%] 3 of 3 ✔
+    [af/e6da56] NFCORE_DEMO:DEMO:MULTIQC                 [100%] 1 of 1 ✔
+    -[nf-core/demo] Pipeline completed successfully-
+    ```
+
+If your output matches that, congratulations! You've just run your first nf-core pipeline.
+
+You'll notice that there is a lot more console output than when you run a basic Nextflow pipeline.
There's a header that includes a summary of the pipeline's version, inputs and outputs, and a few elements of configuration.

+!!! note
+
+    Your output will show different timestamps, execution names, and file paths, but the overall structure and process execution should be similar.
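+
+!!! tip
+
+    The `[db/fae3ff]`-style tag at the start of each process line is the beginning of the hash that Nextflow uses to name that task's work directory.
+    If you want to poke around in a task's staged files and logs, you can use it to locate the directory under `work/`, for example (using the hash from the output above; yours will differ):
+
+    ```bash
+    ls work/db/fae3ff*
+    ```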
+ Moving on to the execution output, let's have a look at the lines that tell us what processes were run: ```console title="Output (subset)" -[3c/a00024] NFC…_DEMO:DEMO:FASTQC (SAMPLE2_PE) | 3 of 3 ✔ -[94/d1d602] NFC…O:DEMO:SEQTK_TRIM (SAMPLE2_PE) | 3 of 3 ✔ -[ab/460670] NFCORE_DEMO:DEMO:MULTIQC | 1 of 1 ✔ +[db/fae3ff] NFCORE_DEMO:DEMO:FASTQC (SAMPLE3_SE) [100%] 3 of 3 ✔ +[d0/f6ea55] NFCORE_DEMO:DEMO:SEQTK_TRIM (SAMPLE1_PE) [100%] 3 of 3 ✔ +[af/e6da56] NFCORE_DEMO:DEMO:MULTIQC [100%] 1 of 1 ✔ ``` This tells us that three processes were run, corresponding to the three tools shown in the pipeline documentation page on the nf-core website: FASTQC, SEQTK_TRIM and MULTIQC. -!!! note - - The full process names as shown here, such as `NFCORE_DEMO:DEMO:MULTIQC`, are longer than what you may have seen in the introductory Hello Nextflow material. - These includes the names of their parent workflows and reflect the modularity of the pipeline code. - We will go into more detail about that shortly. +The full process names as shown here, such as `NFCORE_DEMO:DEMO:MULTIQC`, are longer than what you may have seen in the introductory Hello Nextflow material. +These include the names of their parent workflows and reflect the modularity of the pipeline code. +We'll go into more detail about that in a little bit. ### 2.3. Examine the pipeline's outputs @@ -281,39 +330,40 @@ Finally, let's have a look at the `demo-results` directory produced by the pipel tree -L 2 demo-results ``` -```console title="Output" -demo-results/ -├── fastqc -│ ├── SAMPLE1_PE -│ ├── SAMPLE2_PE -│ └── SAMPLE3_SE -├── fq -│ ├── SAMPLE1_PE -│ ├── SAMPLE2_PE -│ └── SAMPLE3_SE -├── multiqc -│ ├── multiqc_data -│ ├── multiqc_plots -│ └── multiqc_report.html -└── pipeline_info - ├── execution_report_2025-03-05_09-44-26.html - ├── execution_timeline_2025-03-05_09-44-26.html - ├── execution_trace_2025-03-05_09-44-26.txt - ├── nf_core_pipeline_software_mqc_versions.yml - ├── params_2025-03-05_09-44-29.json - └── pipeline_dag_2025-03-05_09-44-26.html -``` - -If you're curious about the specifics what that all means, check out [the nf-core/demo pipeline documentation page](https://nf-co.re/demo/1.0.1/). +??? example "Directory contents" + ```console + demo-results/ + ├── fastqc + │ ├── SAMPLE1_PE + │ ├── SAMPLE2_PE + │ └── SAMPLE3_SE + ├── fq + │ ├── SAMPLE1_PE + │ ├── SAMPLE2_PE + │ └── SAMPLE3_SE + ├── multiqc + │ ├── multiqc_data + │ ├── multiqc_plots + │ └── multiqc_report.html + └── pipeline_info + ├── execution_report_2025-03-05_09-44-26.html + ├── execution_timeline_2025-03-05_09-44-26.html + ├── execution_trace_2025-03-05_09-44-26.txt + ├── nf_core_pipeline_software_mqc_versions.yml + ├── params_2025-03-05_09-44-29.json + └── pipeline_dag_2025-03-05_09-44-26.html + ``` + +That might seem like a lot. At this stage, what's important to observe is that the results are organized by module, and there is additionally a directory called `pipeline_info` containing various timestamped reports about the pipeline execution. This is standard for nf-core pipelines. -Congratulations! You have just run your first nf-core pipeline. +To learn more about the `nf-core/demo` pipeline's outputs, check out its [documentation page](https://nf-co.re/demo/1.0.2/docs/output/). ### Takeaway -You know how to run an nf-core pipeline using its built-in test profile. +You know how to run an nf-core pipeline using its built-in test profile and where to find its outputs. ### What's next? @@ -323,277 +373,217 @@ Learn how the pipeline code is organized. ## 3. 
Examine the pipeline code structure -The nf-core project enforces strong guidelines for how pipelines are structured, and how the code is organized, configured and documented. +Now that we've successfully run the pipeline as users, let's shift our perspective to look at how nf-core pipelines are structured internally. -Let's have a look at how the pipeline code is organized in the `nf-core/demo` repository (using the `pipelines` symlink we created earlier). -You can either use `tree` or use the file explorer in your IDE. +The nf-core project enforces strong guidelines for how pipelines are structured, and for how the code is organized, configured and documented. +Understanding how this is all organized is the first step toward developing your own nf-core-compatible pipelines, which we will tackle in Part 2 of this course. + +Let's have a look at how the pipeline code is organized in the `nf-core/demo` repository, using the `pipelines` symlink we created earlier. + +You can either use `tree` or use the file explorer to find and open the `nf-core/demo` directory. ```bash tree -L 1 pipelines/nf-core/demo ``` -```console title="Output (top-level only)" -pipelines/nf-core/demo -├── assets -├── CHANGELOG.md -├── CITATIONS.md -├── CODE_OF_CONDUCT.md -├── conf -├── docs -├── LICENSE -├── main.nf -├── modules -├── modules.json -├── nextflow_schema.json -├── nextflow.config -├── README.md -├── subworkflows -├── tower.yml -└── workflows -``` +??? example "Directory contents" + + ```console + pipelines/nf-core/demo + ├── assets + ├── CHANGELOG.md + ├── CITATIONS.md + ├── CODE_OF_CONDUCT.md + ├── conf + ├── docs + ├── LICENSE + ├── main.nf + ├── modules + ├── modules.json + ├── nextflow.config + ├── nextflow_schema.json + ├── nf-test.config + ├── README.md + ├── ro-crate-metadata.json + ├── subworkflows + ├── tests + ├── tower.yml + └── workflows + ``` + +There's a lot going on in there, so we'll tackle this step by step. + +First, let's note that at the top level, you can find a README file with summary information, as well as accessory files that summarize project information such as licensing, contribution guidelines, citation and code of conduct. +Detailed pipeline documentation is located in the `docs` directory. +All of this content is used to generate the web pages on the nf-core website programmatically, so they're always up to date with the code. -There's a lot going on in there, so we'll tackle this in stages. -We're going to look at the following categories: +Now, for the rest, we're going to divide our exploration in three stages: 1. Pipeline code components (`main.nf`, `workflows`, `subworkflows`, `modules`) 2. Configuration, parameters and inputs -3. Documentation and related assets +3. Input validation -Let's start with the code proper, though note that for now, we're going to focus on how everything is organized, without looking at the actual code just yet. +Let's start with the pipeline code components. +We're going to focus on the file hierarchy and structural organization, rather than diving into the code within individual files. ### 3.1. Pipeline code components -The pipeline code organization follows a modular structure that is designed to maximize code reuse. 
+The standard nf-core pipeline code organization follows a modular structure that is designed to maximize code reuse, as introduced in [Hello Modules](../hello_nextflow/04_hello_modules.md), Part 4 of the [Hello Nextflow](../hello_nextflow/index.md) course. In true nf-core fashion, however, this is implemented with a bit of additional complexity.
+Specifically, nf-core pipelines make abundant use of subworkflows, i.e. workflow scripts that are imported by a parent workflow.
+
+That may sound a bit abstract, so let's take a look at how this is used in practice in the `nf-core/demo` pipeline.

!!! note

-    We won't go over the actual code for how these modular components are connected, because there is some additional complexity associated with the use of subworkflows that can be confusing, and understanding that is not necessary at this stage of the training.
-    For now, we're going to focus on the logic of this modular organization.
+    We won't go over the actual code for _how_ these modular components are connected, because there is some additional complexity associated with the use of subworkflows that can be confusing, and understanding that is not necessary at this stage of the training.
+    For now, we're going to focus on the overall organization and logic.

-#### 3.1.1. Overall organization and `main.nf` script
+#### 3.1.1. General overview

-At the top level, there is the `main.nf` script, which is the entrypoint Nextflow starts from when we execute `nextflow run nf-core/demo`. That means when you run `nextflow run nf-core/demo` to run the pipeline, Nextflow automatically finds and executes the `main.nf` script, and everything else will flow from there.
+Here is what the relationships between the relevant code components look like for the `nf-core/demo` pipeline:

-In practice, the `main.nf` script calls the actual workflow of interest, stored inside the `workflows` folder, called `demo.nf`. It also calls a few 'housekeeping' subworkflows that we're going to ignore for now.
+
+ --8<-- "docs/hello_nf-core/img/nf-core_demo_code_organization.svg" +
-```bash -tree pipelines/nf-core/demo/workflows -``` +There is a so-called _entrypoint_ script called `main.nf`, which acts as a wrapper for two kinds of nested workflows: the workflow containing the actual analysis logic, located under `workflows/` and called `demo.nf`, and a set of housekeeping workflows located under `subworkflows/`. +The `demo.nf` workflow calls on **modules** located under `modules/`; these contain the **processes** that will perform the actual analysis steps. -```console title="Output" -pipelines/nf-core/demo/workflows -└── demo.nf -``` +Now, let's review these components in turn. -The `demo.nf` workflow itself calls out to various script components, namely, modules and subworkflows, stored in the corresponding `modules` and `subworkflows` folders. +#### 3.1.2. The entrypoint script: `main.nf` -- **Module:** A wrapper around a single process. -- **Subworkflow:** A mini workflow that calls two or more modules and is designed to be called by another workflow. +The `main.nf` script is the entrypoint that Nextflow starts from when we execute `nextflow run nf-core/demo`. +That means when you run `nextflow run nf-core/demo` to run the pipeline, Nextflow automatically finds and executes the `main.nf` script. +This works for any Nextflow pipeline that follows this conventional naming and structure, not just nf-core pipelines. -Here's an overview of the nested structure of a workflow composed of subworkflows and modules: +Using an entrypoint script makes it easy to run standardized 'housekeeping' subworkflows before and after the actual analysis script gets run. +We'll go over those after we've reviewed the actual analysis workflow and its modules. -
- --8<-- "docs/side_quests/img/nf-core/nested.excalidraw.svg" -
+#### 3.1.3. The analysis script: `workflows/demo.nf` -Not all workflows use subworkflows to organize their modules, but this is a very common pattern that makes it possible to reuse chunks of code across different pipelines in a way that is flexible while minimizing maintenance burden. +The `workflows/demo.nf` workflow is where the central logic of the pipeline is stored. +It is structured much like a normal Nextflow workflow, except it is designed to be called from a parent workflow, which requires a few extra features. +We'll cover the relevant differences in the next part of this course, when we tackle the conversion of the simple Hello pipeline from Hello Nextflow into an nf-core-compatible form. -Within this structure, `modules` and `subworkflows` are further organized into `local` and `nf-core` folders. -The `nf-core` folder is for components that have come from the nf-core GitHub repository, while the `local` folder is for components that have been developed independently. -Usually these are operations that very specific to that pipeline. +The `demo.nf` workflow calls on **modules** located under `modules/`, which we'll review next. -Let's take a peek into those directories. +!!! note + + Some nf-core analysis workflows display additional levels of nesting by calling on lower-level subworkflows. + This is mostly used for wrapping two or more modules that are commonly used together into easily reusable pipeline segments. + You can see some examples by browsing available [nf-core subworkflows](https://nf-co.re/subworkflows/) on the nf-core website. -#### 3.1.2. Modules + When the analysis script uses subworkflows, those are stored under the `subworkflows/` directory. + +#### 3.1.4. The modules The modules are where the process code lives, as described in [Part 4 of the Hello Nextflow training course](../hello_nextflow/04_hello_modules.md). -In the nf-core project, modules are organized using a nested structure that refers to toolkit and tool names. -The module code file describing the process is always called `main.nf`, and is accompanied by tests and `.yml` files. +In the nf-core project, modules are organized using a multi-level nested structure that reflect both their origin and their contents. +At the top level, modules are differentiated as either `nf-core` or `local` (not part of the nf-core project), and then further placed into a directory named after the tool(s) they wrap. +If the tool belongs to a toolkit (i.e. a package containing multiple tools) then there is an intermediate directory level named after the toolkit. + +You can see this applied in practice to the `nf-core/demo` pipeline modules: ```bash -tree -L 4 pipelines/nf-core/demo/modules +tree -L 3 pipelines/nf-core/demo/modules ``` -```console title="Output" -pipelines/nf-core/demo/modules -└── nf-core - ├── fastqc - │   ├── environment.yml - │   ├── main.nf - │   ├── meta.yml - │   └── tests - ├── multiqc - │   ├── environment.yml - │   ├── main.nf - │   ├── meta.yml - │   └── tests - └── seqtk - └── trim - ├── environment.yml - ├── main.nf +??? 
example "Directory contents" + + ```console + pipelines/nf-core/demo/modules + └── nf-core + ├── fastqc + │   ├── environment.yml + │   ├── main.nf + │   ├── meta.yml + │   └── tests + ├── multiqc + │   ├── environment.yml + │   ├── main.nf + │   ├── meta.yml + │   └── tests + └── seqtk + └── trim + ├── environment.yml + ├── main.nf ├── meta.yml └── tests -``` + ``` Here you see that the `fastqc` and `multiqc` modules sit at the top level within the `nf-core` modules, whereas the `trim` module sits under the toolkit that it belongs to, `seqtk`. In this case there are no `local` modules. -#### 3.1.3. Subworkflows +The module code file describing the process is always called `main.nf`, and is accompanied by tests and `.yml` files which we'll ignore for now. + +Taken together, the entrypoint workflow, analysis workflow and modules are sufficient for running the 'interesting' parts of the pipeline. +However, we know there are also housekeeping subworkflows in there, so let's look at those now. -As noted above, subworkflows function as wrappers that call two or more modules. +#### 3.1.5. The housekeeping subworkflows -In an nf-core pipeline, the subworkflows are divided into `local` and `nf-core` directories, and each subworkflow has its own nested directory structure with its own `main.nf` script. +Like modules, subworkflows are differentiated into `local` and `nf-core` directories, and each subworkflow has its own nested directory structure with its own `main.nf` script, tests and `.yml` file. ```bash -tree -L 4 pipelines/nf-core/demo/subworkflows +tree -L 3 pipelines/nf-core/demo/subworkflows ``` -```console title="Output" -pipelines/nf-core/demo/subworkflows -├── local -│   └── utils_nfcore_demo_pipeline -│   └── main.nf -└── nf-core - ├── utils_nextflow_pipeline - │   ├── main.nf - │   ├── meta.yml - │   └── tests - ├── utils_nfcore_pipeline - │   ├── main.nf - │   ├── meta.yml - │   └── tests - └── utils_nfschema_plugin - ├── main.nf - ├── meta.yml +??? example "Directory contents" + + ```console + pipelines/nf-core/demo/subworkflows + ├── local + │   └── utils_nfcore_demo_pipeline + │   └── main.nf + └── nf-core + ├── utils_nextflow_pipeline + │   ├── main.nf + │   ├── meta.yml + │   └── tests + ├── utils_nfcore_pipeline + │   ├── main.nf + │   ├── meta.yml + │   └── tests + └── utils_nfschema_plugin + ├── main.nf + ├── meta.yml └── tests -``` + ``` -In the case of the `nf-core/demo` pipeline, the subworkflows involved are all 'utility' or housekeeping subworkflows, as denoted by the `utils_` prefix in their names. +As noted above, the `nf-core/demo` pipeline does not include any analysis-specific subworkflows, so all the subworkflows we see here are so-called 'housekeeping' or 'utility' workflows, as denoted by the `utils_` prefix in their names. These subworkflows are what produces the fancy nf-core header in the console output, among other accessory functions. -Other pipelines may also use subworkflows as part of the main workflow of interest. +!!! tip -!!! note + Aside from their naming pattern, another indication that these subworkflows do not perform any truly analysis-related function is that they do not call any processes at all. - If you would like to learn how to compose workflows with subworkflows, see the [Workflows of Workflows](https://training.nextflow.io/latest/side_quests/workflows_of_workflows/) Side Quest (also known as 'the WoW side quest'). +This completes the round-up of core code components that constitute the `nf-core/demo` pipeline. 
+Now let's take a look at the remaining elements that you should know a little bit about before diving into development: pipeline configuration and input validation.

-### 3.2. Configuration
+### 3.2. Pipeline configuration

-The nf-core project applies guidelines for pipeline configuration that aim to build on Nextflow's flexible customization options in a way that provides greater consistency and maintainability across pipelines.
+You've learned previously that Nextflow offers many options for configuring pipeline execution, whether in terms of inputs and parameters, computing resources, or other aspects of orchestration.
+The nf-core project applies highly standardized guidelines for pipeline configuration that aim to build on Nextflow's flexible customization options in a way that provides greater consistency and maintainability across pipelines.

-The central configuration file `nextflow.config` is used to set default values for parameters and other configuration options. The majority of these configuration options are applied by default while others (e.g., software dependency profiles) are included as optional profiles.
+The central configuration file `nextflow.config` is used to set default values for parameters and other configuration options.
+The majority of these configuration options are applied by default while others (e.g., software dependency profiles) are included as optional profiles.

There are several additional configuration files that are stored in the `conf` folder and which can be added to the configuration by default or optionally as profiles:

-- `base.config`: A 'blank slate' config file, appropriate for general use on most high-performance computing. environments. This defines broad bins of resource usage, for example, which are convenient to apply to modules.
+- `base.config`: A 'blank slate' config file, appropriate for general use on most high-performance computing environments. This defines broad bins of resource usage, for example, which are convenient to apply to modules.
- `modules.config`: Additional module directives and arguments.
-- `test.config`: A profile to run the pipeline with minimal test data, which we used when we ran the demo pipeline in the previous section (code shown there).
+- `test.config`: A profile to run the pipeline with minimal test data, which we used when we ran the demo pipeline.
- `test_full.config`: A profile to run the pipeline with a full-sized test dataset.

-### 3.3. Documentation and related assets
+We will touch on a few of those files later in the course.

-At the top level, you can find a README file with summary information, as well as accessory files that summarize project information such as licensing, contribution guidelines, citation and code of conduct.
-
-Detailed pipeline documentation is located in the `docs` directory.
-This content is used to generate the web pages on the nf-core website.
-
-In addition to these human-readable documents, there are two JSON files that provide useful machine-readable information describing parameters and input requirements, `nextflow_schema.json` and `assets/schema_input.json`.
-
-The `nextflow_schema.json` is a file used to store information about the pipeline parameters including type, description and help text in a machine readable format.
-The schema is used for various purposes, including automated parameter validation, help text generation, and interactive parameter form rendering in UI interfaces.
- -```json title="assets/nextflow_schema.json (not showing full file)" linenums="1" -{ - "$schema": "https://json-schema.org/draft/2020-12/schema", - "$id": "https://raw.githubusercontent.com/nf-core/demo/master/nextflow_schema.json", - "title": "nf-core/demo pipeline parameters", - "description": "An nf-core demo pipeline", - "type": "object", - "$defs": { - "input_output_options": { - "title": "Input/output options", - "type": "object", - "fa_icon": "fas fa-terminal", - "description": "Define where the pipeline should find input data and save output data.", - "required": ["input", "outdir"], - "properties": { - "input": { - "type": "string", - "format": "file-path", - "exists": true, - "schema": "assets/schema_input.json", - "mimetype": "text/csv", - "pattern": "^\\S+\\.csv$", - "description": "Path to comma-separated file containing information about the samples in the experiment.", - "help_text": "You will need to create a design file with information about the samples in your experiment before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row. See [usage docs](https://nf-co.re/demo/usage#samplesheet-input).", - "fa_icon": "fas fa-file-csv" - }, - "outdir": { - "type": "string", - "format": "directory-path", - "description": "The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure.", - "fa_icon": "fas fa-folder-open" - }, - "email": { - "type": "string", - "description": "Email address for completion summary.", - "fa_icon": "fas fa-envelope", - "help_text": "Set this parameter to your e-mail address to get a summary e-mail with details of the run sent to you when the workflow exits. If set in your user config file (`~/.nextflow/config`) then you don't need to specify this on the command line for every run.", - "pattern": "^([a-zA-Z0-9_\\-\\.]+)@([a-zA-Z0-9_\\-\\.]+)\\.([a-zA-Z]{2,5})$" - }, - "multiqc_title": { - "type": "string", - "description": "MultiQC report title. Printed as page header, used for filename if not otherwise specified.", - "fa_icon": "fas fa-file-signature" - } - } - }, -(truncated) -``` - -The `schema_input.json` is a file used to define the input samplesheet structure. -Each column can have a type, pattern, description and help text in a machine readable format. - -```json title="assets/schema_input.json" linenums="1" -{ - "$schema": "https://json-schema.org/draft/2020-12/schema", - "$id": "https://raw.githubusercontent.com/nf-core/demo/master/assets/schema_input.json", - "title": "nf-core/demo pipeline - params.input schema", - "description": "Schema for the file provided with params.input", - "type": "array", - "items": { - "type": "object", - "properties": { - "sample": { - "type": "string", - "pattern": "^\\S+$", - "errorMessage": "Sample name must be provided and cannot contain spaces", - "meta": ["id"] - }, - "fastq_1": { - "type": "string", - "format": "file-path", - "exists": true, - "pattern": "^\\S+\\.f(ast)?q\\.gz$", - "errorMessage": "FastQ file for reads 1 must be provided, cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'" - }, - "fastq_2": { - "type": "string", - "format": "file-path", - "exists": true, - "pattern": "^\\S+\\.f(ast)?q\\.gz$", - "errorMessage": "FastQ file for reads 2 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'" - } - }, - "required": ["sample", "fastq_1"] - } -} -``` +### 3.3. 
Inputs and validation

+As we noted earlier when we examined the `nf-core/demo` pipeline's test profile, the pipeline is designed to take as input a samplesheet containing file paths and sample identifiers.
+The file paths link to real data located in the `nf-core/test-datasets` repository.

-An example samplesheet is provided under the `assets` directory:
+An example samplesheet is also provided under the `assets` directory, although the paths in this one are not real.

```csv title="assets/samplesheet.csv" linenums="1"
sample,fastq_1,fastq_2
SAMPLE_SINGLE_END,/path/to/fastq/files/AEG588A4_S4_L003_R1_001.fastq.gz,

```

-!!! note
+This particular samplesheet is fairly simple, but some pipelines run on samplesheets that are more complex, with a lot more metadata associated with the primary inputs.
+
+Unfortunately, because these files can be difficult to check by eye, improper formatting of input data is a very common source of pipeline failures.
+A related problem is when parameters are provided incorrectly.

-    The paths in this example samplesheet are not real.
-    For paths to real data files, you should look in the test profiles, which link to data in the `nf-core/test-datasets` repository.
+The solution to these problems is to run automated validation checks on all input files to ensure they contain the expected types of information, correctly formatted, and on parameters to ensure they are of the expected type.
+This is called input validation, and it should ideally be done _before_ trying to run a pipeline, rather than waiting for the pipeline to fail to find out there was a problem with the inputs.

-    In general, it's considered good practice to link out to example data rather than include it in the pipeline code repository, unless the example data is of trivial size (as is the case for the `greetings.csv` in the Hello Nextflow training series).
+Just like for configuration, the nf-core project is very opinionated about input validation, and recommends the use of the [nf-schema plugin](https://nextflow-io.github.io/nf-schema/latest/), a Nextflow plugin that provides comprehensive validation capabilities for Nextflow pipelines.
+
+We'll cover this topic in more detail in Part 5 of this course.
+For now, just be aware that there are two JSON files provided for that purpose, `nextflow_schema.json` and `assets/schema_input.json`.
+
+The `nextflow_schema.json` file stores information about the pipeline parameters, including type, description and help text, in a machine-readable format.
+This is used for various purposes, including automated parameter validation, help text generation, and interactive parameter form rendering in UI interfaces.
+
+The `schema_input.json` file defines the input samplesheet structure.
+Each column can have a type, pattern, description and help text in a machine-readable format.
+The schema is used for various purposes, including automated validation and providing helpful error messages.

### Takeaway

-You know what are the main components of an nf-core pipeline and how the code is organized, what are the main elements of configuration, and what are some additional sources of information that can be useful.
+You know what the main components of an nf-core pipeline are and how the code is organized; where the main elements of configuration are located; and what input validation is for.

### What's next?
Take a break! That was a lot. When you're feeling refreshed and ready, move on to the next section to apply what you've learned to write an nf-core compatible pipeline. + +!!! tip + + If you would like to learn how to compose workflows with subworkflows before moving on to the next part, check out the [Workflows of Workflows](../side_quests/workflows_of_workflows/) Side Quest. diff --git a/docs/hello_nf-core/02_rewrite_hello.md b/docs/hello_nf-core/02_rewrite_hello.md index 3971494397..49f87596f4 100644 --- a/docs/hello_nf-core/02_rewrite_hello.md +++ b/docs/hello_nf-core/02_rewrite_hello.md @@ -1,6 +1,6 @@ # Part 2: Rewrite Hello for nf-core -In this second part of the Hello nf-core training course, we show you how to create an nf-core compatible version of the pipeline produced by the [Hello Nextflow](../hello_nextflow/index.md) course. +In this second part of the Hello nf-core training course, we show you how to create an nf-core compatible version of the pipeline produced by the [Hello Nextflow](../hello_nextflow/index.md) beginners' course. You'll have noticed in the first section of the training that nf-core pipelines follow a fairly elaborate structure with a lot of accessory files. Creating all that from scratch would be very tedious, so the nf-core community has developed tooling to do it from a template instead, to bootstrap the process. @@ -86,70 +86,71 @@ View the contents of the new directory to see how much work you saved yourself b tree core-hello ``` -```console title="Output" -core-hello/ -├── assets -│ ├── samplesheet.csv -│ └── schema_input.json -├── conf -│ ├── base.config -│ ├── modules.config -│ ├── test.config -│ └── test_full.config -├── docs -│ ├── output.md -│ ├── README.md -│ └── usage.md -├── main.nf -├── modules.json -├── nextflow.config -├── nextflow_schema.json -├── README.md -├── subworkflows -│ ├── local -│ │ └── utils_nfcore_hello_pipeline -│ │ └── main.nf -│ └── nf-core -│ ├── utils_nextflow_pipeline -│ │ ├── main.nf -│ │ ├── meta.yml -│ │ └── tests -│ │ ├── main.function.nf.test -│ │ ├── main.function.nf.test.snap -│ │ ├── main.workflow.nf.test -│ │ ├── nextflow.config -│ │ └── tags.yml -│ ├── utils_nfcore_pipeline -│ │ ├── main.nf -│ │ ├── meta.yml -│ │ └── tests -│ │ ├── main.function.nf.test -│ │ ├── main.function.nf.test.snap -│ │ ├── main.workflow.nf.test -│ │ ├── main.workflow.nf.test.snap -│ │ ├── nextflow.config -│ │ └── tags.yml -│ └── utils_nfschema_plugin -│ ├── main.nf -│ ├── meta.yml -│ └── tests -│ ├── main.nf.test -│ ├── nextflow.config -│ └── nextflow_schema.json -└── workflows - └── hello.nf - -14 directories, 36 files -``` +??? 
example "Directory contents" + + ```console + core-hello/ + ├── assets + │ ├── samplesheet.csv + │ └── schema_input.json + ├── conf + │ ├── base.config + │ ├── modules.config + │ ├── test.config + │ └── test_full.config + ├── docs + │ ├── output.md + │ ├── README.md + │ └── usage.md + ├── main.nf + ├── modules.json + ├── nextflow.config + ├── nextflow_schema.json + ├── README.md + ├── subworkflows + │ ├── local + │ │ └── utils_nfcore_hello_pipeline + │ │ └── main.nf + │ └── nf-core + │ ├── utils_nextflow_pipeline + │ │ ├── main.nf + │ │ ├── meta.yml + │ │ └── tests + │ │ ├── main.function.nf.test + │ │ ├── main.function.nf.test.snap + │ │ ├── main.workflow.nf.test + │ │ └── nextflow.config + │ ├── utils_nfcore_pipeline + │ │ ├── main.nf + │ │ ├── meta.yml + │ │ └── tests + │ │ ├── main.function.nf.test + │ │ ├── main.function.nf.test.snap + │ │ ├── main.workflow.nf.test + │ │ ├── main.workflow.nf.test.snap + │ │ └── nextflow.config + │ └── utils_nfschema_plugin + │ ├── main.nf + │ ├── meta.yml + │ └── tests + │ ├── main.nf.test + │ ├── nextflow.config + │ └── nextflow_schema.json + └── workflows + └── hello.nf + + 14 directories, 34 files + ``` That's a lot of files! -Don't worry too much right now about what they all are; we are going to walk through the important parts together in the course of this training. +Hopefully you'll recognize a lot of them as the same we came across when we explored the `nf-core/demo` pipeline structure. +But don't worry if you're still feeling a little lost; we'll walk through the important parts together in the course of this training. !!! note One important difference compared to the `nf-core/demo` pipeline we examined in the first part of this training is that there is no `modules` directory. - This is because we didn't include any of the default nf-core modules. + This is because we didn't elect to include any of the default nf-core modules. ### 1.2. Test that the scaffold is functional @@ -159,37 +160,39 @@ Believe it or not, even though you haven't yet added any modules to make it do r nextflow run ./core-hello -profile docker,test --outdir core-hello-results ``` -```console title="Output" - N E X T F L O W ~ version 24.10.4 - -Launching `core-hello/main.nf` [special_ride] DSL2 - revision: c31b966b36 - -Downloading plugin nf-schema@2.2.0 -Input/output options - input : https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/samplesheet/samplesheet_test_illumina_amplicon.csv - outdir : core-hello-results - -Institutional config options - config_profile_name : Test profile - config_profile_description: Minimal test dataset to check pipeline function - -Generic options - trace_report_suffix : 2025-05-14_10-01-18 - -Core Nextflow options - runName : special_ride - containerEngine : docker - launchDir : /workspaces/training/hello-nf-core - workDir : /workspaces/training/hello-nf-core/work - projectDir : /workspaces/training/hello-nf-core/core-hello - userName : root - profile : docker,test - configFiles : - -!! Only displaying parameters that differ from the pipeline defaults !! ------------------------------------------------------- --[core/hello] Pipeline completed successfully -``` +??? 
example "Output" + + ```console + N E X T F L O W ~ version 25.04.3 + + Launching `./core-hello/main.nf` [insane_davinci] DSL2 - revision: b9e9b3b8de + + Downloading plugin nf-schema@2.5.1 + Input/output options + input : https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/samplesheet/samplesheet_test_illumina_amplicon.csv + outdir : core-hello-results + + Institutional config options + config_profile_name : Test profile + config_profile_description: Minimal test dataset to check pipeline function + + Generic options + trace_report_suffix : 2025-10-30_15-45-16 + + Core Nextflow options + runName : insane_davinci + containerEngine : docker + launchDir : /workspaces/training/hello-nf-core + workDir : /workspaces/training/hello-nf-core/work + projectDir : /workspaces/training/hello-nf-core/core-hello + userName : root + profile : docker,test + configFiles : /workspaces/training/hello-nf-core/core-hello/nextflow.config + + !! Only displaying parameters that differ from the pipeline defaults !! + ------------------------------------------------------ + -[core/hello] Pipeline completed successfully- + ``` This shows you that all the basic wiring is in place. You can take a look at the reports in the `pipeline_info` directory to see what was run; not much at all! @@ -202,7 +205,8 @@ You can take a look at the reports in the `pipeline_info` directory to see what ### 1.3. Examine the placeholder workflow If you look inside the `main.nf` file, you'll see it imports a workflow called `HELLO` from `workflows/hello`. -This is a placeholder workflow for our workflow of interest, with some nf-core functionality already in place. + +This is equivalent to the `workflows/demo.nf` workflow we encountered in Part 1, and serves as a placeholder workflow for our workflow of interest, with some nf-core functionality already in place. ```groovy title="core-hello/workflows/hello.nf" linenums="1" hl_lines="15 17 19 35" /* @@ -262,7 +266,7 @@ These are optional features of Nextflow that make the workflow **composable**, m !!! note "Composable workflows in depth" - The [Workflows of Workflows](../side_quests/workflows_of_workflows) side quest explores workflow composition in much greater depth, including how to compose multiple workflows together and manage complex data flows between them. We're introducing composability here because it's a fundamental requirement of the nf-core template architecture, which uses nested workflows to organize pipeline initialization, the main analysis workflow, and completion tasks into separate, reusable components. + The [Workflows of Workflows](../side_quests/workflows_of_workflows/) Side Quest explores workflow composition in much greater depth, including how to compose multiple workflows together and manage complex data flows between them. We're introducing composability here because it's a fundamental requirement of the nf-core template architecture, which uses nested workflows to organize pipeline initialization, the main analysis workflow, and completion tasks into separate, reusable components. We are going to need to plug the relevant logic from our workflow of interest into that structure. The first step for that is to make our original workflow composable. @@ -279,22 +283,45 @@ Learn how to make a simple workflow composable as a prelude to making it nf-core ## 2. Make the original Hello Nextflow workflow composable +Now it's time to get to work integrating our workflow into the nf-core scaffold. 
+As a reminder, we're working with the workflow featured in our [Hello Nextflow](../hello_nextflow/index.md) training course. + +??? example "What does the Hello Nextflow workflow do?" + + If you haven't done the [Hello Nextflow](../hello_nextflow/index.md) training, here's a quick overview of what this simple workflow does. + + The workflow takes a CSV file containing greetings, runs four consecutive transformation steps on them, and outputs a single text file containing an ASCII picture of a fun character saying the greetings. + + The four steps are implemented as Nextflow processes (`sayHello`, `convertToUpper`, `collectGreetings`, and `cowpy`) stored in separate module files. + + + + 1. **`sayHello`:** Writes each greeting to its own output file (e.g., "Hello-output.txt") + 2. **`convertToUpper`:** Converts each greeting to uppercase (e.g., "HELLO") + 3. **`collectGreetings`:** Collects all uppercase greetings into a single batch file + 4. **`cowpy`:** Generates ASCII art using the `cowpy` tool + +Importantly, the original Hello Nextflow was written as a simple unnamed workflow that can be run on its own. +In order to make it runnable from within a parent workflow as the nf-core template requires, we need to make it **composable**. + We provide you with a clean, fully functional copy of the completed Hello Nextflow workflow in the directory `original-hello` along with its modules and the default CSV file it expects to use as input. ```bash tree original-hello/ ``` -```console title="Output" -original-hello/ -├── hello.nf -├── modules -│ ├── collectGreetings.nf -│ ├── convertToUpper.nf -│ ├── cowpy.nf -│ └── sayHello.nf -└── nextflow.config -``` +??? example "Directory contents" + + ```console + original-hello/ + ├── hello.nf + ├── modules + │ ├── collectGreetings.nf + │ ├── convertToUpper.nf + │ ├── cowpy.nf + │ └── sayHello.nf + └── nextflow.config + ``` Feel free to run it to satisfy yourself that it works: @@ -302,20 +329,22 @@ Feel free to run it to satisfy yourself that it works: nextflow run original-hello/hello.nf ``` -```console title="Output" - N E X T F L O W ~ version 24.10.4 +??? example "Output" -Launching `original-hello/hello.nf` [goofy_babbage] DSL2 - revision: e9e72441e9 + ```console + N E X T F L O W ~ version 25.04.3 -executor > local (8) -[a4/081cec] sayHello (1) | 3 of 3 ✔ -[e7/7e9058] convertToUpper (3) | 3 of 3 ✔ -[0c/17263b] collectGreetings | 1 of 1 ✔ -[94/542280] cowpy | 1 of 1 ✔ -There were 3 greetings in this batch -``` + Launching `original-hello/hello.nf` [goofy_babbage] DSL2 - revision: e9e72441e9 + + executor > local (8) + [a4/081cec] sayHello (1) | 3 of 3 ✔ + [e7/7e9058] convertToUpper (3) | 3 of 3 ✔ + [0c/17263b] collectGreetings | 1 of 1 ✔ + [94/542280] cowpy | 1 of 1 ✔ + There were 3 greetings in this batch + ``` -Open the `hello.nf` workflow file to inspect the code, which is shown in full below (not counting the processes, which are in modules): +Now let's open the `hello.nf` workflow file to inspect the code, which is shown in full below (not counting the processes, which are in modules): ```groovy title="original-hello/hello.nf" linenums="1" #!/usr/bin/env nextflow @@ -512,7 +541,8 @@ That is going to be defined in the parent workflow, also called the **entrypoint ### 2.6. Make a dummy entrypoint workflow -We can make a dummy entrypoint workflow to test the composable workflow without yet having to deal with the rest of the complexity of the nf-core pipeline scaffold. 
+Before integrating our composable workflow into the complex nf-core scaffold, let's verify it works correctly.
+We can make a simple dummy entrypoint workflow to test the composable workflow in isolation.

-Create a blank file named `main.nf` in the same`original-hello` directory.
+Create a blank file named `main.nf` in the same `original-hello` directory.

@@ -547,7 +577,7 @@ workflow {

There are two important observations to make here:

-- The syntax for calling the imported workflow (line 16) is essentially the same as the syntax for calling modules.
+- The syntax for calling the imported workflow is essentially the same as the syntax for calling modules.
- Everything that is related to pulling the inputs into the workflow (input parameter and channel construction) is now declared in this parent workflow.

!!! note

@@ -575,19 +605,21 @@ nextflow run ./original-hello

If you made all the changes correctly, this should run to completion.

-```console title="Output"
-     N E X T F L O W   ~  version 24.10.4
+??? example "Output"

-Launching `original-hello/main.nf` [friendly_wright] DSL2 - revision: 1ecd2d9c0a
+    ```console
+     N E X T F L O W   ~  version 25.04.3

-executor >  local (8)
-[24/c6c0d8] HELLO:sayHello (3)       | 3 of 3 ✔
-[dc/721042] HELLO:convertToUpper (3) | 3 of 3 ✔
-[48/5ab2df] HELLO:collectGreetings   | 1 of 1 ✔
-[e3/693b7e] HELLO:cowpy              | 1 of 1 ✔
-There were 3 greetings in this batch
-Output: /workspaces/training/hello-nf-core/work/e3/693b7e48dc119d0c54543e0634c2e7/cowpy-COLLECTED-test-batch-output.txt
-```
+    Launching `original-hello/main.nf` [friendly_wright] DSL2 - revision: 1ecd2d9c0a
+
+    executor >  local (8)
+    [24/c6c0d8] HELLO:sayHello (3)       | 3 of 3 ✔
+    [dc/721042] HELLO:convertToUpper (3) | 3 of 3 ✔
+    [48/5ab2df] HELLO:collectGreetings   | 1 of 1 ✔
+    [e3/693b7e] HELLO:cowpy              | 1 of 1 ✔
+    There were 3 greetings in this batch
+    Output: /workspaces/training/hello-nf-core/work/e3/693b7e48dc119d0c54543e0634c2e7/cowpy-COLLECTED-test-batch-output.txt
+    ```

This means we've successfully upgraded our HELLO workflow to be composable.

@@ -595,10 +627,6 @@ This means we've successfully upgraded our HELLO workflow to be composable.

You know how to make a workflow composable by giving it a name and adding `take`, `main` and `emit` statements, and how to call it from an entrypoint workflow.

-!!! note
-
-    If you're interested in digging deeper into options for composing workflows of workflows, check out the [Workflow of Workflows](https://training.nextflow.io/latest/side_quests/workflows_of_workflows) (a.k.a. WoW) side quest.
-
### What's next?

Learn how to graft a basic composable workflow onto the nf-core scaffold.

@@ -607,10 +635,12 @@ Learn how to graft a basic composable workflow onto the nf-core scaffold.

## 3. Fit the updated workflow logic into the placeholder workflow

-This is the current content of the `HELLO` workflow in `core-hello/workflows/hello.nf`.
-Overall this code does very little aside from some housekeeping that has to do with capturing the version of any software tools that get run in the pipeline.
+Now that we've verified our composable workflow works correctly, let's return to the nf-core pipeline scaffold we created in section 1.
+We want to integrate the composable workflow we just developed into the nf-core template structure, so the end result should look something like this.
+
+

-We need to add the relevant code from the version of the original workflow that we made composable.
+So how do we make that happen? Let's have a look at the current content of the `HELLO` workflow in `core-hello/workflows/hello.nf` (the nf-core scaffold). 
```groovy title="core-hello/workflows/hello.nf" linenums="1"
/*
@@ -659,6 +689,10 @@ workflow HELLO {
*/
```

+Overall this code does very little aside from some housekeeping that has to do with capturing the versions of any software tools that get run in the pipeline.
+
+We need to add the relevant code from the composable version of the original workflow that we developed in section 2.
+
We're going to tackle this in the following stages:

1. Copy over the modules and set up module imports

@@ -672,9 +706,10 @@ We're going to tackle this in the following stages:

### 3.1. Copy the modules and set up module imports

-In the original workflow, the four processes are stored in modules, so we need to copy those over to this new project (into a new `local` directory) and add import statements to the workflow file.
+The four processes from our Hello Nextflow workflow are stored as modules in `original-hello/modules/`.
+We need to copy those modules into the nf-core project structure (under `core-hello/modules/local/`) and add import statements to the nf-core workflow file.

-First let's copy the module files over:
+First let's copy the module files from `original-hello/` to `core-hello/`:

```bash
mkdir -p core-hello/modules/local/
@@ -687,14 +722,16 @@ You should now see the directory of modules listed under `core-hello/`.

tree core-hello/modules
```

-```console title="Output"
-core-hello/modules
-└── local
-    ├── collectGreetings.nf
-    ├── convertToUpper.nf
-    ├── cowpy.nf
-    └── sayHello.nf
-```
+??? example "Directory contents"
+
+    ```console
+    core-hello/modules
+    └── local
+        ├── collectGreetings.nf
+        ├── convertToUpper.nf
+        ├── cowpy.nf
+        └── sayHello.nf
+    ```

Now let's set up the module import statements.

@@ -780,16 +817,23 @@ As a reminder, this is the relevant code in the original workflow, which didn't

    cowpy(collectGreetings.out.outfile, params.character)
    ```

-We need to copy this code into the new version of the workflow (minus the `main:` keyword which is already there).
+We need to copy this code into the new version of the workflow, with a few modifications:
+
+- Omit the `main:` keyword (it's already there)
+- Remove the `.view` line from the code shown above; it was just there to print console output in the standalone version

-There is already some code in there that has to do with capturing the versions of the tools that get run by the workflow. We're going to leave that alone for now (we'll deal with the tool versions later) and simply insert our code right after the `main:` line.
+There is already some code in there that has to do with capturing the versions of the tools that get run by the workflow. We're going to leave that alone for now (we'll deal with the tool versions later).
+We'll keep the `ch_versions = channel.empty()` initialization at the top, then insert our workflow logic, keeping the version collation code at the end.
+This ordering makes sense because in a real pipeline, the processes would emit version information that would be added to the `ch_versions` channel as the workflow runs. 
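+
+For illustration, here is a hypothetical sketch of what that convention typically looks like in nf-core pipelines whose modules emit version files (our local modules don't emit a `versions.yml` yet, so treat this as a sketch rather than code to paste in):
+
+```groovy title="Hypothetical sketch of the version-mixing pattern"
+// start with an empty channel that will accumulate version info
+ch_versions = channel.empty()
+
+// call a module (pretending here that sayHello emits a versions.yml output)
+sayHello(greeting_ch)
+
+// mix that module's versions output into the aggregate channel
+ch_versions = ch_versions.mix(sayHello.out.versions)
+```
+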
=== "After" - ```groovy title="core-hello/workflows/hello.nf" linenums="23" + ```groovy title="core-hello/workflows/hello.nf" linenums="23" hl_lines="5-16" main: + ch_versions = channel.empty() + // emit a greeting sayHello(greeting_ch) @@ -799,14 +843,9 @@ There is already some code in there that has to do with capturing the versions o // collect all the greetings into one file collectGreetings(convertToUpper.out.collect(), params.batch) - // emit a message about the size of the batch - collectGreetings.out.count.view { "There were $it greetings in this batch" } - // generate ASCII art of the greetings with cowpy cowpy(collectGreetings.out.outfile, params.character) - ch_versions = channel.empty() - // // Collate and save software versions // @@ -878,6 +917,9 @@ Finally, we need to update the `emit` block to include the declaration of the wo ``` This concludes the modifications we need to make to the HELLO workflow itself. +At this point, we have achieved the overall code structure we set out to implement. + + ### Takeaway @@ -885,13 +927,14 @@ You know how to fit the core pieces of a composable workflow into an nf-core pla ### What's next? -Learn how to adapt how the inputs are handle in the nf-core pipeline scaffold. +Learn how to adapt how the inputs are handled in the nf-core pipeline scaffold. --- ## 4. Adapt the input handling -Now that the HELLO workflow is ready to go, we need to adapt how the inputs are handled to make sure our `greetings.csv` will be handled appropriately. +Now that we've successfully integrated our workflow logic into the nf-core scaffold, we need to address one more critical piece: ensuring that our input data is processed correctly. +The nf-core template comes with sophisticated input handling designed for complex genomics datasets, so we need to adapt it to work with our simpler `greetings.csv` file. ### 4.1. Identify where inputs are handled @@ -998,7 +1041,7 @@ A bit of poking around reveals that the input handling is done by the `PIPELINE_ If we open up `core-hello/subworkflows/local/utils_nfcore_hello_pipeline/main.nf` and scroll down, we come to this chunk of code: -```groovy title="core-hello/subworkflows/local/utils_nfcore_hello_pipeline/main.nf" linenums="64" +```groovy title="core-hello/subworkflows/local/utils_nfcore_hello_pipeline/main.nf" linenums="76" // // Create channel from input file provided through params.input // @@ -1054,21 +1097,22 @@ As a reminder, this is what the channel construction looked like (as seen in the ```groovy title="solutions/composable-hello/main.nf" linenums="10" hl_lines="4" // create a channel for inputs from a CSV file greeting_ch = channel.fromPath(params.greeting) - .splitCsv() - .map { line -> line[0] } + .splitCsv() + .map { line -> line[0] } ``` So we just need to plug that into the initialisation workflow, with minor changes: we update the channel name from `greeting_ch` to `ch_samplesheet`, and the parameter name from `params.greeting` to `params.input` (see highlighted line). 
=== "After" - ```groovy title="core-hello/subworkflows/local/utils_nfcore_hello_pipeline/main.nf" linenums="64" hl_lines="4" + ```groovy title="core-hello/subworkflows/local/utils_nfcore_hello_pipeline/main.nf" linenums="76" hl_lines="5-7" // // Create channel from input file provided through params.input // + ch_samplesheet = channel.fromPath(params.input) - .splitCsv() - .map { line -> line[0] } + .splitCsv() + .map { line -> line[0] } emit: samplesheet = ch_samplesheet @@ -1077,7 +1121,7 @@ So we just need to plug that into the initialisation workflow, with minor change === "Before" - ```groovy title="core-hello/subworkflows/local/utils_nfcore_hello_pipeline/main.nf" linenums="64" + ```groovy title="core-hello/subworkflows/local/utils_nfcore_hello_pipeline/main.nf" linenums="76" hl_lines="5-23" // // Create channel from input file provided through params.input // @@ -1116,10 +1160,12 @@ For now, we're focused on keeping it as simple as possible to get to something w Speaking of test data and parameters, let's update the test profile for this pipeline to use the `greetings.csv` mini-samplesheet instead of the example samplesheet provided in the template. -Under `core-hello/config`, we find two templated test profiles: `test.config` and `test_full.config`, which are meant to test a small data sample and a full-size one. +Under `core-hello/conf`, we find two templated test profiles: `test.config` and `test_full.config`, which are meant to test a small data sample and a full-size one. Given the purpose of our pipeline, there's not really a point to setting up a full-size test profile, so feel free to ignore or delete `test_full.config`. We're going to focus on setting up `test.config` to run on our `greetings.csv` file with a few default parameters. +#### 4.3.1. Copy over the `greetings.csv` file + First we need to copy the `greetings.csv` file to an appropriate place in our pipeline project. Typically small test files are stored in the `assets` directory, so let's copy the file over from our working directory. @@ -1127,36 +1173,40 @@ Typically small test files are stored in the `assets` directory, so let's copy t cp greetings.csv core-hello/assets/. ``` +Now the `greetings.csv` file is ready to be used as test input. + +#### 4.3.2. 
Update the `test.config` file + Now we can update the `test.config` file as follows: === "After" - ```groovy title="core-hello/conf/test.config" linenums="21" hl_lines="5-10" - params { - config_profile_name = 'Test profile' - config_profile_description = 'Minimal test dataset to check pipeline function' + ```groovy title="core-hello/conf/test.config" linenums="21" hl_lines="6-10" + params { + config_profile_name = 'Test profile' + config_profile_description = 'Minimal test dataset to check pipeline function' - // Input data - input = "${projectDir}/assets/greetings.csv" + // Input data + input = "${projectDir}/assets/greetings.csv" - // Other parameters - batch = 'test' - character = 'tux' - } + // Other parameters + batch = 'test' + character = 'tux' + } ``` === "Before" - ```groovy title="core-hello/conf/test.config" linenums="21" - params { - config_profile_name = 'Test profile' - config_profile_description = 'Minimal test dataset to check pipeline function' + ```groovy title="core-hello/conf/test.config" linenums="21" hl_lines="6-8" + params { + config_profile_name = 'Test profile' + config_profile_description = 'Minimal test dataset to check pipeline function' - // Input data - // TODO nf-core: Specify the paths to your test data on nf-core/test-datasets - // TODO nf-core: Give any required params for the test so that command line flags are not needed - input = params.pipelines_testdata_base_path + 'viralrecon/samplesheet/samplesheet_test_illumina_amplicon.csv' - } + // Input data + // TODO nf-core: Specify the paths to your test data on nf-core/test-datasets + // TODO nf-core: Give any required params for the test so that command line flags are not needed + input = params.pipelines_testdata_base_path + 'viralrecon/samplesheet/samplesheet_test_illumina_amplicon.csv' + } ``` Key points: @@ -1169,7 +1219,7 @@ And while we're at it, let's lower the default resource limitations: === "After" - ```groovy title="core-hello/conf/test.config" linenums="13" + ```groovy title="core-hello/conf/test.config" linenums="13" hl_lines="3 4" process { resourceLimits = [ cpus: 2, @@ -1181,7 +1231,7 @@ And while we're at it, let's lower the default resource limitations: === "Before" - ```groovy title="core-hello/conf/test.config" linenums="13" + ```groovy title="core-hello/conf/test.config" linenums="13" hl_lines="3 4" process { resourceLimits = [ cpus: 4, @@ -1204,42 +1254,44 @@ nextflow run core-hello --outdir core-hello-results -profile test,docker --valid If you've done all of the modifications correctly, it should run to completion. -```console title="Output" - N E X T F L O W ~ version 24.10.4 - -Launching `core-hello/main.nf` [agitated_noyce] DSL2 - revision: c31b966b36 - -Input/output options - input : core-hello/assets/greetings.csv - outdir : core-hello-results - -Institutional config options - config_profile_name : Test profile - config_profile_description: Minimal test dataset to check pipeline function - -Generic options - validate_params : false - trace_report_suffix : 2025-05-14_11-10-22 - -Core Nextflow options - runName : agitated_noyce - containerEngine : docker - launchDir : /workspaces/training/hello-nf-core - workDir : /workspaces/training/hello-nf-core/work - projectDir : /workspaces/training/hello-nf-core/core-hello - userName : root - profile : test,docker - configFiles : - -!! Only displaying parameters that differ from the pipeline defaults !! 
------------------------------------------------------- -executor > local (8) -[d6/b59dca] CORE_HELLO:HELLO:sayHello (1) | 3 of 3 ✔ -[0b/42f9a1] CORE_HELLO:HELLO:convertToUpper (2) | 3 of 3 ✔ -[73/bec621] CORE_HELLO:HELLO:collectGreetings | 1 of 1 ✔ -[3f/e0a67a] CORE_HELLO:HELLO:cowpy | 1 of 1 ✔ --[core/hello] Pipeline completed successfully- -``` +??? example "Output" + + ```console + N E X T F L O W ~ version 25.04.3 + + Launching `core-hello/main.nf` [small_torvalds] DSL2 - revision: b9e9b3b8de + + Input/output options + input : /workspaces/training/hello-nf-core/core-hello/assets/greetings.csv + outdir : core-hello-results + + Institutional config options + config_profile_name : Test profile + config_profile_description: Minimal test dataset to check pipeline function + + Generic options + validate_params : false + trace_report_suffix : 2025-10-30_18-05-47 + + Core Nextflow options + runName : small_torvalds + containerEngine : docker + launchDir : /workspaces/training/hello-nf-core + workDir : /workspaces/training/hello-nf-core/work + projectDir : /workspaces/training/hello-nf-core/core-hello + userName : root + profile : test,docker + configFiles : /workspaces/training/hello-nf-core/core-hello/nextflow.config + + !! Only displaying parameters that differ from the pipeline defaults !! + ------------------------------------------------------ + executor > local (8) + [da/fe2e20] COR…LLO:sayHello (1) | 3 of 3 ✔ + [f5/4e47cf] COR…nvertToUpper (2) | 3 of 3 ✔ + [22/61caea] COR…collectGreetings | 1 of 1 ✔ + [a8/de5051] COR…ELLO:HELLO:cowpy | 1 of 1 ✔ + -[core/hello] Pipeline completed successfully- + ``` As you can see, this produced the typical nf-core summary at the start thanks to the initialisation subworkflow, and the lines for each module now show the full PIPELINE:WORKFLOW:module names. @@ -1254,47 +1306,53 @@ We didn't change anything to the modules themselves, so the outputs handled by m tree results ``` -```console title="Output" -results -├── Bonjour-output.txt -├── COLLECTED-test-batch-output.txt -├── COLLECTED-test-output.txt -├── cowpy-COLLECTED-test-batch-output.txt -├── cowpy-COLLECTED-test-output.txt -├── Hello-output.txt -├── Holà-output.txt -├── UPPER-Bonjour-output.txt -├── UPPER-Hello-output.txt -└── UPPER-Holà-output.txt -``` +??? example "Directory contents" + + ```console + results + ├── Bonjour-output.txt + ├── COLLECTED-test-batch-output.txt + ├── COLLECTED-test-output.txt + ├── cowpy-COLLECTED-test-batch-output.txt + ├── cowpy-COLLECTED-test-output.txt + ├── Hello-output.txt + ├── Holà-output.txt + ├── UPPER-Bonjour-output.txt + ├── UPPER-Hello-output.txt + └── UPPER-Holà-output.txt + ``` Anything that is hooked up to the nf-core template code gets put into a directory generated automatically, called `core-hello-results/`. -This includes the various reports produced by the nf-core utility subworkflows, which you can find under `core-hello-results/pipeline_info`. +This includes various execution reports and metadata that you can find under `core-hello-results/pipeline_info`. 
```bash tree core-hello-results ``` -```console title="Output" -core-hello-results -└── pipeline_info - ├── execution_report_2025-06-03_18-22-28.html - ├── execution_report_2025-06-03_20-11-39.html - ├── execution_timeline_2025-06-03_18-22-28.html - ├── execution_timeline_2025-06-03_20-11-39.html - ├── execution_trace_2025-06-03_18-22-28.txt - ├── execution_trace_2025-06-03_20-10-11.txt - ├── execution_trace_2025-06-03_20-11-39.txt - ├── hello_software_versions.yml - ├── params_2025-06-03_18-22-32.json - ├── params_2025-06-03_20-10-15.json - ├── params_2025-06-03_20-11-43.json - ├── pipeline_dag_2025-06-03_18-22-28.html - └── pipeline_dag_2025-06-03_20-11-39.html -``` +??? example "Directory contents" + + ```console + core-hello-results + └── pipeline_info + ├── execution_report_2025-06-03_18-22-28.html + ├── execution_report_2025-06-03_20-11-39.html + ├── execution_timeline_2025-06-03_18-22-28.html + ├── execution_timeline_2025-06-03_20-11-39.html + ├── execution_trace_2025-06-03_18-22-28.txt + ├── execution_trace_2025-06-03_20-10-11.txt + ├── execution_trace_2025-06-03_20-11-39.txt + ├── hello_software_versions.yml + ├── params_2025-06-03_18-22-32.json + ├── params_2025-06-03_20-10-15.json + ├── params_2025-06-03_20-11-43.json + ├── pipeline_dag_2025-06-03_18-22-28.html + └── pipeline_dag_2025-06-03_20-11-39.html + ``` In our case, we didn't explicitly mark anything else as an output, so there's nothing else there. + + And there it is! It may seem like a lot of work to accomplish the same result as the original pipeline, but you do get all those lovely reports generated automatically, and you now have a solid foundation for taking advantage of additional features of nf-core, including input validation and some neat metadata handling capabilities that we'll cover in a later section. --- diff --git a/docs/hello_nf-core/03_use_module.md b/docs/hello_nf-core/03_use_module.md index 85118b8d8d..8a2547eea9 100644 --- a/docs/hello_nf-core/03_use_module.md +++ b/docs/hello_nf-core/03_use_module.md @@ -2,18 +2,20 @@ In this third part of the Hello nf-core training course, we show you how to find, install, and use an existing nf-core module in your pipeline. -One of the great advantages of nf-core pipelines is the ability to leverage pre-built, tested modules from the [nf-core/modules](https://github.com/nf-core/modules) repository. Rather than writing every process from scratch, you can install and use community-maintained modules that follow best practices. You can browse available modules at [nf-co.re/modules](https://nf-co.re/modules). +One of the great benefits of working with nf-core is the ability to leverage pre-built, tested modules from the [nf-core/modules](https://github.com/nf-core/modules) repository. +Rather than writing every process from scratch, you can install and use community-maintained modules that follow best practices. -In this section, we'll replace the custom `collectGreetings` module with the `cat/cat` module from nf-core/modules. +To demonstrate how this works, we'll replace the custom `collectGreetings` module with the `cat/cat` module from nf-core/modules in the `core-hello` pipeline. !!! note This section assumes you have completed [Part 2: Rewrite Hello for nf-core](./02_rewrite_hello.md) and have a working `core-hello` pipeline. + If you did not complete Part 2 or want to start fresh for this section, you can use the `core-hello-part2` solution as your starting point. 
- If you didn't complete Part 2 or want to start fresh for this section, you can use the `core-hello-part2` solution as your starting point: + Run this command from within the `hello-nf-core/` directory: ```bash - cp -r hello-nf-core/solutions/core-hello-part2 core-hello + cp -r solutions/core-hello-part2 core-hello cd core-hello ``` @@ -21,40 +23,52 @@ In this section, we'll replace the custom `collectGreetings` module with the `ca --- -## 1. Use an nf-core module +## 1. Find and install a suitable nf-core module -First, let's learn how to find, install, and use an existing nf-core module in our pipeline. +First, let's learn how to find an existing nf-core module and install it into our pipeline. -The `collectGreetings` process in our pipeline uses the Unix `cat` command to concatenate multiple greeting files into one. This is a perfect use case for the nf-core `cat/cat` module, which is designed specifically for concatenating files. +We'll aim to replace the `collectGreetings` process, which uses the Unix `cat` command to concatenate multiple greeting files into one. +Concatenating files is a very common operation, so it stands to reason that there might already be a module in nf-core designed for that purpose. -!!! note "Module naming convention" - - nf-core modules follow the naming convention `software/command`. The `cat/cat` module wraps the `cat` command from the `cat` software package. Other examples include `fastqc/fastqc` (FastQC software, fastqc command) or `samtools/view` (samtools software, view command). +Let's dive in. ### 1.1. Browse available modules on the nf-core website The nf-core project maintains a centralized catalog of modules at [https://nf-co.re/modules](https://nf-co.re/modules). -Navigate to the modules page in your web browser and use the search bar to search for "cat_cat". +Navigate to the modules page in your web browser and use the search bar to search for 'concatenate'. + +![module search results](./img/module-search-results.png) + +As you can see, there are quite a few results, many of them modules designed to concatenate very specific types of files. +Among them, you should see one called `cat_cat` that is general-purpose. + +!!! note "Module naming convention" + + The underscore (`_`) is used as a stand-in for the slash (`/`) character in module names. + + nf-core modules follow the naming convention `software/command` when a tool provides multiple commands, like `samtools/view` (samtools package, view command) or `gatk/haplotypecaller` (GATK package, HaplotypeCaller command). + For tools that provide only one main command, modules use a single level like `fastqc` or `multiqc`. -You should see `cat/cat` in the search results. Click on it to view the module documentation. +Click on the `cat_cat` module box to view the module documentation. The module page shows: -- A description: "A module for concatenation of gzipped or uncompressed files" +- A short description: "A module for concatenation of gzipped or uncompressed files" - Installation command: `nf-core modules install cat/cat` - Input and output channel structure - Available parameters ### 1.2. List available modules from the command line -You can also search for modules directly from the command line using nf-core tools. +Alternatively, you can also search for modules directly from the command line using nf-core tools. ```bash nf-core modules list remote ``` -This will display a list of all available modules in the nf-core/modules repository. 
You can scroll through or pipe to `grep` to find specific modules:
+This will display a list of all available modules in the nf-core/modules repository, though it's a little less convenient if you don't already know the name of the module you're searching for.
+However, if you do, you can pipe the list to `grep` to find specific modules:

```bash
nf-core modules list remote | grep 'cat/cat'

@@ -64,9 +78,11 @@ nf-core modules list remote | grep 'cat/cat'
cat/cat
```

+Just keep in mind that the `grep` approach will only pull out results with the search term in their name, so it would not have worked for our earlier 'concatenate' search.
+
### 1.3. Get detailed information about the module

-To see detailed information about a specific module, use the `info` command:
+To see detailed information about a specific module from the command line, use the `info` command:

```bash
nf-core modules info cat/cat
@@ -74,11 +90,64 @@ nf-core modules info cat/cat

This displays documentation about the module, including its inputs, outputs, and basic usage information.

-### 1.4. Install and verify the cat/cat module
+??? example "Output"
+
+    ```console
+
+                                          ,--./,-.
+          ___     __   __   __   ___     /,-._.--~\
+    |\ | |__  __ /  ` /  \ |__) |__         }  {
+    | \| |       \__, \__/ |  \ |___     \`-._,-`-,
+                                          `._,._,'
+
+    nf-core/tools version 3.4.1 - https://nf-co.re
+
+
+    ╭─ Module: cat/cat  ─────────────────────────────────────────────────╮
+    │ 🌐 Repository: https://github.com/nf-core/modules.git │
+    │ 🔧 Tools: cat │
+    │ 📖 Description: A module for concatenation of gzipped or │
+    │ uncompressed files │
+    ╰────────────────────────────────────────────────────────────────────╯
+    ╷ ╷
+    📥 Inputs │Description │Pattern
+    ╺━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━╸
+    input[0] │ │
+    ╶─────────────────┼──────────────────────────────────────────┼───────╴
+    meta (map) │Groovy Map containing sample information │
+    │e.g. [ id:'test', single_end:false ] │
+    ╶─────────────────┼──────────────────────────────────────────┼───────╴
+    files_in (file)│List of compressed / uncompressed files │ *
+    ╵ ╵
+    ╷ ╷
+    📥 Outputs │Description │ Pattern
+    ╺━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━╸
+    file_out │ │
+    ╶─────────────────────┼─────────────────────────────────┼────────────╴
+    meta (map) │Groovy Map containing sample │
+    │information │
+    ╶─────────────────────┼─────────────────────────────────┼────────────╴
+    ${prefix} (file) │Concatenated file. Will be │ ${file_out}
+    │gzipped if file_out ends with │
+    │".gz" │
+    ╶─────────────────────┼─────────────────────────────────┼────────────╴
+    versions │ │
+    ╶─────────────────────┼─────────────────────────────────┼────────────╴
+    versions.yml (file)│File containing software versions│versions.yml
+    ╵ ╵

+    💻 Installation command: nf-core modules install cat/cat

-!!! note
+    ```
+
+This is the exact same information you can find on the website.
+
+### 1.4. Install the cat/cat module

-    Make sure you are in the `core-hello` directory (your pipeline root) in your terminal before running the module installation command.
+Now that we've found the module we want, we need to add it to our pipeline's source code.
+
+The good news is that the nf-core project includes some tooling to make this part easy.
+Specifically, the `nf-core modules install` command retrieves the code and makes it available to your project in a single step. 
Navigate to your pipeline directory and run the installation command: @@ -87,12 +156,53 @@ cd core-hello nf-core modules install cat/cat ``` -The tool will prompt you to confirm the installation. Press Enter to accept the default options. +The tool will first prompt you to specify a repository type. + +??? example "Output" + + ```console + + ,--./,-. + ___ __ __ __ ___ /,-._.--~\ + |\ | |__ __ / ` / \ |__) |__ } { + | \| | \__, \__/ | \ |___ \`-._,-`-, + `._,._,' + + nf-core/tools version 3.4.1 - https://nf-co.re -```console title="Output" -INFO Installing 'cat/cat' -INFO Include statement: include { CAT_CAT } from '../modules/nf-core/cat/cat/main' -``` + + WARNING 'repository_type' not defined in .nf-core.yml + ? Is this repository a pipeline or a modules repository? (Use arrow keys) + » Pipeline + Modules repository + ``` + +Press enter to accept the default response (`Pipeline`) and continue. + +The tool will then offer to amend the configuration of your project to avoid this prompt in the future. + +??? example "Output" + + ```console + INFO To avoid this prompt in the future, add the 'repository_type' key to your .nf-core.yml file. + ? Would you like me to add this config now? [y/n] (y): + ``` + +Might as well take advantage of this convenient tooling! +Press enter to accept the default response (yes). + +Finally, the tool will proceed to install the module. + +??? example "Output" + + ```console + INFO Config added to '.nf-core.yml' + INFO Reinstalling modules found in 'modules.json' but missing from directory: + INFO Installing 'cat/cat' + INFO Use the following statement to include this module: + + include { CAT_CAT } from '../modules/nf-core/cat/cat/main' + ``` The command automatically: @@ -100,50 +210,84 @@ The command automatically: - Updates `modules.json` to track the installed module - Provides you with the correct `include` statement to use in your workflow +!!! note + + Always make sure your current working directory is the root of your pipeline project before running the module installation command. + Let's check that the module was installed correctly: ```bash -tree modules/nf-core/cat +tree -L 4 modules ``` -```console title="Output" -modules/nf-core/cat -└── cat - ├── environment.yml - ├── main.nf - ├── meta.yml - └── tests - ├── main.nf.test - ├── main.nf.test.snap - ├── nextflow.config - └── tags.yml -``` +??? 
example "Directory contents" + + ```console + modules + ├── local + │ ├── collectGreetings.nf + │ ├── convertToUpper.nf + │ ├── cowpy.nf + │ └── sayHello.nf + └── nf-core + └── cat + └── cat + ├── environment.yml + ├── main.nf + ├── meta.yml + └── tests + + 5 directories, 7 files + ``` -You can also verify the installation by listing locally installed modules: +You can also verify the installation by asking the nf-core utility to list locally installed modules: ```bash nf-core modules list local ``` ```console title="Output" +INFO Repository type: pipeline INFO Modules installed in '.': -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ -┃ Module Name ┃ Repository ┃ -┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩ -│ cat/cat │ nf-core/modules │ -└────────────────────────────┴─────────────────────────────┘ +┏━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓ +┃ Module Name ┃ Repository ┃ Version SHA ┃ Message ┃ Date ┃ +┡━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩ +│ cat/cat │ nf-core/modules │ 41dfa3f │ update meta.yml of all modules (#8747) │ 2025-07-07 │ +└─────────────┴─────────────────┴─────────────┴────────────────────────────────────────┴────────────┘ ``` -### 1.5. Add the import statement to your workflow +This confirms that the `cat/cat` module is now part of your project's source code. -Open [core-hello/workflows/hello.nf](core-hello/workflows/hello.nf) and add the `include` statement for the `CAT_CAT` module in the imports section. +However, to actually use the new module, we need to import it into our pipeline. -The nf-core convention is to use uppercase for module names when importing them. +### 1.5. Update the module imports + +Let's replace the `include` statement for the `collectGreetings` module with the one for `CAT_CAT` in the imports section of the `workflows/hello.nf` workflow. + +!!! note + + You can optionally delete the `collectGreetings.nf` file: + + ```bash + rm modules/local/collectGreetings.nf + ``` + + However, you might want to keep it as a reference for understanding the differences between local and nf-core modules. + +As a reminder, the module install tool gave us the exact statement to use: + +```groovy title="Import statement produced by install command" +include { CAT_CAT } from '../modules/nf-core/cat/cat/main'` +``` + +Note that the nf-core convention is to use uppercase for module names when importing them. + +Open up [core-hello/workflows/hello.nf](core-hello/workflows/hello.nf) and make the following substitution: === "After" - ```groovy title="core-hello/workflows/hello.nf" linenums="1" hl_lines="12" + ```groovy title="core-hello/workflows/hello.nf" linenums="1" hl_lines="10" /* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ IMPORT MODULES / SUBWORKFLOWS / FUNCTIONS @@ -153,14 +297,13 @@ The nf-core convention is to use uppercase for module names when importing them. 
include { softwareVersionsToYAML } from '../subworkflows/nf-core/utils_nfcore_pipeline' include { sayHello } from '../modules/local/sayHello.nf' include { convertToUpper } from '../modules/local/convertToUpper.nf' - include { collectGreetings } from '../modules/local/collectGreetings.nf' - include { cowpy } from '../modules/local/cowpy.nf' include { CAT_CAT } from '../modules/nf-core/cat/cat/main' + include { cowpy } from '../modules/local/cowpy.nf' ``` === "Before" - ```groovy title="core-hello/workflows/hello.nf" linenums="1" + ```groovy title="core-hello/workflows/hello.nf" linenums="1" hl_lines="10" /* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ IMPORT MODULES / SUBWORKFLOWS / FUNCTIONS @@ -174,42 +317,44 @@ The nf-core convention is to use uppercase for module names when importing them. include { cowpy } from '../modules/local/cowpy.nf' ``` -Note how the path for the nf-core module differs from the local modules: +Notice how the path for the nf-core module differs from the local modules: -- **nf-core module**: `'../modules/nf-core/cat/cat/main'` (includes the tool name twice and references `main.nf`) +- **nf-core module**: `'../modules/nf-core/cat/cat/main'` (references `main.nf`) - **Local module**: `'../modules/local/collectGreetings.nf'` (single file reference) -### 1.6. Examine the cat/cat module interface +The module is now available to the workflow, so all we need to do is swap out the call to `collectGreetings` to use `CAT_CAT`. Right? -Let's look at the `cat/cat` module's main.nf file to understand its interface: +Not so fast. -```bash -head -30 modules/nf-core/cat/cat/main.nf -``` +At this point, you might be tempted to dive in and start editing code, but it's worth taking a moment to examine carefully what the new module expects and what it produces. -The key parts of the module are: +We're going to tackle that as a separate section because it involves a new mechanism we haven't covered yet: metadata maps. -```groovy title="modules/nf-core/cat/cat/main.nf (excerpt)" linenums="1" hl_lines="6 9" -process CAT_CAT { - tag "$meta.id" - label 'process_single' +### Takeaway - input: - tuple val(meta), path(files_in) +You know how to find an nf-core module and make it available to your project. - output: - tuple val(meta), path("${prefix}"), emit: file_out - path "versions.yml" , emit: versions -``` +### What's next? -The module expects: +Assess what a new module requires and identify any important changes needed in order to integrate it into a pipeline. -- **Input**: A tuple containing metadata (`meta`) and input file(s) (`files_in`) -- **Output**: A tuple containing metadata and the concatenated output file, plus a versions file +--- -### 1.7. Compare with collectGreetings interface +## 2. Assess the requirements of the new module -Our custom `collectGreetings` module has a simpler interface: +Specifically, we need to examine the **interface** of the module, i.e. its input and output definitions, and compare it to the interface of the module we're seeking to replace. +This will allow us to determine whether we can just treat the new module as a drop-in replacement or whether we'll need to adapt some of the wiring. + +Ideally this is something you should do _before_ you even install the module, but hey, better late than never. +(For what it's worth, there is an `uninstall` command to get rid of modules you decide you no longer want.) + +!!! 
note
+
+    The CAT_CAT process includes some rather clever handling of different compression types, file extensions and so on that aren't strictly relevant to what we're trying to show you here, so we'll ignore most of it and focus only on the parts that are important.
+
+### 2.1. Compare the two modules' interfaces
+
+As a reminder, this is what the interface to our `collectGreetings` module looks like:

```groovy title="modules/local/collectGreetings.nf (excerpt)" linenums="1" hl_lines="6-7 10"
process collectGreetings {
@@ -224,17 +369,80 @@ process collectGreetings {
    path "COLLECTED-${batch_name}-output.txt" , emit: outfile
```

-The main differences are:
+The `collectGreetings` module takes two inputs:

-- `CAT_CAT` requires a metadata map (`tuple val(meta), path(files_in)`), while `collectGreetings` takes separate `path` and `val` inputs
-- `CAT_CAT` outputs a tuple with metadata, while `collectGreetings` outputs a simple path
-- `CAT_CAT` uses `meta.id` for the filename prefix, while `collectGreetings` uses the `batch_name` parameter
+- `input_files` contains one or more input files to process;
+- `batch_name` is a value that we use to assign a run-specific name to the output file, which is a form of metadata.

-### 1.8. Understanding metadata maps
+Upon completion, `collectGreetings` outputs a single file path, emitted with the `outfile` tag.

-You've just seen that `CAT_CAT` expects inputs and outputs structured as tuples with metadata:
+In comparison, the `cat/cat` module's interface is more complex:

-```groovy
+```groovy title="modules/nf-core/cat/cat/main.nf (excerpt)" linenums="1" hl_lines="11 14"
+process CAT_CAT {
+    tag "$meta.id"
+    label 'process_low'
+
+    conda "${moduleDir}/environment.yml"
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/pigz:2.3.4' :
+        'biocontainers/pigz:2.3.4' }"
+
+    input:
+    tuple val(meta), path(files_in)
+
+    output:
+    tuple val(meta), path("${prefix}"), emit: file_out
+    path "versions.yml"               , emit: versions
+```
+
+The CAT_CAT module takes a single input, but that input is a tuple containing two things:
+
+- `meta` is a structure containing metadata, called a metamap;
+- `files_in` contains one or more input files to process, equivalent to `collectGreetings`'s `input_files`.
+
+Upon completion, CAT_CAT delivers its outputs in two parts:
+
+- Another tuple containing the metamap and the concatenated output file, emitted with the `file_out` tag;
+- A `versions.yml` file that captures information about the software version that was used, emitted with the `versions` tag.
+
+Note also that by default, the output file will be named based on an identifier that is part of the metadata (code not shown here).
+
+This may seem like a lot to keep track of just looking at the code, so here's a diagram to help you visualize how everything fits together.
+
+
+
+You can see that the two modules have similar input requirements in terms of content (a set of input files plus some metadata) but very different expectations for how that content is packaged.
+Ignoring the versions file for now, their main output is equivalent too (a concatenated file), except CAT_CAT also emits the metamap in conjunction with the output file.
+
+The packaging differences will be fairly easy to deal with, as you'll see in a little bit.
+However, to understand the metamap part, we need to introduce you to some additional context.
+
+### 2.2. 
Understanding metamaps
+
+We just told you that the CAT_CAT module expects a metadata map as part of its input tuple.
+Let's take a few minutes for a closer look at what that is.
+
+The **metadata map**, often referred to as **metamap** for short, is a Groovy-style map containing information about units of data.
+In the context of Nextflow pipelines, units of data can be anything you want: individual samples, batches of samples, or entire datasets.
+
+By convention, an nf-core metamap is named `meta` and contains the required field `id`, which is used for naming outputs and tracking units of data.
+
+For example, a typical metadata map might look like this:
+
+```groovy title="Example of sample-level metamap"
+[id: 'sample1', single_end: false, strandedness: 'forward']
+```
+
+Or in a case where the metadata is attached at the batch level:
+
+```groovy title="Example of batch-level metamap"
+[id: 'batch1', date: '25.10.01']
+```
+
+Now let's put this in the context of the `CAT_CAT` process, which expects the input files to be packaged into a tuple with a metamap, and outputs the metamap as part of the output tuple as well.
+
+```groovy title="modules/nf-core/cat/cat/main.nf (excerpt)" linenums="1" hl_lines="2 5"
input:
tuple val(meta), path(files_in)

@@ -242,44 +450,94 @@ output:
tuple val(meta), path("${prefix}"), emit: file_out
```

-This pattern is standard across all nf-core modules. The metadata map (commonly called `meta`) is a Groovy-style map containing information about a sample or dataset, with `id` being the required field used for naming outputs and tracking samples.
+As a result, every unit of data travels through the pipeline with the relevant metadata attached.
+Subsequent processes can then readily access that metadata too.

-For example, a typical metadata map might look like:
+Remember how we told you that the file output by `CAT_CAT` will be named based on an identifier that is part of the metadata?
+This is the relevant code:

-```groovy
-[id: 'sample1', single_end: false, strandedness: 'forward']
+```groovy title="modules/nf-core/cat/cat/main.nf (excerpt)" linenums="35"
+prefix = task.ext.prefix ?: "${meta.id}${getFileSuffix(file_list[0])}"
```

-In this tutorial, we use a simple metadata map with just the batch name:
+This translates roughly as follows: if a `prefix` is provided via the external task parameter system (`task.ext`), use that to name the output file; otherwise create one using `${meta.id}`, which corresponds to the `id` field in the metamap.

-```groovy
-[id: 'test']
+You can imagine the input channel coming into this module with contents like this:
+
+```groovy title="Example input channel contents"
+ch_input = [[[id: 'batch1', date: '25.10.01'], ['file1A.txt', 'file1B.txt']],
+            [[id: 'batch2', date: '25.10.26'], ['file2A.txt', 'file2B.txt']],
+            [[id: 'batch3', date: '25.11.14'], ['file3A.txt', 'file3B.txt']]]
```

-Why use metadata maps?
-
+Then the corresponding output channel contents would come out like this:
+
+```groovy title="Example output channel contents"
+ch_output = [[[id: 'batch1', date: '25.10.01'], 'batch1.txt'],
+             [[id: 'batch2', date: '25.10.26'], 'batch2.txt'],
+             [[id: 'batch3', date: '25.11.14'], 'batch3.txt']]
+```
+
-- **Sample tracking**: Keep sample information with data throughout the workflow
-- **Standardization**: All nf-core modules follow this pattern
-- **Flexibility**: Easy to add custom metadata fields
-- **Output naming**: Consistent file naming based on sample IDs
+As mentioned earlier, the `tuple val(meta), path(files_in)` input setup is a standard pattern used across all nf-core modules.
+
+Hopefully you can start to see how useful this can be.
+Not only does it allow you to name outputs based on metadata, but you can also use it to apply different parameter values and, in combination with specific operators, to group, sort or filter data as it flows through the pipeline.

!!! note "Learn more about metadata"

    For a comprehensive introduction to working with metadata in Nextflow workflows, including how to read metadata from samplesheets and use it to customize processing, see the [Metadata in workflows](../side_quests/metadata) side quest.

-For now, we'll pass the output from `CAT_CAT` to `cowpy` with the character parameter. In the next section, we'll adapt `cowpy` to follow nf-core conventions.
+### 2.3. Summarize changes to be made
+
+Based on what we've reviewed, these are the major changes we need to make to our pipeline to use the `cat/cat` module:

-### 1.9. Wire up CAT_CAT in the workflow
+- Create a metamap containing the batch name;
+- Package the metamap into a tuple with the set of input files to concatenate (coming out of `convertToUpper`);
+- Switch the call from `collectGreetings()` to `CAT_CAT`;
+- Extract the output file from the tuple produced by the `CAT_CAT` process before passing it to `cowpy`.

-Now we need to modify our workflow code to use `CAT_CAT` instead of `collectGreetings`. Since `CAT_CAT` requires metadata tuples, we'll do this in several steps to make it clear how to work with metadata.
+That should do the trick! Now that we've got a plan, we're ready to dive in.
+
+### Takeaway

-Open [core-hello/workflows/hello.nf](core-hello/workflows/hello.nf) and make the following changes to the workflow logic in the `main` block.
+You know how to assess the input and output interface of a new module to identify its requirements, and you've learned how metamaps are used by nf-core pipelines to keep metadata closely associated with the data as it flows through a pipeline.

-#### Step 1: Create a metadata map
+### What's next?
+
+Integrate the new module into a workflow.
+
+---

-First, we need to create a metadata map for `CAT_CAT`. Remember that nf-core modules require metadata with at least an `id` field.
+## 3. Integrate CAT_CAT into the `hello.nf` workflow

-Add these lines after the `convertToUpper` call, removing the `collectGreetings` call:
+Now that you know everything about metamaps (or enough for the purposes of this course, anyway), it's time to actually implement the changes we outlined above.
+
+For the sake of clarity, we'll break this down and cover each step separately.
+
+!!! note
+
+    All the changes shown below are made to the workflow logic in the `main` block in the `core-hello/workflows/hello.nf` workflow file.
+
+### 3.1. 
Create a metadata map
+
+First, we need to create a metadata map for `CAT_CAT`, keeping in mind that nf-core modules require the metamap to contain at least an `id` field.
+
+Since we don't need any other metadata, we can keep it simple and use something like this:
+
+```groovy title="Syntax example"
+def cat_meta = [id: 'test']
+```
+
+Except we don't want to hardcode the `id` value; we want to use the value of the `params.batch` parameter.
+So the code becomes:
+
+```groovy title="Syntax example"
+def cat_meta = [id: params.batch]
+```
+
+Yes, it is literally that simple to create a basic metamap.
+
+Let's add these lines after the `convertToUpper` call, removing the `collectGreetings` call:

=== "After"

@@ -313,9 +571,9 @@ Add these lines after the `convertToUpper` call, removing the `collectGreetings`

    cowpy(collectGreetings.out.outfile, params.character)
    ```

-This creates a simple metadata map where the `id` is set to our batch name (which will be "test" when using the test profile).
+This creates a simple metadata map where the `id` is set to our batch name (which will be `test` when using the test profile).

-#### Step 2: Create a channel with metadata tuples
+### 3.2. Create a channel with metadata tuples

Next, transform the channel of files into a channel of tuples containing metadata and files:

@@ -330,6 +588,7 @@ Next, transform the channel of files into a channel of tuples containing metadat

        // create metadata map with batch name as the ID
        def cat_meta = [ id: params.batch ]

+        // create a channel with metadata and files in tuple format
        ch_for_cat = convertToUpper.out.collect().map { files -> tuple(cat_meta, files) }

@@ -353,14 +612,16 @@ Next, transform the channel of files into a channel of tuples containing metadat

    cowpy(collectGreetings.out.outfile, params.character)
    ```

-This line does two things:
+The line we've added achieves two things:

- `.collect()` gathers all files from the `convertToUpper` output into a single list
- `.map { files -> tuple(cat_meta, files) }` creates a tuple of `[metadata, files]` in the format `CAT_CAT` expects

-#### Step 3: Call CAT_CAT
+That is all we need to do to set up the input tuple for `CAT_CAT`.

-Now call `CAT_CAT` with the properly formatted channel:
+### 3.3. Call the CAT_CAT module
+
+Now call `CAT_CAT` on the newly created channel:

=== "After"

@@ -373,6 +634,8 @@ Now call `CAT_CAT` with the properly formatted channel:

        // create metadata map with batch name as the ID
        def cat_meta = [ id: params.batch ]
+
+        // create a channel with metadata and files in tuple format
        ch_for_cat = convertToUpper.out.collect().map { files -> tuple(cat_meta, files) }

        // concatenate files using the nf-core cat/cat module

@@ -393,15 +656,22 @@ Now call `CAT_CAT` with the properly formatted channel:

        // create metadata map with batch name as the ID
        def cat_meta = [ id: params.batch ]
+
+        // create a channel with metadata and files in tuple format
        ch_for_cat = convertToUpper.out.collect().map { files -> tuple(cat_meta, files) }

        // generate ASCII art of the greetings with cowpy
        cowpy(collectGreetings.out.outfile, params.character)
    ```

-#### Step 4: Update cowpy to use CAT_CAT output
+This completes the trickiest part of this substitution, but we're not quite done yet: we still need to update how we pass the concatenated output to the `cowpy` process.
+
+### 3.4. Extract the output file from the tuple for `cowpy`
+
+Previously, the `collectGreetings` process just produced a file that we could pass to `cowpy` directly. 
+However, the `CAT_CAT` process produces a tuple that includes the metamap in addition to the output file. -Finally, update the `cowpy` call to use the output from `CAT_CAT`. Since `cowpy` doesn't accept metadata tuples yet (we'll fix this in the next section), we need to extract just the file: +Since `cowpy` doesn't accept metadata tuples yet (we'll fix this in the next section), we need to extract the output file from the tuple produced by `CAT_CAT` before handing it to `cowpy`: === "After" @@ -414,13 +684,17 @@ Finally, update the `cowpy` call to use the output from `CAT_CAT`. Since `cowpy` // create metadata map with batch name as the ID def cat_meta = [ id: params.batch ] + + // create a channel with metadata and files in tuple format ch_for_cat = convertToUpper.out.collect().map { files -> tuple(cat_meta, files) } + // concatenate the greetings CAT_CAT(ch_for_cat) - // generate ASCII art of the greetings with cowpy // extract the file from the tuple since cowpy doesn't use metadata yet ch_for_cowpy = CAT_CAT.out.file_out.map{ meta, file -> file } + + // generate ASCII art of the greetings with cowpy cowpy(ch_for_cowpy, params.character) ``` @@ -435,80 +709,90 @@ Finally, update the `cowpy` call to use the output from `CAT_CAT`. Since `cowpy` // create metadata map with batch name as the ID def cat_meta = [ id: params.batch ] + + // create a channel with metadata and files in tuple format ch_for_cat = convertToUpper.out.collect().map { files -> tuple(cat_meta, files) } + // concatenate the greetings CAT_CAT(ch_for_cat) // generate ASCII art of the greetings with cowpy cowpy(collectGreetings.out.outfile, params.character) ``` -The `.map{ meta, file -> file }` operation extracts just the file from the `[metadata, file]` tuple that `CAT_CAT` outputs. + + +The `.map{ meta, file -> file }` operation extracts the file from the `[metadata, file]` tuple produced by `CAT_CAT` into a new channel, `ch_for_cowpy`. + +Then it's just a matter of passing `ch_for_cowpy` to `cowpy` instead of `collectGreetings.out.outfile`. !!! note - We're extracting just the file from `CAT_CAT`'s output tuple to pass to `cowpy`. In the next section, we'll update `cowpy` to work with metadata tuples directly, so this extraction step won't be necessary. + In the next section, we'll update `cowpy` to work with metadata tuples directly, so this extraction step will no longer be necessary. -### 1.10. Test the workflow +### 3.5. Test the workflow -Let's test that the workflow works with the `cat/cat` module: +Let's test that the workflow works with the newly integrated `cat/cat` module: ```bash nextflow run . 
--outdir core-hello-results -profile test,docker --validate_params false ``` -```console title="Output" - N E X T F L O W ~ version 25.04.3 - -Launching `./main.nf` [extravagant_volhard] DSL2 - revision: 6aa79210e6 - -Input/output options - input : /workspaces/training/hello-nf-core/nf-core-hello/assets/greetings.csv - outdir : core-hello-results - -Institutional config options - config_profile_name : Test profile - config_profile_description: Minimal test dataset to check pipeline function - -Generic options - validate_params : false - trace_report_suffix : 2025-10-17_19-51-31 - -Core Nextflow options - runName : extravagant_volhard - containerEngine : docker - launchDir : /workspaces/training/hello-nf-core/nf-core-hello - workDir : /workspaces/training/hello-nf-core/nf-core-hello/work - projectDir : /workspaces/training/hello-nf-core/nf-core-hello - userName : root - profile : test,docker - configFiles : /workspaces/training/hello-nf-core/nf-core-hello/nextflow.config - -!! Only displaying parameters that differ from the pipeline defaults !! ------------------------------------------------------- -executor > local (8) -[60/3ac109] NFCORE_HELLO:HELLO:sayHello (3) [100%] 3 of 3 ✔ -[58/073077] NFCORE_HELLO:HELLO:convertToUpper (3) [100%] 3 of 3 ✔ -[00/4f3d32] NFCORE_HELLO:HELLO:CAT_CAT (test) [100%] 1 of 1 ✔ -[98/afab8b] NFCORE_HELLO:HELLO:cowpy [100%] 1 of 1 ✔ --[nf-core/hello] Pipeline completed successfully- -``` +This should run reasonably quickly. + +??? example "Output" + + ```console + N E X T F L O W ~ version 25.04.3 + + Launching `./main.nf` [evil_pike] DSL2 - revision: b9e9b3b8de + + Input/output options + input : /workspaces/training/hello-nf-core/core-hello/assets/greetings.csv + outdir : core-hello-results + + Institutional config options + config_profile_name : Test profile + config_profile_description: Minimal test dataset to check pipeline function + + Generic options + validate_params : false + trace_report_suffix : 2025-10-30_18-50-58 + + Core Nextflow options + runName : evil_pike + containerEngine : docker + launchDir : /workspaces/training/hello-nf-core/core-hello + workDir : /workspaces/training/hello-nf-core/core-hello/work + projectDir : /workspaces/training/hello-nf-core/core-hello + userName : root + profile : test,docker + configFiles : /workspaces/training/hello-nf-core/core-hello/nextflow.config + + !! Only displaying parameters that differ from the pipeline defaults !! + ------------------------------------------------------ + executor > local (8) + [b3/f005fd] CORE_HELLO:HELLO:sayHello (3) [100%] 3 of 3 ✔ + [08/f923d0] CORE_HELLO:HELLO:convertToUpper (3) [100%] 3 of 3 ✔ + [34/3729a9] CORE_HELLO:HELLO:CAT_CAT (test) [100%] 1 of 1 ✔ + [24/df918a] CORE_HELLO:HELLO:cowpy [100%] 1 of 1 ✔ + -[core/hello] Pipeline completed successfully- + ``` Notice that `CAT_CAT` now appears in the process execution list instead of `collectGreetings`. +And that's it! We're now using a robust community-curated module instead of custom prototype-grade code for that step in the pipeline. + ### Takeaway You now know how to: - Find and install nf-core modules -- Understand metadata maps and why nf-core modules use them -- Create metadata structures to pass to nf-core modules -- Wire up nf-core modules in your workflow +- Assess the requirements of an nf-core module +- Create a simple metadata map for use with an nf-core module +- Integrate an nf-core module into your workflow ### What's next? -Adapt your local modules to follow nf-core conventions. 
- ---- - -In [Part 4](./04_adapt_module.md), we'll adapt your local `cowpy` module to follow nf-core conventions. +Learn to adapt your local modules to follow nf-core conventions. +We'll also show you how to create new nf-core modules from a template using the nf-core tooling. diff --git a/docs/hello_nf-core/04_adapt_module.md b/docs/hello_nf-core/04_adapt_module.md deleted file mode 100644 index 8b958d7345..0000000000 --- a/docs/hello_nf-core/04_adapt_module.md +++ /dev/null @@ -1,616 +0,0 @@ -# Part 4: Adapt local modules to nf-core conventions - -In this fourth part of the Hello nf-core training course, we show you how to adapt your local modules to follow nf-core conventions. - -Now that we've successfully integrated the nf-core `CAT_CAT` module in [Part 3](./03_use_module.md), let's adapt our local `cowpy` module to follow the same nf-core patterns. We'll do this incrementally, introducing one pattern at a time: - -1. First, we'll update `cowpy` to accept and propagate metadata tuples -2. Then, we'll simplify its interface using `ext.args` -3. Finally, we'll add configurable output naming with `ext.prefix` - -!!! note - - This section assumes you have completed [Part 3: Use an nf-core module](./03_use_module.md) and have integrated the `CAT_CAT` module into your pipeline. - - If you didn't complete Part 3 or want to start fresh for this section, you can use the `core-hello-part3` solution as your starting point: - - ```bash - cp -r hello-nf-core/solutions/core-hello-part3 core-hello - cd core-hello - ``` - - This gives you a pipeline with the `CAT_CAT` module already integrated. - ---- - -## 1. Adapt local modules to nf-core conventions - -### 1.1. Update cowpy to use metadata tuples - -Currently, we're extracting the file from `CAT_CAT`'s output tuple to pass to `cowpy`. It would be better to have `cowpy` accept metadata tuples directly, allowing metadata to flow through the entire workflow. - -Open [core-hello/modules/local/cowpy.nf](core-hello/modules/local/cowpy.nf) and modify it to accept metadata tuples: - -=== "After" - - ```groovy title="core-hello/modules/local/cowpy.nf" linenums="1" hl_lines="12 16" - #!/usr/bin/env nextflow - - // Generate ASCII art with cowpy (https://github.com/jeffbuttars/cowpy) - process cowpy { - - publishDir 'results', mode: 'copy' - - container 'community.wave.seqera.io/library/cowpy:1.1.5--3db457ae1977a273' - conda 'conda-forge::cowpy==1.1.5' - - input: - tuple val(meta), path(input_file) - val character - - output: - tuple val(meta), path("cowpy-${input_file}"), emit: cowpy_output - - script: - """ - cat $input_file | cowpy -c "$character" > cowpy-${input_file} - """ - } - ``` - -=== "Before" - - ```groovy title="core-hello/modules/local/cowpy.nf" linenums="1" hl_lines="12 16" - #!/usr/bin/env nextflow - - // Generate ASCII art with cowpy (https://github.com/jeffbuttars/cowpy) - process cowpy { - - publishDir 'results', mode: 'copy' - - container 'community.wave.seqera.io/library/cowpy:1.1.5--3db457ae1977a273' - conda 'conda-forge::cowpy==1.1.5' - - input: - path input_file - val character - - output: - path "cowpy-${input_file}" - - script: - """ - cat $input_file | cowpy -c "$character" > cowpy-${input_file} - """ - } - ``` - -Key changes: - -1. **Input**: Changed from `path input_file` to `tuple val(meta), path(input_file)` to accept metadata -2. **Output**: Changed to emit a tuple with metadata: `tuple val(meta), path("cowpy-${input_file}"), emit: cowpy_output` -3. 
**Named emit**: Added `emit: cowpy_output` to give the output channel a descriptive name - -Now update the workflow to pass the tuple directly instead of extracting the file. Open [core-hello/workflows/hello.nf](core-hello/workflows/hello.nf): - -=== "After" - - ```groovy title="core-hello/workflows/hello.nf" linenums="39" hl_lines="2" - // generate ASCII art of the greetings with cowpy - cowpy(CAT_CAT.out.file_out, params.character) - ``` - -=== "Before" - - ```groovy title="core-hello/workflows/hello.nf" linenums="39" hl_lines="2-4" - // generate ASCII art of the greetings with cowpy - // Extract the file from the tuple since cowpy doesn't use metadata yet - ch_for_cowpy = CAT_CAT.out.file_out.map{ meta, file -> file } - cowpy(ch_for_cowpy, params.character) - ``` - -Also update the emit block to use the named emit: - -=== "After" - - ```groovy title="core-hello/workflows/hello.nf" linenums="58" hl_lines="2" - emit: - cowpy_hellos = cowpy.out.cowpy_output - versions = ch_versions // channel: [ path(versions.yml) ] - ``` - -=== "Before" - - ```groovy title="core-hello/workflows/hello.nf" linenums="58" hl_lines="2" - emit: - cowpy_hellos = cowpy.out - versions = ch_versions // channel: [ path(versions.yml) ] - ``` - -Test the workflow to ensure metadata flows through correctly: - -```bash -nextflow run . --outdir core-hello-results -profile test,docker --validate_params false -``` - -The pipeline should run successfully with metadata now flowing from `CAT_CAT` through `cowpy`. - -### 1.2. Simplify the interface with ext.args - -Now let's address another nf-core pattern: simplifying module interfaces by using `ext.args` for optional command-line arguments. - -Currently, our `cowpy` module requires the `character` parameter to be passed as a separate input. While this works, nf-core modules follow a convention of keeping interfaces minimal - only essential inputs (metadata and files) should be declared. Optional tool arguments are instead passed via configuration. - -#### Understanding ext.args - -The `task.ext.args` pattern is an nf-core convention for passing optional command-line arguments to tools. Instead of adding multiple input parameters for every possible tool option, nf-core modules accept optional arguments through the `ext.args` configuration directive. - -Benefits of this approach: - -- **Minimal interface**: The module only requires essential inputs (metadata and files) -- **Flexibility**: Users can specify any tool arguments via configuration -- **Consistency**: All nf-core modules follow this pattern -- **Portability**: Modules can be reused in other pipelines without expecting specific parameter names -- **No workflow changes**: Adding new tool options doesn't require updating workflow code - -#### Update the module - -Let's update the cowpy module to use `ext.args` instead of the `character` input parameter. We'll also remove the local `publishDir` directive to rely on the centralized configuration in `modules.config`. - -!!! note "Why remove the local publishDir?" - - nf-core modules should not contain hardcoded `publishDir` directives. Instead, publishing is configured centrally in `conf/modules.config`. 
This provides several benefits: - - - **Single source of truth**: All output paths are configured in one place - - **Flexibility**: Users can easily customize where outputs are published - - **Consistency**: All modules follow the same publishing pattern - - **No conflicts**: Avoids having two separate publishing locations (local and centralized) - -Open [core-hello/modules/local/cowpy.nf](core-hello/modules/local/cowpy.nf): - -=== "After" - - ```groovy title="core-hello/modules/local/cowpy.nf" linenums="1" hl_lines="16 18" - #!/usr/bin/env nextflow - - // Generate ASCII art with cowpy (https://github.com/jeffbuttars/cowpy) - process cowpy { - - container 'community.wave.seqera.io/library/cowpy:1.1.5--3db457ae1977a273' - conda 'conda-forge::cowpy==1.1.5' - - input: - tuple val(meta), path(input_file) - - output: - tuple val(meta), path("cowpy-${input_file}"), emit: cowpy_output - - script: - def args = task.ext.args ?: '' - """ - cat $input_file | cowpy $args > cowpy-${input_file} - """ - } - ``` - -=== "Before" - - ```groovy title="core-hello/modules/local/cowpy.nf" linenums="1" hl_lines="6 13" - #!/usr/bin/env nextflow - - // Generate ASCII art with cowpy (https://github.com/jeffbuttars/cowpy) - process cowpy { - - publishDir 'results', mode: 'copy' - - container 'community.wave.seqera.io/library/cowpy:1.1.5--3db457ae1977a273' - conda 'conda-forge::cowpy==1.1.5' - - input: - tuple val(meta), path(input_file) - val character - - output: - tuple val(meta), path("cowpy-${input_file}"), emit: cowpy_output - - script: - """ - cat $input_file | cowpy -c "$character" > cowpy-${input_file} - """ - } - ``` - -Key changes: - -1. **Removed character input**: The module no longer requires `character` as a separate input -2. **Removed local publishDir**: Deleted the `publishDir 'results', mode: 'copy'` directive to rely on centralized configuration -3. **Added ext.args**: The line `def args = task.ext.args ?: ''` uses the Elvis operator (`?:`) to provide an empty string as default if `task.ext.args` is not set -4. **Updated command**: Changed from hardcoded `-c "$character"` to using the configurable `$args` - -The module interface is now simpler - it only accepts the essential metadata and file inputs. By removing the local `publishDir`, we follow the nf-core convention of centralizing all publishing configuration in `modules.config`. - -#### Configure ext.args - -Now we need to configure the `ext.args` to pass the character option. This allows us to keep the module interface simple while still providing the character option at the pipeline level. - -Open [core-hello/conf/modules.config](core-hello/conf/modules.config) and add the cowpy configuration: - -=== "After" - - ```groovy title="core-hello/conf/modules.config" linenums="13" hl_lines="6-8" - process { - publishDir = [ - path: { "${params.outdir}/${task.process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()}" }, - ] - - withName: 'cowpy' { - ext.args = { "-c ${params.character}" } - } - } - ``` - -=== "Before" - - ```groovy title="core-hello/conf/modules.config" linenums="13" - process { - publishDir = [ - path: { "${params.outdir}/${task.process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()}" }, - ] - } - ``` - -This configuration passes the `params.character` value to cowpy's `-c` flag. Note that we use a closure (`{ "-c ${params.character}" }`) to allow the parameter to be evaluated at runtime. 
- -Key points: - -- The **module interface stays simple** - it only accepts the essential metadata and file inputs -- The **pipeline still exposes `params.character`** - users can configure it as before -- The **module is now portable** - it can be reused in other pipelines without expecting a specific parameter name -- Configuration is **centralized** in `modules.config`, keeping workflow logic clean - -!!! note - - The `modules.config` file is where nf-core pipelines centralize per-module configuration. This separation of concerns makes modules more reusable across different pipelines. - -#### Update the workflow - -Since the cowpy module no longer requires the `character` parameter as an input, we need to update the workflow call. - -Open [core-hello/workflows/hello.nf](core-hello/workflows/hello.nf) and update the cowpy call: - -=== "After" - - ```groovy title="core-hello/workflows/hello.nf" linenums="39" hl_lines="2" - // generate ASCII art of the greetings with cowpy - cowpy(CAT_CAT.out.file_out) - ``` - -=== "Before" - - ```groovy title="core-hello/workflows/hello.nf" linenums="39" hl_lines="2" - // generate ASCII art of the greetings with cowpy - cowpy(CAT_CAT.out.file_out, params.character) - ``` - -The workflow code is now cleaner - we don't need to pass `params.character` directly to the process. The module interface is kept minimal, making it more portable, while the pipeline still provides the explicit option through configuration. - -#### Test - -Test that the workflow still works with the ext.args configuration. Let's specify a different character to verify the configuration is working: - -```bash -nextflow run . --outdir core-hello-results -profile test,docker --validate_params false --character cow -``` - -The pipeline should run successfully. In the output, look for the cowpy process execution line which will show something like: - -```console title="Output (excerpt)" -[f3/abc123] process > CORE_HELLO:HELLO:cowpy [100%] 1 of 1 ✔ -``` - -Now let's verify that the `ext.args` configuration actually passed the character argument to the cowpy command. Use the task hash (the `f3/abc123` part) to inspect the `.command.sh` file in the work directory: - -```bash -cat work/f3/abc123*/command.sh -``` - -You should see the cowpy command with the `-c cow` argument: - -```console title="Output" -#!/usr/bin/env bash -... -cat test.txt | cowpy -c cow > cowpy-test.txt -``` - -This confirms that `task.ext.args` successfully passed the character parameter through the configuration rather than requiring it as a process input. - -### 1.3. Add configurable output naming with ext.prefix - -There's one more nf-core pattern we can apply: using `ext.prefix` for configurable output file naming. - -#### Understanding ext.prefix - -The `task.ext.prefix` pattern is another nf-core convention for standardizing output file naming across modules while keeping it configurable. - -Benefits: - -- **Standardized naming**: Output files are typically named using sample IDs from metadata -- **Configurable**: Users can override the default naming if needed -- **Consistent**: All nf-core modules follow this pattern -- **Predictable**: Easy to know what output files will be called - -#### Update the module - -Let's update the cowpy module to use `ext.prefix` for output file naming. 
- -Open [core-hello/modules/local/cowpy.nf](core-hello/modules/local/cowpy.nf): - -=== "After" - - ```groovy title="core-hello/modules/local/cowpy.nf" linenums="1" hl_lines="13 17 19" - #!/usr/bin/env nextflow - - // Generate ASCII art with cowpy (https://github.com/jeffbuttars/cowpy) - process cowpy { - - container 'community.wave.seqera.io/library/cowpy:1.1.5--3db457ae1977a273' - conda 'conda-forge::cowpy==1.1.5' - - input: - tuple val(meta), path(input_file) - - output: - tuple val(meta), path("${prefix}.txt"), emit: cowpy_output - - script: - def args = task.ext.args ?: '' - prefix = task.ext.prefix ?: "${meta.id}" - """ - cat $input_file | cowpy $args > ${prefix}.txt - """ - } - ``` - -=== "Before" - - ```groovy title="core-hello/modules/local/cowpy.nf" linenums="1" hl_lines="13 18" - #!/usr/bin/env nextflow - - // Generate ASCII art with cowpy (https://github.com/jeffbuttars/cowpy) - process cowpy { - - container 'community.wave.seqera.io/library/cowpy:1.1.5--3db457ae1977a273' - conda 'conda-forge::cowpy==1.1.5' - - input: - tuple val(meta), path(input_file) - - output: - tuple val(meta), path("cowpy-${input_file}"), emit: cowpy_output - - script: - def args = task.ext.args ?: '' - """ - cat $input_file | cowpy $args > cowpy-${input_file} - """ - } - ``` - -Key changes: - -1. **Added ext.prefix**: `prefix = task.ext.prefix ?: "${meta.id}"` provides a configurable prefix with a sensible default (the sample ID) -2. **Updated output**: Changed from hardcoded `cowpy-${input_file}` to `${prefix}.txt` -3. **Updated command**: Uses the configured prefix for the output filename - -Note that the local `publishDir` has already been removed in the previous step, so we're continuing with the centralized configuration approach. - -#### Configure ext.prefix - -To maintain the same output file naming as before (`cowpy-.txt`), we can configure `ext.prefix` in modules.config. - -Update [core-hello/conf/modules.config](core-hello/conf/modules.config): - -=== "After" - - ```groovy title="core-hello/conf/modules.config" linenums="21" hl_lines="3" - withName: 'cowpy' { - ext.args = { "-c ${params.character}" } - ext.prefix = { "cowpy-${meta.id}" } - } - ``` - -=== "Before" - - ```groovy title="core-hello/conf/modules.config" linenums="21" - withName: 'cowpy' { - ext.args = { "-c ${params.character}" } - } - ``` - -Note that we use a closure (`{ "cowpy-${meta.id}" }`) which has access to `meta` because it's evaluated in the context of the process execution. - -!!! note - - The `ext.prefix` closure has access to `meta` because the configuration is evaluated in the context of the process execution, where metadata is available. - -#### Test and verify - -Test the workflow once more: - -```bash -nextflow run . --outdir core-hello-results -profile test,docker --validate_params false -``` - -Check the outputs: - -```bash -ls results/ -``` - -You should see the cowpy output files with the same naming as before: `cowpy-test.txt` (based on the batch name). This demonstrates how `ext.prefix` allows you to maintain your preferred naming convention while keeping the module interface flexible. - -If you wanted to change the naming (for example, to just `test.txt`), you would only need to modify the `ext.prefix` configuration - no changes to the module or workflow code would be required. 
- -### Takeaway - -You now know how to adapt local modules to follow nf-core conventions: - -- Update modules to accept and propagate metadata tuples -- Use `ext.args` to keep module interfaces minimal and portable -- Use `ext.prefix` for configurable, standardized output file naming -- Configure process-specific parameters through `modules.config` - -### What's next? - -Clean up by optionally removing the now-unused local module. - ---- - -### 1.4. Optional: Clean up unused local modules - -Now that we're using the nf-core `cat/cat` module, the local `collectGreetings` module is no longer needed. - -Remove or comment out the import line for `collectGreetings` in [core-hello/workflows/hello.nf](core-hello/workflows/hello.nf): - -```groovy title="core-hello/workflows/hello.nf" linenums="10" -include { sayHello } from '../modules/local/sayHello.nf' -include { convertToUpper } from '../modules/local/convertToUpper.nf' -// include { collectGreetings } from '../modules/local/collectGreetings.nf' // No longer needed -include { cowpy } from '../modules/local/cowpy.nf' -include { CAT_CAT } from '../modules/nf-core/cat/cat/main' -``` - -You can optionally delete the `collectGreetings.nf` file: - -```bash -rm modules/local/collectGreetings.nf -``` - -However, you might want to keep it as a reference for understanding the differences between local and nf-core modules. - ---- - -## 2. Creating modules with nf-core tooling - -In this tutorial, we manually adapted the `cowpy` module step-by-step to teach the nf-core conventions. However, **in practice, you'd use the nf-core tooling to generate properly structured modules from the start**. - -### 2.1. Using nf-core modules create - -The `nf-core modules create` command generates a module template that already follows all the conventions you've learned: - -```bash -# In the nf-core/modules repository -nf-core modules create tool/subtool -``` - -For example, to create the `cowpy` module: - -```bash -nf-core modules create cowpy -``` - -The command will interactively ask for: - -- Tool name and optional subtool/subcommand -- Author information -- Resource requirements (CPU/memory estimates) - -#### What gets generated - -The tool creates a complete module structure: - -```console -modules/nf-core/cowpy/ -├── main.nf # Process definition with TODO comments -├── meta.yml # Module documentation -├── environment.yml # Conda environment -└── tests/ - ├── main.nf.test # nf-test test cases - └── tags.yml # Test tags -``` - -The generated `main.nf` includes all the patterns automatically: - -```groovy -process COWPY { - tag "$meta.id" - label 'process_single' - - conda "${moduleDir}/environment.yml" - container "..." - - input: - tuple val(meta), path(input_file) // Metadata tuples ✓ - - output: - tuple val(meta), path("${prefix}.*"), emit: output // Metadata propagation ✓ - path "versions.yml" , emit: versions - - script: - def args = task.ext.args ?: '' // ext.args pattern ✓ - def prefix = task.ext.prefix ?: "${meta.id}" // ext.prefix pattern ✓ - """ - # TODO: Add your command here - cowpy $args < $input_file > ${prefix}.txt - - cat <<-END_VERSIONS > versions.yml - "${task.process}": - cowpy: \$(cowpy --version | sed 's/cowpy //') - END_VERSIONS - """ -} -``` - -You fill in the command logic and the module is ready to test! - -### 2.2. Contributing modules back to nf-core - -The [nf-core/modules](https://github.com/nf-core/modules) repository welcomes contributions of well-tested, standardized modules. - -#### Why contribute? 
- -Contributing your modules to nf-core: - -- Makes your tools available to the entire nf-core community through the modules catalog at [nf-co.re/modules](https://nf-co.re/modules) -- Ensures ongoing community maintenance and improvements -- Provides quality assurance through code review and automated testing -- Gives your work visibility and recognition - -#### Contributing workflow - -To contribute a module to nf-core: - -1. Check if it already exists at [nf-co.re/modules](https://nf-co.re/modules) -2. Fork the [nf-core/modules](https://github.com/nf-core/modules) repository -3. Use `nf-core modules create` to generate the template -4. Fill in the module logic and tests -5. Test with `nf-core modules test tool/subtool` -6. Lint with `nf-core modules lint tool/subtool` -7. Submit a pull request - -For detailed instructions, see the [nf-core components tutorial](https://nf-co.re/docs/tutorials/nf-core_components/components). - -#### Resources - -- **Components tutorial**: [Complete guide to creating and contributing modules](https://nf-co.re/docs/tutorials/nf-core_components/components) -- **Module specifications**: [Technical requirements and guidelines](https://nf-co.re/docs/guidelines/components/modules) -- **Community support**: [nf-core Slack](https://nf-co.re/join) - Join the `#modules` channel - ---- - -## Takeaway - -You now understand the key patterns that make nf-core modules portable and maintainable: - -- **Metadata tuples** track sample information through the workflow -- **`ext.args`** simplifies module interfaces by handling optional arguments via configuration -- **`ext.prefix`** standardizes output file naming -- **Centralized configuration** in `modules.config` keeps modules reusable - -You learned these patterns by manually adapting a local module, which gives you the foundation to understand and debug modules. In practice, you'll use `nf-core modules create` to generate properly structured modules from the start. - -Finally, you learned how to contribute modules to the nf-core community, making tools available to researchers worldwide while benefiting from ongoing community maintenance. - -## What's next? - -Continue to [Part 5: Input validation](./05_input_validation.md) to learn how to add schema-based input validation to your pipeline, or explore other nf-core modules you might add to enhance your pipeline further. diff --git a/docs/hello_nf-core/04_make_module.md b/docs/hello_nf-core/04_make_module.md new file mode 100644 index 0000000000..ec18a517ab --- /dev/null +++ b/docs/hello_nf-core/04_make_module.md @@ -0,0 +1,1123 @@ +# Part 4: Make an nf-core module + +In this fourth part of the Hello nf-core training course, we show you how to create an nf-core module by applying the key conventions that make modules portable and maintainable. + +The nf-core project provides a command (`nf-core modules create`) that generates properly structured module templates automatically, similar to what we used for the workflow in Part 2. +However, for teaching purposes, we're going to start by doing it manually: transforming the local `cowpy` module in your `core-hello` pipeline into an nf-core-style module step-by-step. +After that, we'll show you how to use the template-based module creation to work more efficiently in the future. + +!!! note + + This section assumes you have completed [Part 3: Use an nf-core module](./03_use_module.md) and have integrated the `CAT_CAT` module into your pipeline. 
+
+    If you didn't complete Part 3 or want to start fresh for this section, you can use the `core-hello-part3` solution as your starting point:
+
+    ```bash
+    cp -r hello-nf-core/solutions/core-hello-part3 core-hello
+    cd core-hello
+    ```
+
+    This gives you a pipeline with the `CAT_CAT` module already integrated.
+
+---
+
+## 1. Transform `cowpy` into an nf-core module
+
+In this section, we'll apply nf-core conventions to the local `cowpy` module in your `core-hello` pipeline, transforming it into a module that follows community standards.
+
+We'll apply the following nf-core conventions incrementally:
+
+1. **Update `cowpy` to use metadata tuples** to propagate sample metadata through the workflow.
+2. **Centralize tool argument configuration with `ext.args`** to increase module versatility while keeping the interface minimal.
+3. **Standardize output naming with `ext.prefix`** to promote consistency.
+4. **Centralize the publishing configuration** to keep output locations consistent and easy to change.
+
+After each step, we'll run the pipeline to test that everything works as expected.
+
+!!! tip "Working directory"
+
+    Make sure you're in the `core-hello` directory (your pipeline root) for all the commands and file edits in this section.
+
+    ```bash
+    cd core-hello
+    ```
+
+### 1.1. Update `cowpy` to use metadata tuples
+
+In the current version of the `core-hello` pipeline, we're extracting the file from `CAT_CAT`'s output tuple to pass to `cowpy`.
+
+It would be better to have `cowpy` accept metadata tuples directly, allowing metadata to flow through the workflow.
+To that end, we'll need to make the following changes:
+
+1. Update the input and output definitions
+2. Update the process call in the workflow
+3. Update the emit block in the workflow
+
+Once we've done all that, we'll run the pipeline to test that everything still works as before.
+
+#### 1.1.1. Update the input and output definitions
+
+Let's get started!
+Open the `cowpy.nf` module file (under `core-hello/modules/local/`) and modify it to accept metadata tuples as shown below.
+
+=== "After"
+
+    ```groovy title="core-hello/modules/local/cowpy.nf" linenums="1" hl_lines="12 16"
+    #!/usr/bin/env nextflow
+
+    // Generate ASCII art with cowpy (https://github.com/jeffbuttars/cowpy)
+    process cowpy {
+
+        publishDir 'results', mode: 'copy'
+
+        container 'community.wave.seqera.io/library/cowpy:1.1.5--3db457ae1977a273'
+        conda 'conda-forge::cowpy==1.1.5'
+
+        input:
+        tuple val(meta), path(input_file)
+        val character
+
+        output:
+        tuple val(meta), path("cowpy-${input_file}"), emit: cowpy_output
+
+        script:
+        """
+        cat $input_file | cowpy -c "$character" > cowpy-${input_file}
+        """
+    }
+    ```
+
+=== "Before"
+
+    ```groovy title="core-hello/modules/local/cowpy.nf" linenums="1" hl_lines="12 16"
+    #!/usr/bin/env nextflow
+
+    // Generate ASCII art with cowpy (https://github.com/jeffbuttars/cowpy)
+    process cowpy {
+
+        publishDir 'results', mode: 'copy'
+
+        container 'community.wave.seqera.io/library/cowpy:1.1.5--3db457ae1977a273'
+        conda 'conda-forge::cowpy==1.1.5'
+
+        input:
+        path input_file
+        val character
+
+        output:
+        path "cowpy-${input_file}"
+
+        script:
+        """
+        cat $input_file | cowpy -c "$character" > cowpy-${input_file}
+        """
+    }
+    ```
+
+As you can see, we changed both the **main input** and the **output** to a tuple that follows the `tuple val(meta), path(input_file)` pattern introduced in Part 3 of this training.
+For the output, we also took this opportunity to add `emit: cowpy_output` in order to give the output channel a descriptive name.
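+
+To make the new shape concrete, here is a small standalone sketch (the values are hypothetical and this snippet is not part of the pipeline code) of what an element of such a channel looks like:
+
+```groovy title="Syntax example"
+// each channel element pairs a metamap with the file it describes
+ch_example = Channel.of(
+    [ [id: 'test'], file('test.txt') ]
+)
+ch_example.view()
+// prints something like: [[id:test], /path/to/test.txt]
+```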
+ +Now that we've changed what the process expects, we need to update what we provide to it in the process call. + +#### 1.1.2. Update the process call in the workflow + +The good news is that this change will simplify the process call. +Now that the output of `CAT_CAT` and the input of `cowpy` are the same 'shape', i.e. they both consist of a `tuple val(meta), path(input_file)` structure, we can simply connect them directly instead of having to extract the file explicitly from the output of the `CAT_CAT` process. + +Open the `hello.nf` workflow file (under `core-hello/workflows/`) and update the call to `cowpy` as shown below. + +=== "After" + + ```groovy title="core-hello/workflows/hello.nf" linenums="43" hl_lines="2" + // generate ASCII art of the greetings with cowpy + cowpy(CAT_CAT.out.file_out, params.character) + ``` + +=== "Before" + + ```groovy title="core-hello/workflows/hello.nf" linenums="43" hl_lines="5" + // extract the file from the tuple since cowpy doesn't use metadata yet + ch_for_cowpy = CAT_CAT.out.file_out.map{ meta, file -> file } + + // generate ASCII art of the greetings with cowpy + cowpy(ch_for_cowpy, params.character) + ``` + +We now call `cowpy` on `CAT_CAT.out.file_out` directly. + +As a result, we no longer need to construct the `ch_for_cowpy` channel, so that line (and its comment line) can be deleted entirely. + +#### 1.1.3. Update the emit block in the workflow + +Since `cowpy` now emits a named output, `cowpy_output`, we can update the `hello.nf` workflow's `emit:` block to use that. + +=== "After" + + ```groovy title="core-hello/workflows/hello.nf" linenums="60" hl_lines="2" + emit: + cowpy_hellos = cowpy.out.cowpy_output + versions = ch_versions // channel: [ path(versions.yml) ] + ``` + +=== "Before" + + ```groovy title="core-hello/workflows/hello.nf" linenums="60" hl_lines="2" + emit: + cowpy_hellos = cowpy.out + versions = ch_versions // channel: [ path(versions.yml) ] + ``` + +This is technically not required, but it's good practice to refer to named outputs whenever possible. + +#### 1.1.4. Run the pipeline to test it + +Let's run the workflow to test that everything is working correctly after these changes. + +```bash +nextflow run . --outdir core-hello-results -profile test,docker --validate_params false +``` + +The pipeline should run successfully, with metadata now flowing from `CAT_CAT` through `cowpy`: + +```console title="Output (excerpt)" +executor > local (8) +[b2/4cf633] CORE_HELLO:HELLO:sayHello (2) [100%] 3 of 3 ✔ +[ed/ef4d69] CORE_HELLO:HELLO:convertToUpper (3) [100%] 3 of 3 ✔ +[2d/32c93e] CORE_HELLO:HELLO:CAT_CAT (test) [100%] 1 of 1 ✔ +[da/6f3246] CORE_HELLO:HELLO:cowpy [100%] 1 of 1 ✔ +-[core/hello] Pipeline completed successfully- +``` + +That completes what we needed to do to make `cowpy` handle metadata tuples. +Now, let's look at what else we can do to take advantage of nf-core module patterns. + +### 1.2. Centralize tool argument configuration with `ext.args` + +In its current state, the `cowpy` process expects to receive a value for the `character` parameter. +As a result, we have to provide a value every time we call the process, even if we'd be happy with the defaults set by the tool. +For `cowpy` this is admittedly not a big problem, but for tools with many optional parameters, it can get quite cumbersome. + +The nf-core project recommends using a Nextflow feature called `ext.args` to manage tool arguments more conveniently via configuration files. 
+ +Instead of declaring process inputs for every tool option, you write the module to reference `ext.args` in the construction of its command line. +Then it's just a matter of setting up the `ext.args` variable to hold the arguments and values you want to use in the `modules.config` file, which consolidates configuration details for all modules. +Nextflow will add those arguments with their values into the tool command line at runtime. + +Let's apply this approach to the `cowpy` module. +We're going to need to make the following changes: + +1. Update the `cowpy` module +2. Configure `ext.args` in the `modules.config` file +3. Update the `hello.nf` workflow + +Once we've done all that, we'll run the pipeline to test that everything still works as before. + +#### 1.2.1. Update the `cowpy` module + +Let's do it. +Open the `cowpy.nf` module file (under `core-hello/modules/local/`) and modify it to reference `ext.args` as shown below. + +=== "After" + + ```groovy title="modules/local/cowpy.nf" linenums="1" hl_lines="16 18" + #!/usr/bin/env nextflow + + // Generate ASCII art with cowpy (https://github.com/jeffbuttars/cowpy) + process cowpy { + + publishDir 'results', mode: 'copy' + + container 'community.wave.seqera.io/library/cowpy:1.1.5--3db457ae1977a273' + conda 'conda-forge::cowpy==1.1.5' + + input: + tuple val(meta), path(input_file) + + output: + tuple val(meta), path("cowpy-${input_file}"), emit: cowpy_output + + script: + def args = task.ext.args ?: '' + """ + cat $input_file | cowpy $args > cowpy-${input_file} + """ + } + ``` + +=== "Before" + + ```groovy title="core-hello/modules/local/cowpy.nf" linenums="1" hl_lines="6 13" + #!/usr/bin/env nextflow + + // Generate ASCII art with cowpy (https://github.com/jeffbuttars/cowpy) + process cowpy { + + publishDir 'results', mode: 'copy' + + container 'community.wave.seqera.io/library/cowpy:1.1.5--3db457ae1977a273' + conda 'conda-forge::cowpy==1.1.5' + + input: + tuple val(meta), path(input_file) + val character + + output: + tuple val(meta), path("cowpy-${input_file}"), emit: cowpy_output + + script: + """ + cat $input_file | cowpy -c "$character" > cowpy-${input_file} + """ + } + ``` + +You can see we made three changes. + +1. **In the `input:` block, we removed the `val character` input.** + Going forward, we'll supply that argument via the `ext.args` configuration as described further below. + +2. **In the `script:` block, we added the line `def args = task.ext.args ?: ''`.** + That line uses the `?:` operator to determine the value of the `args` variable: the content of `task.ext.args` if it is not empty, or an empty string if it is. + Note that while we generally refer to `ext.args`, this code must reference `task.ext.args` to pull out the module-level `ext.args` configuration. + +3. **In the command line, we replaced `-c "$character"` with `$args`.** + This is where Nextflow will inject any tool arguments set in `ext.args` in the `modules.config` file. + +As a result, the module interface is now simpler: it only expects the essential metadata and file inputs. + +!!! note + + The `?:` operator is often called the 'Elvis operator' because it looks like a sideways Elvis Presley face, with the `?` character symbolizing the wave in his hair. + +#### 1.2.2. Configure `ext.args` in the `modules.config` file + +Now that we've taken the `character` declaration out of the module, we've got to add it to `ext.args` in the `modules.config` configuration file. 
+ +Specifically, we're going to add this little chunk of code to the `process {}` block: + +```groovy title="Code to add" +withName: 'cowpy' { + ext.args = { "-c ${params.character}" } +} +``` + +The `withName:` syntax assigns this configuration to the `cowpy` process only, and `ext.args = { "-c ${params.character}" }` simply composes a string that will include the value of the `character` parameter. +Note the use of curly braces, which tell Nextflow to evaluate the value of the parameter at runtime. + +Makes sense? Let's add it in. + +Open `conf/modules.config` and add the configuration code inside the `process {}` block as shown below. + +=== "After" + + ```groovy title="core-hello/conf/modules.config" linenums="13" hl_lines="6-8" + process { + publishDir = [ + path: { "${params.outdir}/${task.process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()}" }, + ] + + withName: 'cowpy' { + ext.args = { "-c ${params.character}" } + } + } + ``` + +=== "Before" + + ```groovy title="core-hello/conf/modules.config" linenums="13" + process { + publishDir = [ + path: { "${params.outdir}/${task.process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()}" }, + ] + } + ``` + +Hopefully you can imagine having all the modules in a pipeline have their `ext.args` specified in this file, with the following benefits: + +- The **module interface stays simple** - It only accepts the essential metadata and file inputs +- The **pipeline still exposes `params.character`** - End-users can still configure it as before +- The **module is now portable** - It can be reused in other pipelines without expecting a specific parameter name +- The configuration is **centralized** in `modules.config`, keeping workflow logic clean + +By using the `modules.config` file as the place where all pipelines centralize per-module configuration, we make our modules more reusable across different pipelines. + +#### 1.2.3. Update the `hello.nf` workflow + +Since the cowpy module no longer requires the `character` parameter as an input, we need to update the workflow call accordingly. + +Open the `hello.nf` workflow file (under `core-hello/workflows/`) and update the call to `cowpy` as shown below. + +=== "After" + + ```groovy title="core-hello/workflows/hello.nf" linenums="39" hl_lines="2" + // generate ASCII art of the greetings with cowpy + cowpy(CAT_CAT.out.file_out) + ``` + +=== "Before" + + ```groovy title="core-hello/workflows/hello.nf" linenums="39" hl_lines="2" + // generate ASCII art of the greetings with cowpy + cowpy(CAT_CAT.out.file_out, params.character) + ``` + +The workflow code is now cleaner: we don't need to pass `params.character` directly to the process. +The module interface is kept minimal, making it more portable, while the pipeline still provides the explicit option through configuration. + +#### 1.2.4. Run the pipeline to test it + +Let's test that the workflow still works as expected, specifying a different character to verify that the `ext.args` configuration is working. + +Run this command using `kosh`, one of the more... enigmatic options: + +```bash +nextflow run . --outdir core-hello-results -profile test,docker --validate_params false --character kosh +``` + +The pipeline should run successfully. +In the output, look for the cowpy process execution line, which will show something like this: + +```console title="Output (excerpt)" +[bd/0abaf8] CORE_HELLO:HELLO:cowpy [100%] 1 of 1 ✔ +``` + +So it ran successfully, great! +Now let's verify that the `ext.args` configuration worked by checking the output. 
+Find the output in the file browser or use the task hash (the `bd/0abaf8` part in the example above) to look at the output file:
+
+```bash
+cat work/bd/0abaf8*/cowpy-test.txt
+```
+
+??? example "Output"
+
+    ```console
+     _________
+    / HELLO   \
+    | HOLà    |
+    \ BONJOUR /
+     ---------
+        \
+         \
+          \
+      ___       _____     ___
+     /   \     /    /|   /   \
+    |     |   /    / |  |     |
+    |     |  /____/  |  |     |
+    |     |  |    |  |  |     |
+    |     |  | {} | /   |     |
+    |     |  |____|/    |     |
+    |     |    |==|     |     |
+    |      \___________/      |
+    |                         |
+    |                         |
+    ```
+
+You should see the ASCII art displayed with the `kosh` character, confirming that the `ext.args` configuration worked!
+
+!!! note "Optional: Inspect the command file"
+
+    If you want to see exactly how the configuration was applied, you can inspect the `.command.sh` file:
+
+    ```bash
+    cat work/bd/0abaf8*/.command.sh
+    ```
+
+    You'll see the `cowpy` command with the `-c kosh` argument:
+
+    ```console
+    #!/usr/bin/env bash
+    ...
+    cat test.txt | cowpy -c kosh > cowpy-test.txt
+    ```
+
+    This shows that the `.command.sh` file was generated correctly based on the `ext.args` configuration.
+
+Take a moment to think about what we achieved here.
+This approach keeps the module interface focused on essential data (files, metadata, and any mandatory per-sample parameters), while options that control the behavior of the tool are handled separately through configuration.
+
+This may seem unnecessary for a simple tool like `cowpy`, but it can make a big difference for data analysis tools that have a lot of optional arguments.
+
+To summarize the benefits of this approach:
+
+- **Clean interface**: The module focuses on essential data inputs (metadata and files)
+- **Flexibility**: Users can specify tool arguments via configuration, including sample-specific values
+- **Consistency**: All nf-core modules follow this pattern
+- **Portability**: Modules can be reused without hardcoded tool options
+- **No workflow changes**: Adding or changing tool options doesn't require updating workflow code
+
+!!! note
+
+    The `ext.args` system has powerful additional capabilities not covered here, including switching argument values dynamically based on metadata. See the [nf-core module specifications](https://nf-co.re/docs/guidelines/components/modules) for more details.
+
+### 1.3. Standardize output naming with `ext.prefix`
+
+Now that we've given the `cowpy` process access to the metamap, we can start taking advantage of another useful nf-core pattern: naming output files based on metadata.
+
+Here we're going to use a Nextflow feature called `ext.prefix` that will allow us to standardize output file naming across modules using `meta.id` (the identifier included in the metamap), while still being able to configure modules individually if desired.
+
+This will be similar to what we did with `ext.args`, with a few differences that we'll detail as we go.
+
+Let's apply this approach to the `cowpy` module.
+We're going to need to make the following changes:
+
+1. Update the `cowpy` module
+2. Configure `ext.prefix` in the `modules.config` file
+
+(No changes are needed to the workflow.)
+
+Once we've done that, we'll run the pipeline to test that everything still works as before.
+
+#### 1.3.1. Update the `cowpy` module
+
+Let's do it.
+Open the `cowpy.nf` module file (under `core-hello/modules/local/`) and modify it to reference `ext.prefix` as shown below.
+
+=== "After"
+
+    ```groovy title="core-hello/modules/local/cowpy.nf" linenums="1" hl_lines="15 19 21"
+    #!/usr/bin/env nextflow
+
+    // Generate ASCII art with cowpy (https://github.com/jeffbuttars/cowpy)
+    process cowpy {
+
+        publishDir 'results', mode: 'copy'
+
+        container 'community.wave.seqera.io/library/cowpy:1.1.5--3db457ae1977a273'
+        conda 'conda-forge::cowpy==1.1.5'
+
+        input:
+        tuple val(meta), path(input_file)
+
+        output:
+        tuple val(meta), path("${prefix}.txt"), emit: cowpy_output
+
+        script:
+        def args = task.ext.args ?: ''
+        prefix = task.ext.prefix ?: "${meta.id}"
+        """
+        cat $input_file | cowpy $args > ${prefix}.txt
+        """
+    }
+    ```
+
+=== "Before"
+
+    ```groovy title="core-hello/modules/local/cowpy.nf" linenums="1" hl_lines="15 20"
+    #!/usr/bin/env nextflow
+
+    // Generate ASCII art with cowpy (https://github.com/jeffbuttars/cowpy)
+    process cowpy {
+
+        publishDir 'results', mode: 'copy'
+
+        container 'community.wave.seqera.io/library/cowpy:1.1.5--3db457ae1977a273'
+        conda 'conda-forge::cowpy==1.1.5'
+
+        input:
+        tuple val(meta), path(input_file)
+
+        output:
+        tuple val(meta), path("cowpy-${input_file}"), emit: cowpy_output
+
+        script:
+        def args = task.ext.args ?: ''
+        """
+        cat $input_file | cowpy $args > cowpy-${input_file}
+        """
+    }
+    ```
+
+You can see we made three changes.
+
+1. **In the `script:` block, we added the line `prefix = task.ext.prefix ?: "${meta.id}"`.**
+   That line uses the `?:` operator to determine the value of the `prefix` variable: the content of `task.ext.prefix` if it is not empty, or the identifier from the metamap (`meta.id`) if it is.
+   Note that while we generally refer to `ext.prefix`, this code must reference `task.ext.prefix` to pull out the module-level `ext.prefix` configuration.
+
+2. **In the command line, we replaced `cowpy-${input_file}` with `${prefix}.txt`.**
+   This is where Nextflow will inject the value of `prefix` determined by the line above.
+
+3. **In the `output:` block, we replaced `path("cowpy-${input_file}")` with `path("${prefix}.txt")`.**
+   This simply reiterates what the file path will be according to what is written in the command line.
+
+As a result, the output file name is now constructed using a sensible default (the identifier from the metamap) combined with the appropriate file format extension.
+
+#### 1.3.2. Configure `ext.prefix` in the `modules.config` file
+
+In this case the sensible default is not sufficiently expressive for our taste; we want to use a custom naming pattern that includes the tool name, `cowpy-<id>.txt`, like we had before.
+
+We'll do that by configuring `ext.prefix` in `modules.config`, just like we did for the `character` parameter with `ext.args`, except this time the `withName: 'cowpy' {}` block already exists, and we just need to add the following line:
+
+```groovy title="Code to add"
+ext.prefix = { "cowpy-${meta.id}" }
+```
+
+This will compose the string we want.
+Note that once again we use curly braces, this time to tell Nextflow to evaluate the value of `meta.id` at runtime.
+
+Let's add it in.
+
+Open `conf/modules.config` and add the configuration code inside the `process {}` block as shown below.
+ +=== "After" + + ```groovy title="core-hello/conf/modules.config" linenums="21" hl_lines="3" + withName: 'cowpy' { + ext.args = { "-c ${params.character}" } + ext.prefix = { "cowpy-${meta.id}" } + } + ``` + +=== "Before" + + ```groovy title="core-hello/conf/modules.config" linenums="21" + withName: 'cowpy' { + ext.args = { "-c ${params.character}" } + } + ``` + +In case you're wondering, the `ext.prefix` closure has access to the correct piece of metadata because the configuration is evaluated in the context of the process execution, where metadata is available. + +#### 1.3.3. Run the pipeline to test it + +Let's test that the workflow still works as expected. + +```bash +nextflow run . --outdir core-hello-results -profile test,docker --validate_params false +``` + + + +Check the outputs: + +```bash +ls results/ +``` + +You should see the cowpy output file with the same naming as before: `cowpy-test.txt`, based on the default batch name. +Feel free to change the `ext.prefix` configuration to satisfy yourself that you can change the naming pattern without having to make any changes to the module or workflow code. + +Alternatively, you can also try running this again with a different `--batch` parameter specified on the command line to satisfy yourself that that part is still customizable on the fly. + +This demonstrates how `ext.prefix` allows you to maintain your preferred naming convention while keeping the module interface flexible. + +To summarize the benefits of this approach: + +- **Standardized naming**: Output files are typically named using sample IDs from metadata +- **Configurable**: Users can override the default naming if needed +- **Consistent**: All nf-core modules follow this pattern +- **Predictable**: Easy to know what output files will be called + +Pretty good, right? +Well, there's one more important change we need to make to improve our module to fit the nf-core guidelines. + +### 1.4. Centralize the publishing configuration + +You may have noticed that we've been publishing outputs to two different directories: + +- **`results`** — The original output directory we've been using from the beginning for our local modules, set individually using per-module `publishDir` directives; +- **`core-hello-results`** — The output directory set with `--outdir` on the command line, which has been receiving the nf-core logs and the results published by `CAT_CAT`. + +This is messy and suboptimal; it would be better to have one location for everything. +Of course, we could go into each of our local modules and update the `publishDir` directive manually to use the `core-hello-results` directory, but what about next time we decide to change the output directory? + +Having individual modules make publishing decisions is clearly not the way to go, especially in a world where the same module might be used in a lot of different pipelines, by people who have different needs or preferences. +We want to be able to control where outputs get published at the level of the workflow configuration. + +"Hey," you might say, "`CAT_CAT` is sending its outputs to the `--outdir`. Maybe we should copy its `publishDir` directive?" + +Yes, that's a great idea. + +Except it doesn't have a `publishDir` directive. (Go ahead, look at the module code.) + +That's because nf-core pipelines centralize control at the workflow level by configuring `publishDir` in `conf/modules.config` rather than in individual modules. 
+Specifically, the nf-core template declares a default `publishDir` directive (with a predefined directory structure) that applies to all modules unless an overriding directive is provided.
+
+Doesn't that sound awesome? Could it be that to take advantage of this default directive, all we need to do is remove the current `publishDir` directive from our local modules?
+
+Let's try that out on `cowpy` to see what happens, then we'll look at the code for the default configuration to understand how it works.
+
+Finally, we'll demonstrate how to override the default behavior if desired.
+
+#### 1.4.1. Remove the `publishDir` directive from `cowpy`
+
+Let's do this.
+Open the `cowpy.nf` module file (under `core-hello/modules/local/`) and remove the `publishDir` directive as shown below.
+
+=== "After"
+
+    ```groovy title="core-hello/modules/local/cowpy.nf (excerpt)" linenums="1"
+    #!/usr/bin/env nextflow
+
+    // Generate ASCII art with cowpy (https://github.com/jeffbuttars/cowpy)
+    process cowpy {
+
+        container 'community.wave.seqera.io/library/cowpy:1.1.5--3db457ae1977a273'
+        conda 'conda-forge::cowpy==1.1.5'
+    ```
+
+=== "Before"
+
+    ```groovy title="core-hello/modules/local/cowpy.nf (excerpt)" linenums="1" hl_lines="6"
+    #!/usr/bin/env nextflow
+
+    // Generate ASCII art with cowpy (https://github.com/jeffbuttars/cowpy)
+    process cowpy {
+
+        publishDir 'results', mode: 'copy'
+
+        container 'community.wave.seqera.io/library/cowpy:1.1.5--3db457ae1977a273'
+        conda 'conda-forge::cowpy==1.1.5'
+
+    ```
+
+That's it!
+
+#### 1.4.2. Run the pipeline to see what happens
+
+Let's have a look at what happens if we run the pipeline now.
+
+```bash
+nextflow run . --outdir core-hello-results -profile test,docker --validate_params false
+```
+
+Have a look at your current working directory.
+Now the `core-hello-results` directory also contains the outputs of the `cowpy` module.
+
+```bash
+tree core-hello-results/
+```
+
+You can see that Nextflow created this hierarchy of directories based on the names of the workflow and of the module.
+
+The code responsible lives in the `conf/modules.config` file.
+This is the default `publishDir` configuration that is part of the nf-core template and applies to all processes.
+
+```groovy
+process {
+    publishDir = [
+        path: { "${params.outdir}/${task.process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()}" },
+        mode: params.publish_dir_mode,
+        saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
+    ]
+}
+```
+
+This may look complicated, so let's look at each of the three components:
+
+- **`path:`** Determines the output directory based on the process name.
+  The full name of a process contained in `task.process` includes the hierarchy of workflow and module imports (such as `CORE_HELLO:HELLO:CAT_CAT`).
+  The `tokenize` operations strip away that hierarchy to get just the process name, then take the first part before any underscore (if applicable), and convert it to lowercase.
+  This is what determines that the results of `CAT_CAT` get published to `${params.outdir}/cat/`.
+- **`mode:`** Controls how files are published (copy, symlink, etc.).
+  This is configurable via the `params.publish_dir_mode` parameter.
+- **`saveAs:`** Filters which files to publish.
+  This example excludes `versions.yml` files by returning `null` for them, preventing them from being published.
+
+This provides a consistent logic for organizing outputs.
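+
+As a rough sketch, with these rules in place the published results end up organized under the `--outdir` along these lines (the file names are illustrative and the `pipeline_info` contents will vary):
+
+```console title="Example directory layout"
+core-hello-results/
+├── cat
+│   └── test.txt
+├── cowpy
+│   └── cowpy-test.txt
+└── pipeline_info
+    └── ...
+```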
+
+The output looks even better when all the modules in a pipeline adopt this convention, so feel free to go delete the `publishDir` directives from the other modules in your pipeline.
+This default will be applied even to modules that we didn't explicitly modify to follow nf-core guidelines.
+
+That being said, you may decide you want to organize your outputs differently, and the good news is that it's easy to do so.
+
+#### 1.4.3. Override the default
+
+To override the default `publishDir` directive, you can simply add your own directives to the `conf/modules.config` file.
+
+For example, you could override the default for a single process using the `withName:` selector, as in this example where we add a custom `publishDir` directive for the 'cowpy' process.
+
+```groovy title="core-hello/conf/modules.config" linenums="13" hl_lines="6-11"
+process {
+    publishDir = [
+        path: { "${params.outdir}/${task.process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()}" },
+    ]
+
+    withName: 'cowpy' {
+        ext.args = { "-c ${params.character}" }
+        publishDir = [
+            path: 'my_custom_results'
+        ]
+    }
+}
+```
+
+We're not actually going to make that change, but feel free to play with this and see what logic you can implement.
+
+The point is that this system gives you the best of both worlds: consistency by default and the flexibility to customize the configuration on demand.
+
+To summarize, you get:
+
+- **Single source of truth**: All publishing configuration lives in `modules.config`
+- **Useful default**: Processes work out-of-the-box without per-module configuration
+- **Easy customization**: Override publishing behavior in config, not in module code
+- **Portable modules**: Modules don't hardcode output locations
+
+This completes the set of nf-core module features you should absolutely learn to use, but there are others which you can read about in the [nf-core modules specifications](https://nf-co.re/docs/guidelines/components/modules).
+
+### Takeaway
+
+You now know how to adapt local modules to follow nf-core conventions:
+
+- Design your modules to accept and propagate metadata tuples;
+- Use `ext.args` to keep module interfaces minimal and portable;
+- Use `ext.prefix` for configurable, standardized output file naming;
+- Adopt the default centralized `publishDir` directive for a consistent results directory structure.
+
+### What's next?
+
+Learn how to use nf-core's built-in template-based tools to create modules the easy way.
+
+---
+
+## 2. Generate modules with nf-core tools
+
+Now that you've learned the nf-core module patterns by applying them manually, let's look at how you'd create modules in practice.
+The nf-core project provides the `nf-core modules create` command that generates properly structured module templates with all these patterns built in from the start.
+
+### 2.1. Using nf-core modules create
+
+The `nf-core modules create` command generates a module template that already follows all the conventions you've learned.
+
+For example, to create the `cowpy` module with a minimal template:
+
+```bash
+nf-core modules create --empty-template cowpy
+```
+
+The `--empty-template` flag creates a clean starter template without extra code, making it easier to see the essential structure.
+
+The command runs interactively, guiding you through the setup.
+It automatically looks up tool information from package repositories like Bioconda and bio.tools to pre-populate metadata.
+ +You'll be prompted for several configuration options: + +- **Author information**: Your GitHub username for attribution +- **Resource label**: A predefined set of computational requirements. + The nf-core project provides standard labels like `process_single` for lightweight tools and `process_high` for demanding ones. + These labels help manage resource allocation across different execution environments. +- **Metadata requirement**: Whether the module needs sample-specific information via a `meta` map (usually yes for data processing modules). + +The tool handles the complexity of finding package information and setting up the structure, allowing you to focus on implementing the tool's specific logic. + +### 2.2. What gets generated + +The tool creates a complete module structure in `modules/local/` (or `modules/nf-core/` if you're in the nf-core/modules repository): + +??? example "Directory contents" + + ```console + modules/local/cowpy + ├── environment.yml + ├── main.nf + ├── meta.yml + └── tests + └── main.nf.test + ``` + +Each file serves a specific purpose: + +- **`main.nf`**: Process definition with all the nf-core patterns built in +- **`meta.yml`**: Module documentation describing inputs, outputs, and the tool +- **`environment.yml`**: Conda environment specification for dependencies +- **`tests/main.nf.test`**: nf-test test cases to validate the module works + +!!! tip "Learn more about testing" + + The generated test file uses nf-test, a testing framework for Nextflow pipelines and modules. To learn how to write and run these tests, see the [nf-test side quest](../../side_quests/nf_test/). + +The generated `main.nf` includes all the patterns you just learned, plus some additional features: + +```groovy title="modules/local/cowpy/main.nf" hl_lines="11 21 22" +process COWPY { + tag "$meta.id" + label 'process_single' + + conda "${moduleDir}/environment.yml" + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/YOUR-TOOL-HERE': + 'biocontainers/YOUR-TOOL-HERE' }" + + input: + tuple val(meta), path(input) // Pattern 1: Metadata tuples ✓ + + output: + tuple val(meta), path("*"), emit: output + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' // Pattern 2: ext.args ✓ + def prefix = task.ext.prefix ?: "${meta.id}" // Pattern 3: ext.prefix ✓ + + """ + // Add your tool command here + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + cowpy: \$(cowpy --version) + END_VERSIONS + """ + + stub: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + + """ + echo $args + touch ${prefix}.txt + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + cowpy: \$(cowpy --version) + END_VERSIONS + """ +} +``` + +Notice how all the patterns you applied manually above are already there! +The template also includes several additional nf-core conventions. +Some of these work out of the box, while others are placeholders we'll need to fill in, as described below. + +**Features that work as-is:** + +- **`tag "$meta.id"`**: Adds sample ID to process names in logs for easier tracking +- **`label 'process_single'`**: Resource label for configuring CPU/memory requirements +- **`when:` block**: Allows conditional execution via `task.ext.when` configuration + +These features are already functional and make modules more maintainable. 
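+
+For example, the resource label gives configuration files a single hook for setting computational resources on every process that declares it. Here is a minimal sketch of what that could look like (the label name comes from the template; the resource values are purely illustrative):
+
+```groovy
+// In a configuration file such as conf/base.config (illustrative values)
+process {
+    withLabel: 'process_single' {
+        cpus   = 1
+        memory = 6.GB
+    }
+}
+```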
+ +**Placeholders we'll customize below:** + +- **`input:` and `output:` blocks**: Generic declarations we'll update to match our tool +- **`script:` block**: Contains a comment where we'll add the cowpy command +- **`stub:` block**: Template we'll update to produce the correct outputs +- **Container and environment**: Placeholders we'll fill with package information + +The next sections walk through completing these customizations. + +### 2.3. Completing the environment and container setup + +In the case of cowpy, the tool warned that it couldn't find the package in Bioconda (the primary channel for bioinformatics tools). +However, cowpy is available in conda-forge, so you would complete the `environment.yml` like this: + +```yaml title="modules/local/cowpy/environment.yml" +name: cowpy +channels: + - conda-forge +dependencies: + - cowpy=1.1.5 +``` + +For the container, you can use [Seqera Containers](https://seqera.io/containers/) to automatically build a container from any Conda package, including conda-forge packages: + +```groovy +container "community.wave.seqera.io/library/cowpy:1.1.5--3db457ae1977a273" +``` + +!!! tip "Bioconda vs conda-forge packages" + + - **Bioconda packages**: Automatically get BioContainers built, providing ready-to-use containers + - **conda-forge packages**: Can use Seqera Containers to build containers on-demand from the Conda recipe + + Most bioinformatics tools are in Bioconda, but for conda-forge tools, Seqera Containers provides an easy solution for containerization. + +### 2.4. Defining inputs and outputs + +The generated template includes generic input and output declarations that you'll need to customize for your specific tool. +Looking back at our manual `cowpy` module from section 1, we can use that as a guide. + +Update the input and output blocks: + +=== "After" + + ```groovy title="modules/local/cowpy/main.nf" linenums="8" hl_lines="2 5" + input: + tuple val(meta), path(input_file) + + output: + tuple val(meta), path("${prefix}.txt"), emit: cowpy_output + path "versions.yml" , emit: versions + ``` + +=== "Before" + + ```groovy title="modules/local/cowpy/main.nf" linenums="8" hl_lines="2 5" + input: + tuple val(meta), path(input) + + output: + tuple val(meta), path("*"), emit: output + path "versions.yml" , emit: versions + ``` + +This specifies: + +- The input file parameter name (`input_file` instead of generic `input`) +- The output filename using the configurable prefix pattern (`${prefix}.txt` instead of wildcard `*`) +- A descriptive emit name (`cowpy_output` instead of generic `output`) + +### 2.5. Writing the script block + +The template provides a comment placeholder where you add the actual tool command. 
+We can reference our manual module from earlier for the command logic:
+
+=== "After"
+
+    ```groovy title="modules/local/cowpy/main.nf" linenums="15" hl_lines="3 6"
+    script:
+    def args = task.ext.args ?: ''
+    prefix = task.ext.prefix ?: "${meta.id}"
+
+    """
+    cat $input_file | cowpy $args > ${prefix}.txt
+
+    cat <<-END_VERSIONS > versions.yml
+    "${task.process}":
+        cowpy: \$(cowpy --version)
+    END_VERSIONS
+    """
+    ```
+
+=== "Before"
+
+    ```groovy title="modules/local/cowpy/main.nf" linenums="15" hl_lines="6"
+    script:
+    def args = task.ext.args ?: ''
+    def prefix = task.ext.prefix ?: "${meta.id}"
+
+    """
+    // Add your tool command here
+
+    cat <<-END_VERSIONS > versions.yml
+    "${task.process}":
+        cowpy: \$(cowpy --version)
+    END_VERSIONS
+    """
+    ```
+
+Key changes:
+
+- Change `def prefix` to just `prefix` (without `def`) so it's accessible in the output block
+- Replace the comment with the actual cowpy command that uses both `$args` and `${prefix}.txt`
+
+### 2.6. Implementing the stub block
+
+The stub block provides a fast mock implementation for testing pipeline logic without running the actual tool.
+It must produce the same output files as the script block:
+
+=== "After"
+
+    ```groovy title="modules/local/cowpy/main.nf" linenums="27" hl_lines="3 6"
+    stub:
+    def args = task.ext.args ?: ''
+    prefix = task.ext.prefix ?: "${meta.id}"
+
+    """
+    touch ${prefix}.txt
+
+    cat <<-END_VERSIONS > versions.yml
+    "${task.process}":
+        cowpy: \$(cowpy --version)
+    END_VERSIONS
+    """
+    ```
+
+=== "Before"
+
+    ```groovy title="modules/local/cowpy/main.nf" linenums="27" hl_lines="3 6"
+    stub:
+    def args = task.ext.args ?: ''
+    def prefix = task.ext.prefix ?: "${meta.id}"
+
+    """
+    echo $args
+    touch ${prefix}.txt
+
+    cat <<-END_VERSIONS > versions.yml
+    "${task.process}":
+        cowpy: \$(cowpy --version)
+    END_VERSIONS
+    """
+    ```
+
+Key changes:
+
+- Change `def prefix` to just `prefix` to match the script block
+- Remove the `echo $args` line (which was just template placeholder code)
+- The stub creates an empty `${prefix}.txt` file matching what the script block produces
+
+This allows you to test workflow logic and file handling without waiting for the actual tool to run.
+
+Once you've completed the environment setup (section 2.3), inputs/outputs (section 2.4), script block (section 2.5), and stub block (section 2.6), the module is ready to test!
+
+### Takeaway
+
+You now know how to use the built-in nf-core tooling to create modules efficiently using templates rather than writing everything from scratch.
+
+### What's next?
+
+Learn about the benefits of contributing modules to nf-core and the main steps and requirements involved.
+
+---
+
+## 3. Contributing modules back to nf-core
+
+The [nf-core/modules](https://github.com/nf-core/modules) repository welcomes contributions of well-tested, standardized modules.
+
+### 3.1. Why contribute?
+
+Contributing your modules to nf-core:
+
+- Makes your tools available to the entire nf-core community through the modules catalog at [nf-co.re/modules](https://nf-co.re/modules)
+- Ensures ongoing community maintenance and improvements
+- Provides quality assurance through code review and automated testing
+- Gives your work visibility and recognition
+
+### 3.2. Contributor's checklist
+
+To contribute a module to nf-core, you will need to go through the following steps:
+
+1. Check if it already exists at [nf-co.re/modules](https://nf-co.re/modules)
+2. Fork the [nf-core/modules](https://github.com/nf-core/modules) repository
+3. 
Use `nf-core modules create` to generate the template
+4. Fill in the module logic and tests
+5. Test with `nf-core modules test tool/subtool`
+6. Lint with `nf-core modules lint tool/subtool`
+7. Submit a pull request
+
+For detailed instructions, see the [nf-core components tutorial](https://nf-co.re/docs/tutorials/nf-core_components/components).
+
+### 3.3. Resources
+
+- **Components tutorial**: [Complete guide to creating and contributing modules](https://nf-co.re/docs/tutorials/nf-core_components/components)
+- **Module specifications**: [Technical requirements and guidelines](https://nf-co.re/docs/guidelines/components/modules)
+- **Community support**: [nf-core Slack](https://nf-co.re/join) - Join the `#modules` channel
+
+## Takeaway
+
+You now know how to create nf-core modules! You learned the four key patterns that make modules portable and maintainable:
+
+- **Metadata tuples** propagate metadata through the workflow
+- **`ext.args`** simplifies module interfaces by handling optional arguments via configuration
+- **`ext.prefix`** standardizes output file naming
+- **Centralized publishing** via `publishDir` configured in `modules.config` rather than hardcoded in modules
+
+By transforming `cowpy` step-by-step, you developed a deep understanding of these patterns, equipping you to work with, debug, and create nf-core modules.
+In practice, you'll use `nf-core modules create` to generate properly structured modules with these patterns built in from the start.
+
+Finally, you learned how to contribute modules to the nf-core community, making tools available to researchers worldwide while benefiting from ongoing community maintenance.
+
+## What's next?
+
+When you're ready, continue to [Part 5: Input validation](./05_input_validation.md) to learn how to add schema-based input validation to your pipeline.
diff --git a/docs/hello_nf-core/05_input_validation.md b/docs/hello_nf-core/05_input_validation.md
index 248095e139..787c0a673f 100644
--- a/docs/hello_nf-core/05_input_validation.md
+++ b/docs/hello_nf-core/05_input_validation.md
@@ -32,69 +32,75 @@ Pipeline failed before execution - please fix the errors above
 
 The pipeline fails immediately with clear, actionable error messages. This saves time, compute resources, and frustration.
 
-## Two types of validation
+## The nf-schema plugin
 
-nf-core pipelines validate two different kinds of input:
-
-1. **Parameter validation**: Validates command-line parameters (flags like `--outdir`, `--batch`, `--input`)
+The [nf-schema plugin](https://nextflow-io.github.io/nf-schema/latest/) is a Nextflow plugin that provides comprehensive validation capabilities for Nextflow pipelines.
+While nf-schema works with any Nextflow workflow, it's the standard validation solution for all nf-core pipelines.
 
-   - Checks parameter types, ranges, and formats
-   - Ensures required parameters are provided
-   - Validates file paths exist
-   - Defined in `nextflow_schema.json`
+nf-schema provides several key functions:
 
-2. 
**Input data validation**: Validates the contents of input files (like sample sheets or CSV files) +- **Parameter validation**: Validates pipeline parameters against `nextflow_schema.json` +- **Sample sheet validation**: Validates input files against `assets/schema_input.json` +- **Channel conversion**: Converts validated sample sheets to Nextflow channels +- **Help text generation**: Automatically generates `--help` output from schema definitions +- **Parameter summary**: Displays which parameters differ from defaults - - Checks column structure and data types - - Validates file references within the input file - - Ensures required fields are present - - Defined in `assets/schema_input.json` +nf-schema is the successor to the deprecated nf-validation plugin and uses standard [JSON Schema Draft 2020-12](https://json-schema.org/) for validation. -Both types of validation happen **before** the pipeline executes any processes, ensuring fast failure with clear error messages. +!!! note "What are Nextflow plugins?" -!!! note + Plugins are extensions that add new functionality to the Nextflow language itself. They're installed via a `plugins{}` block in `nextflow.config` and can provide: - This section assumes you have completed [Part 4: Adapt local modules to nf-core conventions](./04_adapt_module.md) and have a working `core-hello` pipeline with adapted nf-core modules. + - New functions and classes that can be imported (like `samplesheetToList`) + - New DSL features and operators + - Integration with external services - If you didn't complete Part 4 or want to start fresh for this section, you can use the `core-hello-part4` solution as your starting point: + The nf-schema plugin is specified in `nextflow.config`: - ```bash - cp -r hello-nf-core/solutions/core-hello-part4 core-hello - cd core-hello + ```groovy + plugins { + id 'nf-schema@2.1.1' + } ``` - This gives you a fully functional nf-core pipeline with modules ready for adding input validation. + Once installed, you can import functions from plugins using `include { functionName } from 'plugin/plugin-name'` syntax. ---- +## Two schema files -## 1. The nf-schema plugin +An nf-core pipeline uses two schema files for validation: -The [nf-schema plugin](https://nextflow-io.github.io/nf-schema/latest/) is a Nextflow plugin that provides comprehensive validation capabilities for nf-core pipelines. +| Schema File | Purpose | Validates | +| -------------------------- | --------------------- | ---------------------------------------------------- | +| `nextflow_schema.json` | Parameter validation | Command-line flags: `--input`, `--outdir`, `--batch` | +| `assets/schema_input.json` | Input data validation | Contents of sample sheets and input files | -### 1.1. Core functionality +Both schemas use JSON Schema format, a widely-adopted standard for describing and validating data structures. 
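+
+If you haven't worked with JSON Schema before, the key idea is that a schema is itself a JSON document declaring which fields are expected and what constraints they must satisfy. As a minimal illustration of the format (a generic snippet, not one of the pipeline's files):
+
+```json
+{
+    "$schema": "https://json-schema.org/draft/2020-12/schema",
+    "type": "object",
+    "properties": {
+        "batch": { "type": "string", "description": "Name for this batch of greetings" }
+    },
+    "required": ["batch"]
+}
+```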
-nf-schema provides several key functions: +### Two types of validation -- **Parameter validation**: Validates pipeline parameters against `nextflow_schema.json` -- **Sample sheet validation**: Validates input files against `assets/schema_input.json` -- **Channel conversion**: Converts validated sample sheets to Nextflow channels -- **Help text generation**: Automatically generates `--help` output from schema definitions -- **Parameter summary**: Displays which parameters differ from defaults +nf-core pipelines validate two different kinds of input: -nf-schema is the successor to the deprecated nf-validation plugin and uses standard [JSON Schema Draft 2020-12](https://json-schema.org/) for validation. +**Parameter validation** validates command-line parameters (flags like `--outdir`, `--batch`, `--input`): -### 1.2. The two schema files +- Checks parameter types, ranges, and formats +- Ensures required parameters are provided +- Validates file paths exist +- Defined in `nextflow_schema.json` -An nf-core pipeline uses two schema files for validation: +**Input data validation** validates the structure of sample sheets and manifest files (CSV/TSV files that describe your data): -| Schema File | Purpose | Validates | -| -------------------------- | --------------------- | ---------------------------------------------------- | -| `nextflow_schema.json` | Parameter validation | Command-line flags: `--input`, `--outdir`, `--batch` | -| `assets/schema_input.json` | Input data validation | Contents of sample sheets and input files | +- Checks column structure and data types +- Validates that file paths referenced in the sample sheet exist +- Ensures required fields are present +- Defined in `assets/schema_input.json` -Both schemas use JSON Schema format, a widely-adopted standard for describing and validating data structures. +!!! note "What input data validation does NOT do" -### 1.3. When validation occurs + Input data validation checks the structure of *manifest files* (sample sheets, CSV files), not the contents of your actual data files (FASTQ, BAM, VCF, etc.). + + For large-scale data, validating file contents (like checking BAM integrity) should happen in pipeline processes running on worker nodes, not during the validation stage on the orchestrating machine. + +### When validation occurs ```mermaid graph LR @@ -107,54 +113,107 @@ graph LR Validation happens **before** any pipeline processes run, providing fast feedback and preventing wasted compute time. -### Takeaway +!!! note -You now understand what nf-schema does, the two types of validation it provides, and when validation occurs in the pipeline execution lifecycle. + This section assumes you have completed [Part 4: Make an nf-core module](./04_make_module.md) and have a working `core-hello` pipeline with nf-core-style modules. -### What's next? + If you didn't complete Part 4 or want to start fresh for this section, you can use the `core-hello-part4` solution as your starting point: + + ```bash + cp -r hello-nf-core/solutions/core-hello-part4 core-hello + cd core-hello + ``` -Start by implementing parameter validation for command-line flags. + This gives you a fully functional nf-core pipeline with modules ready for adding input validation. --- -## 2. Parameter validation (nextflow_schema.json) +## 1. Parameter validation (nextflow_schema.json) Let's start by adding parameter validation to our pipeline. This validates command-line flags like `--input`, `--outdir`, and `--batch`. -### 2.1. Examine the parameter schema +### 1.1. 
Configure validation to skip input file validation + +The nf-core pipeline template comes with nf-schema already installed and configured: + +- The nf-schema plugin is installed via the `plugins{}` block in `nextflow.config` +- Parameter validation is enabled by default via `params.validate_params = true` +- The validation is performed by the `UTILS_NFSCHEMA_PLUGIN` subworkflow during pipeline initialization + +The validation behavior is controlled through the `validation{}` scope in `nextflow.config`. + +Since we'll be working on parameter validation first (this section) and won't configure the input data schema until section 2, we need to temporarily tell nf-schema to skip validating the `input` parameter's file contents. + +Open `nextflow.config` and find the `validation` block (around line 246). Add `ignoreParams` to skip input file validation: + +=== "After" + + ```groovy title="nextflow.config" hl_lines="3" linenums="246" + validation { + defaultIgnoreParams = ["genomes"] + ignoreParams = ['input'] + monochromeLogs = params.monochrome_logs + } + ``` + +=== "Before" + + ```groovy title="nextflow.config" linenums="246" + validation { + defaultIgnoreParams = ["genomes"] + monochromeLogs = params.monochrome_logs + } + ``` + +This configuration tells nf-schema to: + +- **`defaultIgnoreParams`**: Skip validation of complex parameters like `genomes` (set by template developers) +- **`ignoreParams`**: Skip validation of the `input` parameter's file contents (temporary - we'll remove this in section 2) +- **`monochromeLogs`**: Disable colored output in validation messages when set to `true` (controlled by `params.monochrome_logs`) + +!!! note "Why ignore the input parameter?" + + The `input` parameter in `nextflow_schema.json` has `"schema": "assets/schema_input.json"` which tells nf-schema to validate the *contents* of the input CSV file against that schema. + Since we haven't configured that schema yet, we temporarily ignore this validation. + We'll remove this setting in section 2 after configuring the input data schema. + +### 1.2. Examine the parameter schema Let's look at a section of the `nextflow_schema.json` file that came with our pipeline template: ```bash -grep -A 20 '"input_output_options"' nextflow_schema.json +grep -A 25 '"input_output_options"' nextflow_schema.json ``` -The parameter schema is organized into groups. Here's the `input_output_options` group (simplified): - -```json title="core-hello/nextflow_schema.json (excerpt)" -"input_output_options": { - "title": "Input/output options", - "type": "object", - "description": "Define where the pipeline should find input data and save output data.", - "required": ["input", "outdir"], - "properties": { - "input": { - "type": "string", - "format": "file-path", - "exists": true, - "mimetype": "text/csv", - "pattern": "^\\S+\\.csv$", - "description": "Path to comma-separated file containing greetings.", - "help_text": "You will need to create a design file with information about the samples in your experiment before running the pipeline." +The parameter schema is organized into groups. 
Here's the `input_output_options` group: + +```json title="core-hello/nextflow_schema.json (excerpt)" linenums="8" + "input_output_options": { + "title": "Input/output options", + "type": "object", + "fa_icon": "fas fa-terminal", + "description": "Define where the pipeline should find input data and save output data.", + "required": ["input", "outdir"], + "properties": { + "input": { + "type": "string", + "format": "file-path", + "exists": true, + "schema": "assets/schema_input.json", + "mimetype": "text/csv", + "pattern": "^\\S+\\.csv$", + "description": "Path to comma-separated file containing information about the samples in the experiment.", + "help_text": "You will need to create a design file with information about the samples in your experiment before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row.", + "fa_icon": "fas fa-file-csv" + }, + "outdir": { + "type": "string", + "format": "directory-path", + "description": "The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure.", + "fa_icon": "fas fa-folder-open" + } + } }, - "outdir": { - "type": "string", - "format": "directory-path", - "description": "The output directory where the results will be saved.", - "fa_icon": "fas fa-folder-open" - } - } -} ``` Key validation features: @@ -174,62 +233,99 @@ Key validation features: Notice the `batch` parameter we've been using isn't defined yet in the schema! -### 2.2. Add the batch parameter +### 1.3. Add the batch parameter -The parameter schema can be edited manually, but nf-core provides a helpful GUI tool: +While the schema is a JSON file that can be edited manually, **manual editing is error-prone and not recommended**. +Instead, nf-core provides an interactive GUI tool that handles the JSON Schema syntax for you and validates your changes: ```bash nf-core pipelines schema build ``` -This command launches an interactive web interface where you can: +You'll see output like: -- Add new parameters -- Set validation rules -- Organize parameters into groups -- Generate help text +```console + ,--./,-. + ___ __ __ __ ___ /,-._.--\ +|\ | |__ __ / ` / \ |__) |__ } { +| \| | \__, \__/ | \ |___ \`-._,-`-, + `._,._,' -!!! warning "Schema validation errors" +nf-core/tools version 3.4.1 - https://nf-co.re - If you run `nf-core pipelines schema build` at this stage, you may see an error like: +INFO [✓] Default parameters match schema validation +INFO [✓] Pipeline schema looks valid (found 17 params) +INFO Writing schema with 17 params: 'nextflow_schema.json' +🚀 Launch web builder for customisation and editing? [y/n]: +``` - ``` - [✗] Invalid default parameters found: - input: Not in pipeline parameters. Check `nextflow.config`. - ``` +Type `y` and press Enter to launch the interactive web interface. + +Your browser will open showing the Parameter schema builder: - This happens because the template's schema includes an `input` parameter, but it's not yet defined in `nextflow.config`. You can safely ignore this for now - we're using the `--input` parameter as a command-line argument rather than setting a default in the config. +![Schema builder interface](./img/schema_build.png) -For our simple case, we'll edit the JSON directly. Open `core-hello/nextflow_schema.json` and find the `"input_output_options"` section. 
Add the `batch` parameter: +To add the `batch` parameter: -```json title="core-hello/nextflow_schema.json (excerpt)" hl_lines="13-17" -"input_output_options": { - "title": "Input/output options", - "type": "object", - "description": "Define where the pipeline should find input data and save output data.", - "required": ["input", "outdir"], - "properties": { +1. Click the **"Add parameter"** button at the top +2. Use the drag handle (⋮⋮) to move the new parameter up into the "Input/output options" group, below the `input` parameter +3. Fill in the parameter details: + - **ID**: `batch` + - **Description**: `Name for this batch of greetings` + - **Type**: `string` + - Check the **Required** checkbox + - Optionally, select an icon from the icon picker (e.g., `fas fa-layer-group`) + +![Adding the batch parameter](./img/schema_add.png) + +When you're done, click the **"Finished"** button at the top right. + +Back in your terminal, you'll see: + +```console +INFO Writing schema with 18 params: 'nextflow_schema.json' +⣾ Use ctrl+c to stop waiting and force exit. +``` + +Press `Ctrl+C` to exit the schema builder. + +The tool has now updated your `nextflow_schema.json` file with the new `batch` parameter, handling all the JSON Schema syntax correctly. + +### 1.4. Verify the changes + +```bash +grep -A 25 '"input_output_options"' nextflow_schema.json +``` + +```json title="core-hello/nextflow_schema.json (excerpt)" linenums="8" hl_lines="19-23" + "input_output_options": { + "title": "Input/output options", + "type": "object", + "fa_icon": "fas fa-terminal", + "description": "Define where the pipeline should find input data and save output data.", + "required": ["input", "outdir", "batch"], + "properties": { "input": { - "type": "string", - "format": "file-path", - "exists": true, - "description": "Path to comma-separated file containing greetings." + "type": "string", + "format": "file-path", + "exists": true, + "schema": "assets/schema_input.json", + "mimetype": "text/csv", + "pattern": "^\\S+\\.csv$", + "description": "Path to comma-separated file containing information about the samples in the experiment.", + "help_text": "You will need to create a design file with information about the samples in your experiment before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row.", + "fa_icon": "fas fa-file-csv" }, "batch": { - "type": "string", - "default": "batch-01", - "description": "Name for this batch of greetings" + "type": "string", + "description": "Name for this batch of greetings", + "fa_icon": "fas fa-layer-group" }, - "outdir": { - "type": "string", - "format": "directory-path", - "description": "The output directory where the results will be saved." - } - } -} ``` -### 2.3. Test parameter validation +You should see that the `batch` parameter has been added to the schema with the "required" field now showing `["input", "outdir", "batch"]`. + +### 1.5. Test parameter validation Now let's test that parameter validation works correctly. @@ -243,7 +339,9 @@ nextflow run . --outdir test-results -profile docker ERROR ~ Validation of pipeline parameters failed! -- Check '.nextflow.log' file for details - * --input: required property is missing +The following invalid input values have been detected: + +* Missing required parameter(s): input, batch ``` Perfect! The validation catches the missing required parameter before the pipeline runs. @@ -251,16 +349,15 @@ Perfect! 
The validation catches the missing required parameter before the pipeli Now try with a valid set of parameters: ```bash -nextflow run . --input assets/greetings.csv --outdir results --batch my-batch -profile test,docker --validationSchemaIgnoreParams input +nextflow run . --input assets/greetings.csv --outdir results --batch my-batch -profile test,docker ``` -Note: We use `--validationSchemaIgnoreParams input` to skip input data validation at this stage since we haven't configured the input schema yet (we'll do that in the next section). - The pipeline should run successfully, and the `batch` parameter is now validated. ### Takeaway -You now know how to add parameters to `nextflow_schema.json` and test parameter validation. The nf-core schema build tool makes it easy to manage complex parameter schemas interactively. +You've learned how to use the interactive `nf-core pipelines schema build` tool to add parameters to `nextflow_schema.json` and seen parameter validation in action. +The web interface handles all the JSON Schema syntax for you, making it easy to manage complex parameter schemas without error-prone manual JSON editing. ### What's next? @@ -268,11 +365,11 @@ Now that parameter validation is working, let's add validation for the input dat --- -## 3. Input data validation (schema_input.json) +## 2. Input data validation (schema_input.json) Now let's add validation for the contents of our input CSV file. While parameter validation checks command-line flags, input data validation ensures the data inside the CSV file is structured correctly. -### 3.1. Understand the greetings.csv format +### 2.1. Understand the greetings.csv format Let's remind ourselves what our input looks like: @@ -292,7 +389,7 @@ This is a simple CSV with: - One greeting per line - Text strings with no special format requirements -### 3.2. Design the schema structure +### 2.2. Design the schema structure For our use case, we want to: @@ -303,42 +400,84 @@ For our use case, we want to: We'll structure this as an array of objects, where each object has a `greeting` field. -### 3.3. Create the schema file - -Replace the contents of `assets/schema_input.json` with the following: - -```json title="assets/schema_input.json" linenums="1" -{ - "$schema": "https://json-schema.org/draft/2020-12/schema", - "$id": "https://raw.githubusercontent.com/core/hello/main/assets/schema_input.json", - "title": "core/hello pipeline - params.input schema", - "description": "Schema for the greetings file provided with params.input", - "type": "array", - "items": { - "type": "object", - "properties": { - "greeting": { - "type": "string", - "pattern": "^\\S.*$", - "errorMessage": "Greeting must be provided and cannot be empty or start with whitespace" - } - }, - "required": ["greeting"] - } -} -``` +### 2.3. Update the schema file + +The nf-core pipeline template includes a default `assets/schema_input.json` designed for paired-end sequencing data. +We need to replace it with a simpler schema for our greetings use case. 
+ +Open `assets/schema_input.json` and replace the `properties` and `required` sections: + +=== "After" + + ```json title="assets/schema_input.json" linenums="1" hl_lines="10-14 16" + { + "$schema": "https://json-schema.org/draft/2020-12/schema", + "$id": "https://raw.githubusercontent.com/core/hello/main/assets/schema_input.json", + "title": "core/hello pipeline - params.input schema", + "description": "Schema for the greetings file provided with params.input", + "type": "array", + "items": { + "type": "object", + "properties": { + "greeting": { + "type": "string", + "pattern": "^\\S.*$", + "errorMessage": "Greeting must be provided and cannot be empty or start with whitespace" + } + }, + "required": ["greeting"] + } + } + ``` + +=== "Before" + + ```json title="assets/schema_input.json" linenums="1" hl_lines="10-29 31" + { + "$schema": "https://json-schema.org/draft/2020-12/schema", + "$id": "https://raw.githubusercontent.com/core/hello/main/assets/schema_input.json", + "title": "core/hello pipeline - params.input schema", + "description": "Schema for the file provided with params.input", + "type": "array", + "items": { + "type": "object", + "properties": { + "sample": { + "type": "string", + "pattern": "^\\S+$", + "errorMessage": "Sample name must be provided and cannot contain spaces", + "meta": ["id"] + }, + "fastq_1": { + "type": "string", + "format": "file-path", + "exists": true, + "pattern": "^([\\S\\s]*\\/)?[^\\s\\/]+\\.f(ast)?q\\.gz$", + "errorMessage": "FastQ file for reads 1 must be provided, cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'" + }, + "fastq_2": { + "type": "string", + "format": "file-path", + "exists": true, + "pattern": "^([\\S\\s]*\\/)?[^\\s\\/]+\\.f(ast)?q\\.gz$", + "errorMessage": "FastQ file for reads 2 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'" + } + }, + "required": ["sample", "fastq_1"] + } + } + ``` -Let's break down the key parts: +The key changes: -- **`type: "array"`**: The input is parsed as an array (list) of items -- **`items.type: "object"`**: Each item in the array is an object -- **`properties.greeting`**: Defines a field called `greeting` +- **`description`**: Updated to mention "greetings file" +- **`properties`**: Replaced `sample`, `fastq_1`, and `fastq_2` with a single `greeting` field - **`type: "string"`**: Must be a text string - **`pattern: "^\\S.*$"`**: Must start with a non-whitespace character (but can contain spaces after that) - **`errorMessage`**: Custom error message shown if validation fails -- **`required: ["greeting"]`**: The `greeting` field is mandatory +- **`required`**: Changed from `["sample", "fastq_1"]` to `["greeting"]` -### 3.4. Add a header to the greetings.csv file +### 2.4. Add a header to the greetings.csv file When nf-schema reads a CSV file, it expects the first row to contain column headers that match the field names in the schema. @@ -348,7 +487,7 @@ Add a header line to the greetings file: === "After" - ```csv title="assets/greetings.csv" linenums="1" + ```csv title="assets/greetings.csv" linenums="1" hl_lines="1" greeting Hello Bonjour @@ -365,15 +504,9 @@ Add a header line to the greetings file: Now the CSV file has a header that matches the field name in our schema. -### Takeaway - -You've created a JSON schema for the greetings input file and added the required header to the CSV file. - -### What's next? - -Implement the validation in the pipeline code using `samplesheetToList`. 
+The final step is to implement the validation in the pipeline code using `samplesheetToList`. -### 3.5. Implement samplesheetToList in the pipeline +### 2.5. Implement `samplesheetToList` in the pipeline Now we need to replace our simple CSV parsing with nf-schema's `samplesheetToList` function, which validates and converts the sample sheet. @@ -386,7 +519,7 @@ The `samplesheetToList` function: Let's update the input handling code: -Open [core-hello/subworkflows/local/utils_nfcore_hello_pipeline/main.nf](core-hello/subworkflows/local/utils_nfcore_hello_pipeline/main.nf) and locate the section where we create the input channel (around line 64). +Open `subworkflows/local/utils_nfcore_hello_pipeline/main.nf` and locate the section where we create the input channel (around line 80). We need to: @@ -394,24 +527,6 @@ We need to: 2. Validate and parse the input 3. Extract just the greeting strings for our workflow -!!! note "What are Nextflow plugins?" - - Plugins are extensions that add new functionality to the Nextflow language itself. They're installed via a `plugins{}` block in `nextflow.config` and can provide: - - - New functions and classes that can be imported (like `samplesheetToList`) - - New DSL features and operators - - Integration with external services - - The nf-schema plugin is specified in `nextflow.config`: - - ```groovy - plugins { - id 'nf-schema@2.1.1' - } - ``` - - Once installed, you can import functions from plugins using `include { functionName } from 'plugin/plugin-name'` syntax. - First, note that the `samplesheetToList` function is already imported at the top of the file (the nf-core template includes this by default): ```groovy title="core-hello/subworkflows/local/utils_nfcore_hello_pipeline/main.nf" linenums="1" hl_lines="13" @@ -438,15 +553,12 @@ Now update the channel creation code: === "After" - ```groovy title="core-hello/subworkflows/local/utils_nfcore_hello_pipeline/main.nf" linenums="64" hl_lines="4-8" + ```groovy title="core-hello/subworkflows/local/utils_nfcore_hello_pipeline/main.nf" linenums="80" hl_lines="4" // // Create channel from input file provided through params.input // - ch_samplesheet = Channel.fromList(samplesheetToList(params.input, "${projectDir}/assets/schema_input.json")) - .map { row -> - // Extract just the greeting string from each row - row[0] - } + ch_samplesheet = channel.fromList(samplesheetToList(params.input, "${projectDir}/assets/schema_input.json")) + .map { line -> line[0] } emit: samplesheet = ch_samplesheet @@ -455,13 +567,13 @@ Now update the channel creation code: === "Before" - ```groovy title="core-hello/subworkflows/local/utils_nfcore_hello_pipeline/main.nf" linenums="64" + ```groovy title="core-hello/subworkflows/local/utils_nfcore_hello_pipeline/main.nf" linenums="80" hl_lines="4 5" // // Create channel from input file provided through params.input // - ch_samplesheet = Channel.fromPath(params.input) - .splitCsv() - .map { line -> line[0] } + ch_samplesheet = channel.fromPath(params.input) + .splitCsv() + .map { line -> line[0] } emit: samplesheet = ch_samplesheet @@ -472,66 +584,68 @@ Let's break down what changed: 1. **`samplesheetToList(params.input, "${projectDir}/assets/schema_input.json")`**: Validates the input file against our schema and returns a list 2. **`Channel.fromList(...)`**: Converts the list into a Nextflow channel -3. **`.map { row -> row[0] }`**: Extracts just the greeting string from each validated row (accessing the first column by index) -!!! 
note "Parameter validation is enabled by default"
 
-    The nf-schema plugin is installed via the `plugins{}` block in `nextflow.config`, and the pipeline template already includes parameter validation enabled via `params.validate_params = true`. The validation is performed by the `UTILS_NFSCHEMA_PLUGIN` subworkflow during pipeline initialization.
+This completes the implementation of input data validation using `samplesheetToList` and JSON schemas.
 
-### Takeaway
+Now that we've configured the input data schema, we can remove the temporary ignore setting we added earlier.
 
-You've successfully implemented input data validation using `samplesheetToList` and JSON schemas.
+### 2.6. Re-enable input validation
 
-### What's next?
+Open `nextflow.config` and remove the `ignoreParams` line from the `validation` block:
 
-Test both parameter and input data validation to see them in action.
+=== "After"
 
-### 3.6. Test input validation
+    ```groovy title="nextflow.config" linenums="246"
+    validation {
+        defaultIgnoreParams = ["genomes"]
+        monochromeLogs = params.monochrome_logs
+    }
+    ```
+
+=== "Before"
+
+    ```groovy title="nextflow.config" hl_lines="3" linenums="246"
+    validation {
+        defaultIgnoreParams = ["genomes"]
+        ignoreParams = ['input']
+        monochromeLogs = params.monochrome_logs
+    }
+    ```
+
+Now nf-schema will validate both the pipeline parameters and the contents of the input file.
+
+### 2.7. Test input validation
 
 Let's verify that our validation works by testing both valid and invalid inputs.
 
-**Test with valid input:**
+#### 2.7.1. Test with valid input
 
 First, confirm the pipeline runs successfully with valid input:
 
 ```bash
-nextflow run core-hello --outdir core-hello-results -profile test,docker
+nextflow run . --outdir core-hello-results -profile test,docker
 ```
 
 Note that we no longer need `--validate_params false` since validation is working!
 
 ```console title="Output"
- N E X T F L O W   ~  version 25.04.3
-
-Launching `./main.nf` [nasty_kalman] DSL2 - revision: c31b966b36
-
-Input/output options
-  input                     : /private/tmp/core-hello-test/assets/greetings.csv
-  batch                     : test
-  outdir                    : core-hello-results
+------------------------------------------------------
+WARN: The following invalid input values have been detected:
 
-Institutional config options
-  config_profile_name       : Test profile
-  config_profile_description: Minimal test dataset to check pipeline function
+* --character: tux
 
-Core Nextflow options
-  runName                   : nasty_kalman
-  containerEngine           : docker
-  profile                   : test,docker
 
-!! Only displaying parameters that differ from the pipeline defaults !!
 ------------------------------------------------------
-executor >  local (7)
-[cc/cc800d] CORE_HELLO:HELLO:sayHello (1)       | 3 of 3 ✔
-[d6/46ab71] CORE_HELLO:HELLO:convertToUpper (1) | 3 of 3 ✔
-[b2/3def99] CORE_HELLO:HELLO:CAT_CAT (test)     | 1 of 1 ✔
-[a3/f82e41] CORE_HELLO:HELLO:cowpy              | 1 of 1 ✔
+executor >  local (10)
+[c1/39f64a] CORE_HELLO:HELLO:sayHello (1)       | 4 of 4 ✔
+[44/c3fb82] CORE_HELLO:HELLO:convertToUpper (4) | 4 of 4 ✔
+[62/80fab2] CORE_HELLO:HELLO:CAT_CAT (test)     | 1 of 1 ✔
+[e1/4db4fd] CORE_HELLO:HELLO:cowpy              | 1 of 1 ✔
 
-[core/hello] Pipeline completed successfully-
 ```
 
-Great! The pipeline runs successfully and validation passes silently.
+Great! The pipeline runs successfully and validation passes silently. The warning about `--character` is just informational since it's not defined in the schema. If you want, use what you've learned to add validation for that parameter too!
 
-**Test with invalid input:**
+#### 2.7.2. 
Test with invalid input Now let's test that validation catches errors. Create a test file with an invalid column name: @@ -549,34 +663,22 @@ This file uses `message` as the column name instead of `greeting`, which doesn't Try running the pipeline with this invalid input: ```bash -nextflow run core-hello --input /tmp/invalid_greetings.csv --outdir test-results -profile docker +nextflow run . --input /tmp/invalid_greetings.csv --outdir test-results -profile docker ``` ```console title="Output" - N E X T F L O W ~ version 25.04.3 - -Launching `./main.nf` [stupefied_poincare] DSL2 - revision: c31b966b36 - -Input/output options - input : /tmp/invalid_greetings.csv - outdir : test-results - -Core Nextflow options - runName : stupefied_poincare - containerEngine : docker - profile : docker - -!! Only displaying parameters that differ from the pipeline defaults !! ------------------------------------------------------- ERROR ~ Validation of pipeline parameters failed! -- Check '.nextflow.log' file for details The following invalid input values have been detected: +* Missing required parameter(s): batch * --input (/tmp/invalid_greetings.csv): Validation of file failed: - -> Entry 1: Missing required field(s): greeting + -> Entry 1: Missing required field(s): greeting + -> Entry 2: Missing required field(s): greeting + -> Entry 3: Missing required field(s): greeting - -- Check script 'subworkflows/nf-core/utils_nfschema_plugin/main.nf' at line: 39 or see '.nextflow.log' file for more details + -- Check script 'subworkflows/nf-core/utils_nfschema_plugin/main.nf' at line: 68 or see '.nextflow.log' file for more details ``` Perfect! The validation caught the error and provided a clear, helpful error message pointing to: @@ -589,7 +691,7 @@ The schema validation ensures that input files have the correct structure before ### Takeaway -You now know how to implement and test both parameter validation and input data validation. Your pipeline validates inputs before execution, providing fast feedback and clear error messages. +You've implemented and tested both parameter validation and input data validation. Your pipeline now validates inputs before execution, providing fast feedback and clear error messages. !!! tip "Further reading" @@ -597,40 +699,8 @@ You now know how to implement and test both parameter validation and input data --- -## Congratulations! - -You've completed the Hello nf-core training course! 🎉 - -Throughout this course, you've learned how to: - -- **Run nf-core pipelines** using test profiles and understand their structure -- **Create nf-core-style pipelines** from scratch using the nf-core template -- **Make workflows composable** with `take`, `main`, and `emit` blocks -- **Integrate nf-core modules** from the community repository -- **Implement parameter validation** to catch configuration errors before pipeline execution -- **Implement input data validation** to ensure sample sheets and input files are properly formatted -- **Use nf-schema tools** to manage validation schemas and test validation rules -- **Follow nf-core conventions** for code organization, configuration, and documentation - -You now have the foundational knowledge to develop production-ready Nextflow pipelines that follow nf-core best practices. Your pipeline includes proper module organization, comprehensive validation, and is ready to be extended with additional features. - -### Where to go from here - -Ready to take your skills further? 
Here are some recommended next steps: - -- **[nf-core website](https://nf-co.re/)**: Explore the full catalog of nf-core pipelines and modules -- **[nf-core documentation](https://nf-co.re/docs/)**: Deep dive into pipeline development guidelines and best practices -- **[nf-schema documentation](https://nextflow-io.github.io/nf-schema/latest/)**: Learn advanced validation techniques -- **[nf-test](https://www.nf-test.com/)**: Add comprehensive testing to your pipeline -- **[Nextflow patterns](https://nextflow-io.github.io/patterns/)**: Discover common workflow patterns and solutions -- **[Side Quests](../side_quests/index.md)**: Explore advanced Nextflow topics like metadata handling, debugging, and workflow composition - -### Get involved with the community - -The nf-core community is welcoming and always happy to help: +## What's next? -- **[nf-core Slack](https://nf-co.re/join/slack)**: Join the community to ask questions and share your work -- **[GitHub Discussions](https://github.com/nf-core/modules/discussions)**: Participate in discussions about modules and pipelines -- **[Contribute](https://nf-co.re/docs/contributing/overview)**: Consider contributing your own modules or improvements back to the community +You've completed all five parts of the Hello nf-core training course! -Thank you for completing this training. We hope you enjoyed learning about nf-core and feel confident building your own pipelines. Happy pipelining! 🚀 +Continue to the [Summary](summary.md) to reflect on what you've built and learned. diff --git a/docs/hello_nf-core/img/module-search-results.png b/docs/hello_nf-core/img/module-search-results.png new file mode 100644 index 0000000000..149e9a3566 Binary files /dev/null and b/docs/hello_nf-core/img/module-search-results.png differ diff --git a/docs/hello_nf-core/img/nf-core_demo_code_organization.svg b/docs/hello_nf-core/img/nf-core_demo_code_organization.svg new file mode 100644 index 0000000000..c0aafa5280 --- /dev/null +++ b/docs/hello_nf-core/img/nf-core_demo_code_organization.svg @@ -0,0 +1,5 @@ + + +subworkflows/workflows/demo.nffastqc/main.nfmultiqc/main.nfseqtk/trim/main.nfmain.nfincludeincludemodules/nf-core/local/utils_nfcore_demo_pipeline/main.nfnf-core/utils_*/main.nf \ No newline at end of file diff --git a/docs/hello_nf-core/img/schema_add.png b/docs/hello_nf-core/img/schema_add.png new file mode 100644 index 0000000000..5615bd86fb Binary files /dev/null and b/docs/hello_nf-core/img/schema_add.png differ diff --git a/docs/hello_nf-core/img/schema_build.png b/docs/hello_nf-core/img/schema_build.png new file mode 100644 index 0000000000..ed5eb43cf0 Binary files /dev/null and b/docs/hello_nf-core/img/schema_build.png differ diff --git a/docs/hello_nf-core/index.md b/docs/hello_nf-core/index.md index 529a0a9c41..79a9432808 100644 --- a/docs/hello_nf-core/index.md +++ b/docs/hello_nf-core/index.md @@ -10,23 +10,30 @@ hide: ![nf-core logo](./img/nf-core-logo.png) -These pipelines are designed to be modular, scalable, and portable, allowing researchers to easily adapt and execute them using their own data and compute resources. -The best practices guidelines enforced by the project further ensure that the pipelines are robust, well-documented, and validated against real-world datasets. This helps to increase the reliability and reproducibility of scientific analyses and ultimately enables researchers to accelerate their scientific discoveries. 
+The pipelines developed by the nf-core community are designed to be modular, scalable, and portable, allowing researchers to easily adapt and execute them using their own data and compute resources. +The best practices guidelines enforced by the project further ensure that the pipelines are robust, well-documented, and validated against real-world datasets. +This helps to increase the reliability and reproducibility of scientific analyses and ultimately enables researchers to accelerate their scientific discoveries. -During this training, you will be introduced to nf-core in a series of hands-on exercises. +During this training, you will be introduced to nf-core in a series of hands-on exercises as described further below. **Additional information:** You can learn more about the project's origins and governance at https://nf-co.re/about. **Reference publication:** nf-core is published in Nature Biotechnology: [Nat Biotechnol 38, 276–278 (2020). Nature Biotechnology](https://www.nature.com/articles/s41587-020-0439-x). An updated preprint is available at [bioRxiv](https://www.biorxiv.org/content/10.1101/2024.05.10.592912v1). -**Let's get started!** Click on the "Open in GitHub Codespaces" button below to launch the training environment (preferably in a separate tab), then read on while it loads. +## Audience & prerequisites + +This training is intended for learners who have at least basic Nextflow skills and wish to level up to using nf-core resources and best practices in their work. + +**Prerequisites** -[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/nextflow-io/training?quickstart=1&ref=master) +- A GitHub account OR a local installation as described [here](../envsetup/02_local). +- Experience with command line and basic scripting. +- Completed the [Hello Nextflow](../hello_nextflow/index.md) course or equivalent. ## Learning objectives -You will learn to use and develop nf-core compatible modules and pipelines, and utilize nf-core tooling effectively. +You will learn to use and develop nf-core compatible modules and pipelines, and to utilize nf-core tooling effectively. By the end of this training, you will be able to: @@ -34,15 +41,43 @@ By the end of this training, you will be able to: - Describe the code structure and project organization of nf-core pipelines - Create a basic nf-core compatible pipeline from a template - Convert basic Nextflow modules to nf-core compatible modules -- Manage inputs and parameters using nf-core tooling - Add nf-core modules to an nf-core compatible pipeline +- Validate inputs and parameters using nf-core tooling -## Audience & prerequisites +## Detailed lesson plan -This is a general-purpose training for learners who have at least basic Nextflow skills and wish to level up to using nf-core. +This training course aims to teach you the core concepts for running nf-core-style pipelines. +We won't cover everything there is to know about nf-core pipelines, because nf-core encompasses many features and conventions developed by the community over years. +Instead, we will focus on the essential concepts that will help you get started and understand how nf-core works. -**Prerequisites** +#### Part 1: Run a demo pipeline -- A GitHub account OR a local installation as described [here](../envsetup/02_local). -- Experience with command line and basic scripting. -- Completed [Hello Nextflow](../hello_nextflow/index.md) or equivalent. 
+First, you'll **run an existing nf-core pipeline** and examine its code structure to get a sense of what makes these pipelines different from basic Nextflow workflows. +The elaborate directory structure, configuration system, and standardized conventions might seem like a lot at first, but the benefits will become clear as you learn to decode and utilize these resources effectively. + +#### Part 2: Rewrite Hello for nf-core + +Next, you'll **adapt an existing workflow to the nf-core template scaffold**, starting from the simple workflow produced in the [Hello Nextflow](../hello_nextflow/index.md) course. +Many pipeline development efforts start from existing code, so learning how to restructure an existing workflow to leverage nf-core's nested workflow system is a practical skill you're likely to use repeatedly in your work. + +#### Part 3: Use an nf-core module + +Then you'll discover one of nf-core's biggest advantages: the **community modules library**. +Instead of writing every process from scratch, you'll learn to integrate pre-built, tested modules that wrap common bioinformatics tools. +This approach saves time and ensures consistency across pipelines. + +#### Part 4: Make an nf-core module + +Of course, the modules library doesn't have everything, so you'll also learn to **create your own nf-core-style module**. +You'll learn to work with the specific structure, naming conventions, and metadata requirements that make modules shareable and maintainable by the community. + +#### Part 5: Add input validation + +Finally, you'll implement **input validation** for both command-line parameters and input data files using nf-schema. +This catches errors before pipelines start to run, providing fast feedback and clear error messages. This type of upfront validation makes pipelines more robust and easier to use. + +**By the end of the course, you'll have transformed a basic Nextflow workflow into an nf-core-style pipeline with standardized structure, reusable components, and robust validation.** + +Ready to take the course? + +[Start learning :material-arrow-right:](00_orientation.md){ .md-button .md-button--primary } diff --git a/docs/hello_nf-core/summary.md b/docs/hello_nf-core/summary.md new file mode 100644 index 0000000000..1410029a56 --- /dev/null +++ b/docs/hello_nf-core/summary.md @@ -0,0 +1,43 @@ +# Summary + +Congratulations on completing the Hello nf-core training course! 🎉 + +## Your journey + +You started with a simple Nextflow workflow from the Hello Nextflow course: a straightforward pipeline that processed greetings through a few steps and added some ASCII art. +Through five parts, you've transformed that basic workflow into a production-ready nf-core pipeline. + +### What you built + +Your final `core-hello` pipeline now has: + +- **Standardized structure** using the nf-core template with organized directories for workflows, subworkflows, modules, and configuration +- **Community modules** from the nf-core repository (`cat/cat`) alongside your custom modules +- **Comprehensive validation** that checks both parameters and input data before the pipeline runs +- **Professional configuration** with profiles for different execution environments +- **Complete documentation** and metadata following nf-core conventions + +### Key skills acquired + +Through this hands-on course, you've learned to: + +1. **Navigate and understand** nf-core pipeline structure by exploring an existing pipeline +2. **Restructure workflows** to be composable and fit within the nf-core template +3. 
**Find and integrate** pre-built modules from the community repository +4. **Create custom modules** following nf-core standards for naming, structure, and metadata +5. **Implement validation** using nf-schema to catch errors early with clear feedback + +## From research script to production pipeline + +The transformation you've made illustrates the difference between a research script and a production pipeline. +Your original Hello Nextflow workflow worked fine for its purpose, but the nf-core version you've built is: + +- **More maintainable**: Standardized structure makes it easy for others (and future you) to understand +- **More reusable**: Modules can be shared across pipelines and with the community +- **More robust**: Validation catches errors before wasting compute time +- **Better documented**: Clear conventions for configuration and parameter descriptions +- **Community-ready**: Follows standards that make collaboration and contribution possible + +You're now equipped with the foundational knowledge to build production-ready nf-core pipelines that follow community best practices. + +**Thank you for completing this training. Happy pipelining!** 🚀 diff --git a/docs/side_quests/index.md b/docs/side_quests/index.md index 2eb905de89..60e52180d2 100644 --- a/docs/side_quests/index.md +++ b/docs/side_quests/index.md @@ -32,7 +32,6 @@ Otherwise, select a side quest from the table below. | -------------------------------------------------------------------------- | -------------------------- | | [Nextflow development environment walkthrough](./ide_features.md) | 45 mins | | [Essential Nextflow Scripting Patterns](./essential_scripting_patterns.md) | 90 mins | -| [Introduction to nf-core](./nf-core.md) | - | | [Metadata in workflows](./metadata.md) | 45 mins | | [Splitting and Grouping](./splitting_and_grouping.md) | 45 mins | | [Testing with nf-test](./nf-test.md) | 1 hour | diff --git a/docs/side_quests/nf-core.md b/docs/side_quests/nf-core.md deleted file mode 100644 index 6dceda9361..0000000000 --- a/docs/side_quests/nf-core.md +++ /dev/null @@ -1,1441 +0,0 @@ -# Introduction to nf-core - -nf-core is a community effort to develop and maintain a curated set of analysis pipelines built using Nextflow. It was created by several core facilities wanting to consolidate their analysis development and is governed by community members from academia and industry. It is an open community that anyone can join and contribute to. - -![nf-core logo](./img/nf-core/nf-core-logo.png) - -nf-core provides a standardized set of best practices, guidelines, and templates for building and sharing scientific pipelines. -These pipelines are designed to be modular, scalable, and portable, allowing researchers to easily adapt and execute them using their own data and compute resources. - -One of the key benefits of nf-core is that it promotes open development, testing, and peer review, ensuring that the pipelines are robust, well-documented, and validated against real-world datasets. -This helps to increase the reliability and reproducibility of scientific analyses and ultimately enables researchers to accelerate their scientific discoveries. - -nf-core is published in Nature Biotechnology: [Nat Biotechnol 38, 276–278 (2020). Nature Biotechnology](https://www.nature.com/articles/s41587-020-0439-x). An updated preprint is available at [bioRxiv](https://www.biorxiv.org/content/10.1101/2024.05.10.592912v1). 
-
-In this tutorial you will explore using and writing nf-core pipelines:
-
-- Section 1: Run an nf-core pipeline - In the first section, you will learn where you can find information about a particular nf-core pipeline and how to run one with provided test data.
-- Section 2: Develop an nf-core-like pipeline - In the second section, you will use a simplified version of the nf-core template to write an nf-core-style pipeline. The pipeline consists of two modules to process FastQ data: `fastqe` and `seqtk`. It takes its input from a sample sheet, validates it, and produces a MultiQC report.
-
----
-
-## 0. Warmup
-
-Let's move into the project directory.
-
-```bash
-cd side-quests/nf-core
-```
-
-The `nf-core` directory contains the following files:
-
-```console title="Directory contents"
-nf-core
-└── data
-    └── sequencer_samplesheet.csv
-```
-
-We will first run a pipeline in this directory and then build our own. We need the `sequencer_samplesheet.csv` for Part 2; for now you can ignore it.
-
-## 1. Run nf-core pipelines
-
-nf-core uses its website, [nf-co.re](https://nf-co.re), to centrally display all information, such as general documentation and help articles, documentation for each of its pipelines, blog posts, and event announcements.
-
-### 1.1 nf-core website
-
-Each released pipeline has a dedicated page that includes 6 documentation sections:
-
-- **Introduction:** An introduction and overview of the pipeline
-- **Usage:** Descriptions of how to execute the pipeline
-- **Parameters:** Grouped pipeline parameters with descriptions
-- **Output:** Descriptions and examples of the expected output files
-- **Results:** Example output files generated from the full test dataset
-- **Releases & Statistics:** Pipeline version history and statistics
-
-You should read the pipeline documentation carefully to understand what a given pipeline does and how it can be configured before attempting to run it.
-
-Go to the nf-core website and find the documentation for the [nf-core/demo pipeline](https://nf-co.re/demo/).
-
-Find out:
-
-- which tools the pipeline will run (Check the tab: `Introduction`)
-- which parameters the pipeline has (Check the tab: `Parameters`)
-- what the output files are (Check the tab: `Output`)
-
-#### Takeaway
-
-You know where to find information about a particular nf-core pipeline: where to find general information, where the parameters are described, and where you can find a description of the output that the pipeline produces.
-
-#### What's next?
-
-Next, we'll show you how to run your first nf-core pipeline.
-
-### 1.2 Running an nf-core pipeline
-
-Let's start by creating a new subdirectory to run the pipeline in:
-
-```bash
-mkdir nf-core-demo
-cd nf-core-demo
-```
-
-!!!tip
-
-    You can run this from anywhere, but by creating a new folder, all logs and output files that will be generated are bundled in one place.
-
-Whenever you're ready, run the command:
-
-```bash
-nextflow pull nf-core/demo
-```
-
-Nextflow will `pull` the pipeline code.
-
-```console title="Output"
-Checking nf-core/demo ...
- downloaded from https://github.com/nf-core/demo.git
- revision: 04060b4644 [master]
-```
-
-To be clear, you can do this with any Nextflow pipeline that is appropriately set up in GitHub, not just nf-core pipelines.
-However, nf-core is the largest open curated collection of Nextflow pipelines.
-
-Now that we've got the pipeline pulled, we can try running it!
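-
-!!!tip
-
-    If you want everyone on a project to run exactly the same code, you can pin a specific release at pull time. A small sketch; `1.0.1` is the release shown in the output below, and the tags available may differ per pipeline:
-
-    ```bash
-    # Pull a specific tagged release instead of the default branch
-    nextflow pull nf-core/demo -r 1.0.1
-
-    # List the pipelines Nextflow has already downloaded
-    nextflow list
-
-    # Show details (description, available revisions) for one of them
-    nextflow info nf-core/demo
-    ```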
-
-#### 1.2.1 Trying out an nf-core pipeline with the test profile
-
-Conveniently, every nf-core pipeline comes with a `test` profile.
-This is a minimal set of configuration settings for the pipeline to run using a small test dataset that is hosted on the [nf-core/test-datasets](https://github.com/nf-core/test-datasets) repository. It's a great way to try out a pipeline at small scale.
-
-The `test` profile for `nf-core/demo` is shown below:
-
-```groovy title="conf/test.config" linenums="1" hl_lines="26"
-/*
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-    Nextflow config file for running minimal tests
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-    Defines input files and everything required to run a fast and simple pipeline test.
-
-    Use as follows:
-        nextflow run nf-core/demo -profile test,<docker/singularity> --outdir <OUTDIR>
-
-----------------------------------------------------------------------------------------
-*/
-
-process {
-    resourceLimits = [
-        cpus: 4,
-        memory: '15.GB',
-        time: '1.h'
-    ]
-}
-
-params {
-    config_profile_name        = 'Test profile'
-    config_profile_description = 'Minimal test dataset to check pipeline function'
-
-    // Input data
-    input = 'https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/samplesheet/samplesheet_test_illumina_amplicon.csv'
-
-}
-```
-
-This tells us that the `nf-core/demo` `test` profile already specifies the input parameter, so you don't have to provide any input yourself.
-However, the `outdir` parameter is not included in the `test` profile, so you have to add it to the execution command using the `--outdir` flag.
-
-Here, we're also going to specify `-profile docker`, which by nf-core convention enables the use of Docker.
-
-Let's try it!
-
-```bash
-nextflow run nf-core/demo -profile docker,test --outdir results
-```
-
-Here's the console output from the pipeline:
-
-```console title="Output"
- N E X T F L O W   ~  version 24.10.0
-
-Launching `https://github.com/nf-core/demo` [maniac_jones] DSL2 - revision: 04060b4644 [master]
-
-
-------------------------------------------------------
-                                        ,--./,-.
-        ___     __   __   __   ___     /,-._.--~'
-  |\ | |__  __ /  ` /  \ |__) |__         }  {
-  | \| |       \__, \__/ |  \ |___     \`-._,-`-,
-                                        `._,._,'
-  nf-core/demo 1.0.1
-------------------------------------------------------
-Input/output options
-  input                     : https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/samplesheet/samplesheet_test_illumina_amplicon.csv
-  outdir                    : results
-
-Institutional config options
-  config_profile_name       : Test profile
-  config_profile_description: Minimal test dataset to check pipeline function
-
-Core Nextflow options
-  revision                  : master
-  runName                   : maniac_jones
-  containerEngine           : docker
-  launchDir                 : /workspaces/training/side-quests/nf-core/nf-core-demo
-  workDir                   : /workspaces/training/side-quests/nf-core/nf-core-demo/work
-  projectDir                : /workspaces/.nextflow/assets/nf-core/demo
-  userName                  : gitpod
-  profile                   : docker,test
-  configFiles               :
-
-!! Only displaying parameters that differ from the pipeline defaults !!
--------------------------------------------------------
-* The pipeline
-    https://doi.org/10.5281/zenodo.12192442
-
-* The nf-core framework
-    https://doi.org/10.1038/s41587-020-0439-x
-
-* Software dependencies
-    https://github.com/nf-core/demo/blob/master/CITATIONS.md
-
-executor >  local (7)
-[3c/a00024] NFC…_DEMO:DEMO:FASTQC (SAMPLE2_PE)     | 3 of 3 ✔
-[94/d1d602] NFC…O:DEMO:SEQTK_TRIM (SAMPLE2_PE)     | 3 of 3 ✔
-[ab/460670] NFCORE_DEMO:DEMO:MULTIQC               | 1 of 1 ✔
--[nf-core/demo] Pipeline completed successfully-
-Completed at: 05-Mar-2025 09:46:21
-Duration    : 1m 54s
-CPU hours   : (a few seconds)
-Succeeded   : 7
-```
-
-Isn't that neat?
-
-You can also explore the `results` directory produced by the pipeline.
-
-```console title="Output"
-results/
-├── fastqc
-│   ├── SAMPLE1_PE
-│   ├── SAMPLE2_PE
-│   └── SAMPLE3_SE
-├── fq
-│   ├── SAMPLE1_PE
-│   ├── SAMPLE2_PE
-│   └── SAMPLE3_SE
-├── multiqc
-│   ├── multiqc_data
-│   ├── multiqc_plots
-│   └── multiqc_report.html
-└── pipeline_info
-    ├── execution_report_2025-03-05_09-44-26.html
-    ├── execution_timeline_2025-03-05_09-44-26.html
-    ├── execution_trace_2025-03-05_09-44-26.txt
-    ├── nf_core_pipeline_software_mqc_versions.yml
-    ├── params_2025-03-05_09-44-29.json
-    └── pipeline_dag_2025-03-05_09-44-26.html
-```
-
-If you're curious about what that all means, check out [the nf-core/demo pipeline documentation page](https://nf-co.re/demo/1.0.1/)!
-
-And that's all you need to know for now.
-Congratulations! You have now run your first nf-core pipeline.
-
-#### Takeaway
-
-You know how to run an nf-core pipeline using its built-in test profile.
-
-#### What's next?
-
-Celebrate and take a break! Next, we'll show you how to use nf-core tooling to build your own pipeline.
-
-## 2. Create a basic pipeline from template
-
-We will now start developing our own nf-core-style pipeline.
-The nf-core collection currently offers [72 subworkflows](https://nf-co.re/subworkflows/) and [over 1300 modules](https://nf-co.re/modules/) that you can use to build your own pipelines. Subworkflows are 'composable' workflows, such as those you may have encountered in the [Workflows of workflows side quest](./workflows_of_workflows.md), providing ready-made chunks of logic you can use in your own workflows.
-
-The nf-core community provides a [command line tool](https://nf-co.re/docs/nf-core-tools) with helper functions to use and develop pipelines, including functions to install those components.
-
-We have pre-installed nf-core tools, and here we will use them to create and develop a new pipeline.
-
-View all of the available tooling using the `nf-core --help` command.
-
-```bash
-nf-core --help
-```
-
-### 2.1 Creating your pipeline
-
-Before we start, let's create a new subfolder in the current `nf-core` directory:
-
-```bash
-cd ..
-mkdir nf-core-pipeline
-cd nf-core-pipeline
-```
-
-!!! hint "Open a new window in VSCode"
-
-    If you are working with VS Code, you can open a new window to reduce visual clutter:
-
-    ```bash
-    code .
-    ```
-
-Let's start by creating a new pipeline with the `nf-core pipelines create` command.
-
-All nf-core pipelines are based on a common template, a standardized pipeline skeleton that can be used to streamline development with shared features and components.
-
-The `nf-core pipelines create` command creates a new pipeline using the nf-core base template with a pipeline name, description, and author. It is the first and most important step for creating a pipeline that will integrate with the wider Nextflow ecosystem.
- -```bash -nf-core pipelines create -``` - -Running this command will open a Text User Interface (TUI) for pipeline creation. - -
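-
-!!!note
-
-    The TUI is the recommended route for a first pipeline, but pipeline creation can also be scripted. As a sketch (the exact flags vary between nf-core tools versions, so check `nf-core pipelines create --help` first):
-
-    ```bash
-    # Hypothetical non-interactive creation with the same details
-    # we will enter in the TUI below
-    nf-core pipelines create \
-        --organisation myorg \
-        --name myfirstpipeline \
-        --description "My first pipeline" \
-        --author "My Name"
-    ```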
- -
-
-Template features can be flexibly included or excluded at the time of creation. Follow these steps to create your first pipeline using the `nf-core pipelines create` TUI:
-
-1. Run the `nf-core pipelines create` command
-2. Select **Let's go!** on the welcome screen
-3. Select **Custom** on the Choose pipeline type screen
-4. Enter your pipeline details, replacing < YOUR NAME > with your own name, then select **Next**
-
-- **GitHub organisation:** myorg
-- **Workflow name:** myfirstpipeline
-- **A short description of your pipeline:** My first pipeline
-- **Name of the main author / authors:** < YOUR NAME >
-
-5. On the Template features screen, set "Toggle all features" to **off**, then **enable**:
-
-- `Add configuration files`
-- `Use multiqc`
-- `Use nf-core components`
-- `Use nf-schema`
-- `Add documentation`
-- `Add testing profiles`
-
-6. Select **Finish** on the Final details screen
-7. Wait for the pipeline to be created, then select **Continue**
-8. Select **Finish without creating a repo** on the Create GitHub repository screen
-9. Select **Close** on the HowTo create a GitHub repository page
-
-If run successfully, you will see a new folder in your current directory named `myorg-myfirstpipeline`.
-
-#### 2.1.1 Testing your pipeline
-
-Let's try to run our new pipeline:
-
-```bash
-cd myorg-myfirstpipeline
-nextflow run . -profile docker,test --outdir results
-```
-
-The pipeline should run successfully!
-
-Here's the console output from the pipeline:
-
-```console title="Output"
- N E X T F L O W   ~  version 24.10.0
-
-Launching `./main.nf` [infallible_kilby] DSL2 - revision: fee0bcf390
-
-Downloading plugin nf-schema@2.3.0
-Input/output options
-  input                     : https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/samplesheet/samplesheet_test_illumina_amplicon.csv
-  outdir                    : results
-
-Institutional config options
-  config_profile_name       : Test profile
-  config_profile_description: Minimal test dataset to check pipeline function
-
-Generic options
-  trace_report_suffix       : 2025-03-05_10-17-59
-
-Core Nextflow options
-  runName                   : infallible_kilby
-  containerEngine           : docker
-  launchDir                 : /workspaces/training/side-quests/nf-core/nf-core-pipeline/myorg-myfirstpipeline
-  workDir                   : /workspaces/training/side-quests/nf-core/nf-core-pipeline/myorg-myfirstpipeline/work
-  projectDir                : /workspaces/training/side-quests/nf-core/nf-core-pipeline/myorg-myfirstpipeline
-  userName                  : gitpod
-  profile                   : docker,test
-  configFiles               : /workspaces/training/side-quests/nf-core/nf-core-pipeline/myorg-myfirstpipeline/nextflow.config
-
-!! Only displaying parameters that differ from the pipeline defaults !!
-------------------------------------------------------
-executor >  local (1)
-[02/510003] MYO…PELINE:MYFIRSTPIPELINE:MULTIQC | 1 of 1 ✔
--[myorg/myfirstpipeline] Pipeline completed successfully-
-```
-
-Let's dissect what we are seeing.
-
-The nf-core pipeline template is a working pipeline and comes preconfigured with some modules. Here, we only run [MultiQC](https://multiqc.info/).
-
-At the top, you see all parameters displayed that differ from the pipeline defaults. Most of these are defaults or were set by applying the `test` profile.
-
-Additionally, we used the `docker` profile to use Docker for software packaging. nf-core provides this as a profile for convenience to enable the Docker feature, but we could achieve the same with configuration, as we did in earlier training modules.
-
-#### 2.1.2 Template tour
-
-The nf-core pipeline template comes packed with a lot of files and folders.
While creating the pipeline, we selected a subset of the nf-core features. The features we selected are now included as files and directories in our repository. - -While the template may feel overwhelming, a complete understanding isn't required to start developing your pipeline. Let's look at the important places that we need to touch during pipeline development. - -##### Workflows, subworkflows, and modules - -The nf-core pipeline template has a `main.nf` script that calls `myfirstpipeline.nf` from the `workflows` folder. The `myfirstpipeline.nf` file inside the workflows folder is the central pipeline file that is used to bring everything else together. - -Instead of having one large monolithic pipeline script, it's broken up into smaller script components, namely, modules and subworkflows: - -- **Modules:** Wrappers around a single process -- **Subworkflows:** Two or more modules that are packaged together as a mini workflow - -
- --8<-- "docs/side_quests/img/nf-core/nested.excalidraw.svg" -
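-
-For a sense of what this looks like in code, a subworkflow is just a named `workflow` block with `take`/`main`/`emit` sections that wires two or more modules together behind a single interface. A minimal sketch with hypothetical module names:
-
-```groovy title="subworkflows/local/my_subworkflow/main.nf"
-// Hypothetical modules, included as in any workflow file
-include { TOOL_ONE } from '../../../modules/nf-core/tool_one/main'
-include { TOOL_TWO } from '../../../modules/nf-core/tool_two/main'
-
-workflow MY_SUBWORKFLOW {
-    take:
-    ch_input // channel: [ val(meta), path(files) ]
-
-    main:
-    // Chain the two modules: the output of the first feeds the second
-    TOOL_ONE(ch_input)
-    TOOL_TWO(TOOL_ONE.out.result)
-
-    emit:
-    result   = TOOL_TWO.out.result
-    versions = TOOL_ONE.out.versions.mix(TOOL_TWO.out.versions)
-}
-```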
-
-Within your pipeline repository, `modules` and `subworkflows` are stored within `local` and `nf-core` folders. The `nf-core` folder is for components that have come from the nf-core GitHub repository, while the `local` folder is for components that have been developed independently (usually things very specific to a pipeline):
-
-```console
-modules/
-├── local
-│   └── <tool>.nf
-│   .
-│
-└── nf-core
-    ├── <tool>
-    │   ├── environment.yml
-    │   ├── main.nf
-    │   ├── meta.yml
-    │   └── tests
-    │       ├── main.nf.test
-    │       ├── main.nf.test.snap
-    │       └── tags.yml
-    .
-```
-
-Modules from nf-core follow a similar structure and contain a small number of additional files for testing using [nf-test](https://www.nf-test.com/) and documentation about the module.
-
-!!!note
-
-    Some nf-core modules are also split into command-specific directories:
-
-    ```console
-    │
-    └── <tool>
-        └── <subcommand>
-            ├── environment.yml
-            ├── main.nf
-            ├── meta.yml
-            └── tests
-                ├── main.nf.test
-                ├── main.nf.test.snap
-                ├── nextflow.config
-                └── tags.yml
-    ```
-
-!!!note
-
-    The nf-core template does not come with a local modules folder by default.
-
-##### Configuration files
-
-The nf-core pipeline template utilizes Nextflow's flexible customization options and has a series of configuration files throughout the template.
-
-In the template, the `nextflow.config` file is a central configuration file and is used to set default values for parameters and other configuration options. The majority of these configuration options are applied by default, while others (e.g., software dependency profiles) are included as optional profiles.
-
-There are several configuration files that are stored in the `conf` folder and are added to the configuration by default or optionally as profiles:
-
-- `base.config`: A 'blank slate' config file, appropriate for general use on most high-performance computing environments. This defines broad bins of resource usage, for example, which are convenient to apply to modules.
-- `modules.config`: Additional module directives and arguments.
-- `test.config`: A profile to run the pipeline with minimal test data.
-- `test_full.config`: A profile to run the pipeline with a full-sized test dataset.
-
-##### `nextflow_schema.json`
-
-The `nextflow_schema.json` is a file used to store parameter-related information, including type, description, and help text, in a machine-readable format. The schema is used for various purposes, including automated parameter validation, help text generation, and interactive parameter form rendering in UI interfaces.
-
-##### `assets/schema_input.json`
-
-The `schema_input.json` is a file used to define the input samplesheet structure. Each column can have a type, pattern, description, and help text in a machine-readable format. The schema is used for various purposes, including automated validation and providing helpful error messages.
-
-#### Takeaway
-
-You used the nf-core tooling to create a template pipeline. You customized it with the components you wanted to use for this pipeline, focusing on a handful of important ones. You also learned about each of the pieces you have installed and have a general idea of the locations of important files. Lastly, you checked that the template pipeline works by running it as is.
-
-#### What's next?
-
-Congratulations, and take a break! In the next step, we will investigate the default input data that the pipeline comes with.
-
----
-
-### 2.2 Check the input data
-
-Above, we said that the `test` profile comes with small test files that are stored in the nf-core [test-datasets](https://github.com/nf-core/test-datasets) repository.
-Let's check what type of files we are dealing with to plan our expansion. Remember that we can inspect any channel content using the `view` operator:
-
-```groovy title="workflows/myfirstpipeline.nf" linenums="27"
-ch_samplesheet.view()
-```
-
-!!!note
-
-    nf-core makes heavy use of workflow encapsulation. The `main.nf` that you used in the hello series imports and calls the workflow in the file `workflows/myfirstpipeline.nf`. This is the file we will work in today.
-
-and run the pipeline:
-
-```bash
-nextflow run . -profile docker,test --outdir results
-```
-
-The output should look like the example below. We see that we have FASTQ files as input, and each set of files is accompanied by some metadata: the `id` and whether or not they are single-end:
-
-```console title="Output"
-[['id':'SAMPLE1_PE', 'single_end':false], [/nf-core/test-datasets/viralrecon/illumina/amplicon/sample1_R1.fastq.gz, /nf-core/test-datasets/viralrecon/illumina/amplicon/sample1_R2.fastq.gz]]
-[['id':'SAMPLE2_PE', 'single_end':false], [/nf-core/test-datasets/viralrecon/illumina/amplicon/sample2_R1.fastq.gz, /nf-core/test-datasets/viralrecon/illumina/amplicon/sample2_R2.fastq.gz]]
-[['id':'SAMPLE3_SE', 'single_end':true], [/nf-core/test-datasets/viralrecon/illumina/amplicon/sample1_R1.fastq.gz, /nf-core/test-datasets/viralrecon/illumina/amplicon/sample2_R1.fastq.gz]]
-```
-
-You can comment out the `view` statement for now. We will use it again later during this training to inspect the channel content.
-
-#### Takeaway
-
-The pipeline template comes with a default samplesheet. You learned what is part of this samplesheet so you can use it in the next steps, when we add and run modules in the pipeline.
-
-#### What's next?
-
-In the next step, you will start adding your first nf-core module to the pipeline: seqtk.
-
----
-
-### 2.3 Add an nf-core module
-
-nf-core provides a large library of modules and subworkflows: pre-made Nextflow wrappers around tools that can be installed into Nextflow pipelines. They are designed to be flexible but may require additional configuration to suit different use cases.
-
-Currently, there are more than [1400 nf-core modules](https://nf-co.re/modules) and [70 nf-core subworkflows](https://nf-co.re/subworkflows) available (as of March 2025). Modules and subworkflows can be listed, installed, updated, removed, and patched using nf-core tooling.
-
-While you could develop any module independently, you can save a lot of time and effort by leveraging nf-core modules and subworkflows.
-
-Let's see which modules are available:
-
-```console
-nf-core modules list remote
-```
-
-This command lists all currently available modules (> 1400). An easier way to find them is to go to the nf-core website and visit the modules subpage [https://nf-co.re/modules](https://nf-co.re/modules). Here you can search for modules by name or tags, find documentation for each module, and see which nf-core pipelines are using the module:
-
-![nf-core/modules](./img/nf-core/nf-core-modules.png)
-
-#### 2.3.1 Install an nf-core module
-
-Now let's add another tool to the pipeline.
-
-`Seqtk` is a fast and lightweight tool for processing sequences in the FASTA or FASTQ format. Here, you will use the [`seqtk trim`](https://github.com/lh3/seqtk) command to trim FASTQ files.
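-
-To get a feel for what the module will run for us, this is roughly the command it wraps (a sketch with hypothetical filenames; we will inspect the exact resolved command later via `.command.sh`):
-
-```bash
-# seqtk reads an (optionally gzipped) FASTQ file, trims it, and writes FASTQ
-# to stdout, which we re-compress with gzip
-seqtk trimfq sample1_R1.fastq.gz | gzip --no-name > trimmed_sample1_R1.fastq.gz
-```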
-
-In your pipeline, you will add a new step that takes the FASTQ files from the sample sheet as input. It will produce trimmed FASTQ files that can be used as input for other tools, as well as version information about the seqtk tool to mix into the inputs of the MultiQC process.
-
- --8<-- "docs/side_quests/img/nf-core/pipeline.excalidraw.svg" -
- -The `nf-core modules install` command can be used to install the `seqtk/trim` module directly from the nf-core repository: - -``` -nf-core modules install -``` - -!!!warning - - You need to be in the myorg-myfirstpipeline directory when executing `nf-core modules install` - -You can follow the prompts to find and install the module you are interested in: - -```console -? Tool name: seqtk/trim -``` - -Once selected, the tooling will install the module in the `modules/nf-core/` folder and suggest code that you can add to your main workflow file (`workflows/myfirstpipeline.nf`). - -```console -INFO Installing 'seqtk/trim' -INFO Use the following statement to include this module: - -include { SEQTK_TRIM } from '../modules/nf-core/seqtk/trim/main' -``` - -To enable reporting and reproducibility, modules and subworkflows from the nf-core repository are tracked using hashes in the `modules.json` file. When modules are installed or removed using the nf-core tooling the `modules.json` file will be automatically updated. - -When you open the `modules.json`, you will see an entry for each module that is currently installed from the nf-core modules repository. You can open the file with the VS Code user interface by clicking on it in `myorg-myfirstpipeline/modules.json`: - -```console -"nf-core": { - "multiqc": { - "branch": "master", - "git_sha": "cf17ca47590cc578dfb47db1c2a44ef86f89976d", - "installed_by": ["modules"] - }, - "seqtk/trim": { - "branch": "master", - "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", - "installed_by": ["modules"] - } -} -``` - -#### 2.3.2 Add the module to your pipeline - -Although the module has been installed in your local pipeline repository, it is not yet added to your pipeline. - -The suggested `include` statement needs to be added to your `workflows/myfirstpipeline.nf` file and the process call (with inputs) needs to be added to the workflow block. - -```groovy title="workflows/myfirstpipeline.nf" linenums="6" -include { SEQTK_TRIM } from '../modules/nf-core/seqtk/trim/main' -include { MULTIQC } from '../modules/nf-core/multiqc/main' -``` - -To add the `SEQTK_TRIM` module to your workflow you will need to check what inputs are required. - -You can view the input channels for the module by opening the `./modules/nf-core/seqtk/trim/main.nf` file. - -```groovy title="modules/nf-core/seqtk/trim/main.nf" linenums="11" -input: -tuple val(meta), path(reads) -``` - -Each nf-core module also has a `meta.yml` file which describes the inputs and outputs. This meta file is rendered on the [nf-core website](https://nf-co.re/modules/seqtk_trim), or can be viewed using the `nf-core modules info` command: - -```console -nf-core modules info seqtk/trim -``` - -It outputs a table with all defined inputs and outputs of the module: - -```console title="Output" - -╭─ Module: seqtk/trim ─────────────────────────────────────────────────────────────────────────────╮ -│ Location: modules/nf-core/seqtk/trim │ -│ 🔧 Tools: seqtk │ -│ 📖 Description: Trim low quality bases from FastQ files │ -╰───────────────────────────────────────────────────────────────────────────────────────────────────╯ - ╷ ╷ - 📥 Inputs │Description │ Pattern -╺━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━╸ - input[0] │ │ -╶──────────────┼───────────────────────────────────────────────────────────────────────┼────────────╴ - meta (map) │Groovy Map containing sample information e.g. 
[ id:'test',                                                           │
-               │single_end:false ]                                                    │
-╶──────────────┼───────────────────────────────────────────────────────────────────────┼────────────╴
-  reads (file) │List of input FastQ files                                              │*.{fastq.gz}
-               ╵                                                                       ╵
-                      ╷                                                                ╷
-  📥 Outputs          │Description                                                     │     Pattern
-╺━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━╸
-  reads               │                                                                │
-╶─────────────────────┼────────────────────────────────────────────────────────────────┼────────────╴
-  meta (map)          │Groovy Map containing sample information e.g. [ id:'test',      │
-                      │single_end:false ]                                              │
-╶─────────────────────┼────────────────────────────────────────────────────────────────┼────────────╴
-  *.fastq.gz (file)   │Filtered FastQ files                                            │*.{fastq.gz}
-╶─────────────────────┼────────────────────────────────────────────────────────────────┼────────────╴
-  versions            │                                                                │
-╶─────────────────────┼────────────────────────────────────────────────────────────────┼────────────╴
-  versions.yml (file) │File containing software versions                               │versions.yml
-                      ╵                                                                ╵
-
- Use the following statement to include this module:
-
- include { SEQTK_TRIM } from '../modules/nf-core/seqtk/trim/main'
-```
-
-Using this module information, you can work out what inputs are required for the `SEQTK_TRIM` process:
-
-1. `tuple val(meta), path(reads)`
-
-    - A tuple (basically a fixed-length list) with a meta _map_ (we will talk about meta maps more in the next section) and a list of FASTQ _files_
-    - The existing `ch_samplesheet` channel can be used as the reads input.
-
-Only one input channel is required, and it already exists, so it can be added to your `myfirstpipeline.nf` file without any additional channel creation or modifications.
-
-_Before:_
-
-```groovy title="workflows/myfirstpipeline.nf" linenums="30"
-    //
-    // Collate and save software versions
-    //
-```
-
-_After:_
-
-```groovy title="workflows/myfirstpipeline.nf" linenums="29"
-    //
-    // MODULE: Run SEQTK_TRIM
-    //
-    SEQTK_TRIM (
-        ch_samplesheet
-    )
-    //
-    // Collate and save software versions
-    //
-```
-
-Let's test it:
-
-```bash
-nextflow run . -profile docker,test --outdir results
-```
-
-```console title="Output"
- N E X T F L O W   ~  version 24.10.0
-
-Launching `./main.nf` [admiring_davinci] DSL2 - revision: fee0bcf390
-
-Input/output options
-  input                     : https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/samplesheet/samplesheet_test_illumina_amplicon.csv
-  outdir                    : results
-
-Institutional config options
-  config_profile_name       : Test profile
-  config_profile_description: Minimal test dataset to check pipeline function
-
-Generic options
-  trace_report_suffix       : 2025-03-05_10-40-35
-
-Core Nextflow options
-  runName                   : admiring_davinci
-  containerEngine           : docker
-  launchDir                 : /workspaces/training/side-quests/nf-core/nf-core-pipeline/myorg-myfirstpipeline
-  workDir                   : /workspaces/training/side-quests/nf-core/nf-core-pipeline/myorg-myfirstpipeline/work
-  projectDir                : /workspaces/training/side-quests/nf-core/nf-core-pipeline/myorg-myfirstpipeline
-  userName                  : gitpod
-  profile                   : docker,test
-  configFiles               : /workspaces/training/side-quests/nf-core/nf-core-pipeline/myorg-myfirstpipeline/nextflow.config
-
-!! Only displaying parameters that differ from the pipeline defaults !!
--------------------------------------------------------
-executor >  local (4)
-[a8/d4ccea] MYO…PELINE:SEQTK_TRIM (SAMPLE1_PE) | 3 of 3 ✔
-[fb/d907c3] MYO…PELINE:MYFIRSTPIPELINE:MULTIQC | 1 of 1 ✔
--[myorg/myfirstpipeline] Pipeline completed successfully-
-```
-
-#### 2.3.3 Inspect the results folder
-
-Default nf-core configuration directs the output of each process into a tool-named subdirectory of the output directory (`<outdir>/<tool>/`). After running the previous command, you
-should have a `results` folder that looks something like this:
-
-```console
-results/
-├── multiqc
-│   ├── multiqc_data
-│   └── multiqc_report.html
-├── pipeline_info
-│   ├── execution_report_2025-03-05_10-17-59.html
-│   ├── execution_report_2025-03-05_10-28-16.html
-│   ├── execution_report_2025-03-05_10-40-35.html
-│   ├── execution_timeline_2025-03-05_10-17-59.html
-│   ├── execution_timeline_2025-03-05_10-28-16.html
-│   ├── execution_timeline_2025-03-05_10-40-35.html
-│   ├── execution_trace_2025-03-05_10-17-59.txt
-│   ├── execution_trace_2025-03-05_10-28-16.txt
-│   ├── execution_trace_2025-03-05_10-40-35.txt
-│   ├── myfirstpipeline_software_mqc_versions.yml
-│   ├── params_2025-03-05_10-18-03.json
-│   ├── params_2025-03-05_10-28-19.json
-│   ├── params_2025-03-05_10-40-37.json
-│   ├── pipeline_dag_2025-03-05_10-17-59.html
-│   ├── pipeline_dag_2025-03-05_10-28-16.html
-│   └── pipeline_dag_2025-03-05_10-40-35.html
-└── seqtk
-    ├── SAMPLE1_PE_sample1_R1.fastq.gz
-    ├── SAMPLE1_PE_sample1_R2.fastq.gz
-    ├── SAMPLE2_PE_sample2_R1.fastq.gz
-    ├── SAMPLE2_PE_sample2_R2.fastq.gz
-    ├── SAMPLE3_SE_sample1_R1.fastq.gz
-    └── SAMPLE3_SE_sample2_R1.fastq.gz
-```
-
-The outputs from the `multiqc` and `seqtk` modules are published in their respective subdirectories. In addition, by default, nf-core pipelines generate a set of reports. These files are stored in the `pipeline_info` subdirectory and time-stamped so that runs don't overwrite each other.
-
-#### 2.3.4 Handle module outputs
-
-As with the inputs, you can view the outputs for the module by opening the `./modules/nf-core/seqtk/trim/main.nf` file, using the `nf-core modules info seqtk/trim` command, or checking the `meta.yml`.
-
-```groovy title="modules/nf-core/seqtk/trim/main.nf" linenums="13"
-output:
-tuple val(meta), path("*.fastq.gz"), emit: reads
-path "versions.yml"                , emit: versions
-```
-
-To help with organization and readability, it is beneficial to create named output channels.
-
-For `SEQTK_TRIM`, the `reads` output could be put into a channel named `ch_trimmed`.
-
-```groovy title="workflows/myfirstpipeline.nf" linenums="32"
-ch_trimmed = SEQTK_TRIM.out.reads
-```
-
-All nf-core modules have a common output channel: `versions`. The channel contains a file that lists the tool version used in the module. MultiQC can collect all tool versions and print them out in a table in the results folder. This is useful for tracking later which version was actually run.
-
-It is beneficial to immediately mix the tool versions into the `ch_versions` channel so they can be used as input for the `MULTIQC` process and passed to the final report.
-
-```groovy title="workflows/myfirstpipeline.nf" linenums="33"
-ch_versions = ch_versions.mix(SEQTK_TRIM.out.versions.first())
-```
-
-!!! note
-
-    The `first` operator is used to emit the first item from `SEQTK_TRIM.out.versions` to avoid duplication.
-
-#### 2.3.5 Add a parameter to the `seqtk/trim` tool
-
-nf-core modules should be flexible and usable across many different pipelines. Therefore, optional tool parameters are typically not set in an nf-core module.
-Instead, additional configuration options for how to run the tool, such as its parameters or output filenames, can be applied to a module using the `conf/modules.config` file at the pipeline level. Process selectors (e.g., `withName`) are used to apply configuration options to modules selectively. Process selectors must be used within the `process` scope.
-
-The parameters or arguments of a tool can be changed using the directive `args`. You can find many examples of how arguments are added to modules in nf-core pipelines, for example, the nf-core/demo [modules.config](https://github.com/nf-core/demo/blob/master/conf/modules.config) file.
-
-Add this snippet to your `conf/modules.config` file (using the `process` scope) to call the `seqtk/trim` tool with the argument `-b 5` to trim 5 bp from the left end of each read:
-
-```groovy title="conf/modules.config" linenums="21"
-withName: 'SEQTK_TRIM' {
-    ext.args = "-b 5"
-}
-```
-
-Run the pipeline again and check if the new parameter is applied:
-
-```bash
-nextflow run . -profile docker,test --outdir results
-```
-
-```console title="Output"
-[67/cc3d2f] process > MYORG_MYFIRSTPIPELINE:MYFIRSTPIPELINE:SEQTK_TRIM (SAMPLE1_PE) [100%] 3 of 3 ✔
-[b4/a1b41b] process > MYORG_MYFIRSTPIPELINE:MYFIRSTPIPELINE:MULTIQC               [100%] 1 of 1 ✔
-```
-
-Copy the hash shown for the `SEQTK_TRIM` task in your console output (it is different for _each_ run; in the example below it was `6c/34e549`). You can `ls` using tab-completion in your `work` directory to expand the complete hash.
-In this folder you will find various log files. The `.command.sh` file contains the resolved command:
-
-```bash
-less work/6c/34e549912696b6757f551603d135bb/.command.sh
-```
-
-We can see that the parameter `-b 5` that we set in the `modules.config` is applied to the task:
-
-```console title="Output"
-#!/usr/bin/env bash -C -e -u -o pipefail
-printf "%s\n" sample2_R1.fastq.gz sample2_R2.fastq.gz | while read f;
-do
-    seqtk \
-        trimfq \
-        -b 5 \
-        $f \
-        | gzip --no-name > SAMPLE2_PE_$(basename $f)
-done
-
-cat <<-END_VERSIONS > versions.yml
-"MYORG_MYFIRSTPIPELINE:MYFIRSTPIPELINE:SEQTK_TRIM":
-    seqtk: $(echo $(seqtk 2>&1) | sed 's/^.*Version: //; s/ .*$//')
-END_VERSIONS
-```
-
-#### Takeaway
-
-You changed the pipeline template and added the nf-core module `seqtk` to your pipeline. You then changed the default tool command by editing the `modules.config` for this tool. You also made the output available in the workflow so it can be used by other modules in the pipeline.
-
-#### What's next?
-
-In the next step, we will add a pipeline parameter to allow users to skip the trimming step run by `seqtk`.
-
----
-
-### 2.4 Adding parameters to your pipeline
-
-Any option that a pipeline user may want to configure regularly, whether in the specific modules used or the options passed to them, should be made into a pipeline-level parameter so it can easily be overridden. nf-core defines some standards for providing parameters.
-
-Here, as a simple example, you will add a new parameter to your pipeline that will skip the `SEQTK_TRIM` process.
-That parameter will be accessible in the pipeline script, and we can use it to control how the pipeline runs.
-
-#### 2.4.1 Default values
-
-In the nf-core template, the default values for parameters are set in the `nextflow.config` in the base repository.
-
-Any new parameters should be added to the `nextflow.config` with a default value within the `params` scope.
-
-Parameter names should be unique and easily identifiable.
-
-We can add a new parameter `skip_trim` to the `nextflow.config` file and set it to `false`.
-
-```groovy title="nextflow.config" linenums="15"
-// Trimming
-skip_trim = false
-```
-
-#### 2.4.2 Using the parameter
-
-Let's add an `if` statement that depends on the `skip_trim` parameter to control the execution of the `SEQTK_TRIM` process:
-
-```groovy title="workflows/myfirstpipeline.nf" linenums="29"
-    //
-    // MODULE: Run SEQTK_TRIM
-    //
-    if (!params.skip_trim) {
-        SEQTK_TRIM (
-            ch_samplesheet
-        )
-        ch_trimmed  = SEQTK_TRIM.out.reads
-        ch_versions = ch_versions.mix(SEQTK_TRIM.out.versions.first())
-    }
-```
-
-Here, an `if` statement that depends on the `skip_trim` parameter is used to control the execution of the `SEQTK_TRIM` process. An `!` can be used to imply the logical "not".
-
-Thus, if the `skip_trim` parameter is **not** `true`, the `SEQTK_TRIM` process will be executed.
-
-Now that your `if` statement has been added to your main workflow file and has a default setting in your `nextflow.config` file, you will be able to flexibly skip the new trimming step using the `skip_trim` parameter.
-
-We can now run the pipeline with the new `skip_trim` parameter to check it is working:
-
-```console
-nextflow run . -profile test,docker --outdir results --skip_trim
-```
-
-You should see that the `SEQTK_TRIM` process has been skipped in your execution:
-
-```console title="Output"
-!! Only displaying parameters that differ from the pipeline defaults !!
-------------------------------------------------------
-WARN: The following invalid input values have been detected:
-
-* --skip_trim: true
-
-
-executor >  local (1)
-[7b/8b60a0] process > MYORG_MYFIRSTPIPELINE:MYFIRSTPIPELINE:MULTIQC [100%] 1 of 1 ✔
--[myorg/myfirstpipeline] Pipeline completed successfully-
-```
-
-#### 2.4.3 Validate input parameters
-
-When we ran the pipeline, we saw a warning message:
-
-```console
-WARN: The following invalid input values have been detected:
-
-* --skip_trim: true
-```
-
-Parameters are validated through the `nextflow_schema.json` file. This file is also used by the nf-core website (for example, in [nf-core/mag](https://nf-co.re/mag/3.2.1/parameters/)) to render the parameter documentation and print the pipeline help message (`nextflow run . --help`). If you have added parameters and they have not been documented in the `nextflow_schema.json` file, then the input validation does not recognize the parameter.
-
-The `nextflow_schema.json` file can get very big and very complicated very quickly, and is hard to edit manually. Fortunately, the `nf-core pipelines schema build` command is designed to help developers write, check, validate, and propose additions to your `nextflow_schema.json` file.
-
-```console
-nf-core pipelines schema build
-```
-
-This will enable you to launch a web builder to edit this file in your web browser rather than trying to edit it manually.
-
-```console
-INFO     [✓] Default parameters match schema validation
-INFO     [✓] Pipeline schema looks valid (found 18 params)
-✨ Found 'params.skip_trim' in the pipeline config, but not in the schema. Add to pipeline schema? [y/n]: y
-INFO     Writing schema with 19 params: 'nextflow_schema.json'
-🚀 Launch web builder for customization and editing? [y/n]: y
-```
-
-Using the web builder you can add details about your new parameters.
-
-The parameters that you have added to your pipeline will appear at the bottom of the web builder page.
-Some information about these parameters will be automatically filled in based on the default value from your `nextflow.config`. You will be able to categorize your new parameters into a group, add icons, and add descriptions for each.
-
-![Pipeline parameters](./img/nf-core/pipeline_schema.png)
-
-!!!note
-
-    Ungrouped parameters in the schema will cause a warning.
-
-Once you have made your edits, you can click `Finished` and all changes will be automatically added to your `nextflow_schema.json` file.
-
-If you rerun the previous command, the warning should disappear:
-
-```console
-nextflow run . -profile test,docker --outdir results --skip_trim
-```
-
-```console title="Output"
-!! Only displaying parameters that differ from the pipeline defaults !!
-------------------------------------------------------
-executor >  local (1)
-[6c/c78d0c] process > MYORG_MYFIRSTPIPELINE:MYFIRSTPIPELINE:MULTIQC [100%] 1 of 1 ✔
--[myorg/myfirstpipeline] Pipeline completed successfully-
-```
-
-#### Takeaway
-
-You added a new parameter to the pipeline. Your pipeline can now run `seqtk`, or the user can decide to skip it. You learned how parameters are handled in nf-core using the JSON schema and how this gives you additional features, such as help text and validation.
-
-#### What's next?
-
-In the next step we will take a look at how we track metadata related to an input file.
-
----
-
-### 2.5 Meta maps
-
-Datasets often contain additional information relevant to the analysis, such as a sample name, information about sequencing protocols, or other conditions needed in the pipeline to process certain samples together, determine their output name, or adjust parameters.
-
-By convention, nf-core tracks this information as `meta` maps. These are `key`-`value` pairs that are passed into modules together with the files. We already saw this briefly when inspecting the `input` for `seqtk`:
-
-```groovy title="modules/nf-core/seqtk/trim/main.nf" linenums="11"
-input:
-tuple val(meta), path(reads)
-```
-
-If we uncomment our earlier `view` statement:
-
-```groovy title="workflows/myfirstpipeline.nf" linenums="28"
-ch_samplesheet.view()
-```
-
-and run the pipeline again, we can see the current content of the `meta` maps:
-
-```console title="meta map"
-[[id:SAMPLE1_PE, single_end:false], ....]
-```
-
-You can add any field that you require to the `meta` map. By default, nf-core modules expect an `id` field.
-
-#### Takeaway
-
-In this section you learned that a `meta` map is used to pass along additional information for a sample in nf-core. It is a `map` (or dictionary) that allows you to assign arbitrary keys to track any information you require in the workflow.
-
-#### What's next?
-
-In the next step we will take a look at how we can add a new key to the `meta` map using the samplesheet.
-
----
-
-### 2.6 Simple Samplesheet adaptations
-
-nf-core pipelines typically use samplesheets as inputs. This allows us to:
-
-- validate each entry and print specific error messages.
-- attach information to each input file.
-- track which datasets are processed.
-
-Samplesheets are comma-separated text files with a header row specifying the column names, followed by one entry per row.
-For example, the samplesheet ([link](https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/samplesheet/samplesheet_test_illumina_amplicon.csv)) that we have been using during this teaching module looks like this:
-
-```csv title="samplesheet_test_illumina_amplicon.csv"
-sample,fastq_1,fastq_2
-SAMPLE1_PE,https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample1_R1.fastq.gz,https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample1_R2.fastq.gz
-SAMPLE2_PE,https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample2_R1.fastq.gz,https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample2_R2.fastq.gz
-SAMPLE3_SE,https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample1_R1.fastq.gz,
-SAMPLE3_SE,https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample2_R1.fastq.gz,
-```
-
-The structure of the samplesheet is specified in its own schema file in `assets/schema_input.json`. Each column has its own entry together with information about the column:
-
-```json title="assets/schema_input.json"
-"properties": {
-    "sample": {
-        "type": "string",
-        "pattern": "^\\S+$",
-        "errorMessage": "Sample name must be provided and cannot contain spaces",
-        "meta": ["id"]
-    },
-    "fastq_1": {
-        "type": "string",
-        "format": "file-path",
-        "exists": true,
-        "pattern": "^\\S+\\.f(ast)?q\\.gz$",
-        "errorMessage": "FastQ file for reads 1 must be provided, cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'"
-    },
-    "fastq_2": {
-        "type": "string",
-        "format": "file-path",
-        "exists": true,
-        "pattern": "^\\S+\\.f(ast)?q\\.gz$",
-        "errorMessage": "FastQ file for reads 2 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'"
-    }
-},
-"required": ["sample", "fastq_1"]
-```
-
-This validates that the samplesheet has at least two columns: `sample` and `fastq_1` (`"required": ["sample", "fastq_1"]`). It also checks that `fastq_1` and `fastq_2` are files, and that the file endings match a particular pattern.
-Lastly, `sample` holds information about the files that we want to attach and pass along the pipeline. nf-core uses `meta` maps for this: objects that have a key and a value. We can indicate this in the schema file directly by using the meta field:
-
-```json title="Sample column"
-    "sample": {
-        "type": "string",
-        "pattern": "^\\S+$",
-        "errorMessage": "Sample name must be provided and cannot contain spaces",
-        "meta": ["id"]
-    },
-```
-
-This sets the key name to `id` and the value to whatever is in the `sample` column, for example `SAMPLE1_PE`:
-
-```console title="meta map"
-[id: SAMPLE1_PE]
-```
-
-By adding a new entry into the JSON schema, we can attach additional meta information that we want to track. This will automatically validate it for us and add it to the meta map.
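-
-Values in the `meta` map are not just labels: anything the schema puts into `meta` can later be used to steer the pipeline. For instance, following the nf-core `ext.prefix` convention, a meta field can shape per-sample output names (a hypothetical sketch):
-
-```groovy title="conf/modules.config"
-// ext.prefix is evaluated per task, so the closure can read that task's meta map
-process {
-    withName: 'SEQTK_TRIM' {
-        ext.prefix = { "${meta.id}_trimmed" }
-    }
-}
-```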
-
-Let's add some new meta information, like the `sequencer`, as an optional column:
-
-```json title="assets/schema_input.json"
-"properties": {
-    "sample": {
-        "type": "string",
-        "pattern": "^\\S+$",
-        "errorMessage": "Sample name must be provided and cannot contain spaces",
-        "meta": ["id"]
-    },
-    "sequencer": {
-        "type": "string",
-        "pattern": "^\\S+$",
-        "meta": ["sequencer"]
-    },
-    "fastq_1": {
-        "type": "string",
-        "format": "file-path",
-        "exists": true,
-        "pattern": "^\\S+\\.f(ast)?q\\.gz$",
-        "errorMessage": "FastQ file for reads 1 must be provided, cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'"
-    },
-    "fastq_2": {
-        "type": "string",
-        "format": "file-path",
-        "exists": true,
-        "pattern": "^\\S+\\.f(ast)?q\\.gz$",
-        "errorMessage": "FastQ file for reads 2 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'"
-    }
-},
-"required": ["sample", "fastq_1"]
-```
-
-We can now run our normal tests with the old samplesheet:
-
-```console
-nextflow run . -profile docker,test --outdir results
-```
-
-The meta map now has a new key `sequencer`, which is empty because we did not specify a value yet:
-
-```console title="output"
-[['id':'SAMPLE1_PE', 'sequencer':[], 'single_end':false], ... ]
-[['id':'SAMPLE2_PE', 'sequencer':[], 'single_end':false], ... ]
-[['id':'SAMPLE3_SE', 'sequencer':[], 'single_end':true], ... ]
-```
-
-We have also prepared a new samplesheet that has the `sequencer` column. You can overwrite the existing input with this command:
-
-```console
-nextflow run . -profile docker,test --outdir results --input ../../data/sequencer_samplesheet.csv
-```
-
-This populates the `sequencer` key, and we can see it in the pipeline when `view`ing the samplesheet channel:
-
-```console title="output"
-[['id':'SAMPLE1_PE', 'sequencer':'sequencer1', 'single_end':false], ... ]
-[['id':'SAMPLE2_PE', 'sequencer':'sequencer2', 'single_end':false], ... ]
-[['id':'SAMPLE3_SE', 'sequencer':'sequencer3', 'single_end':true], ... ]
-```
-
-We can comment out the `ch_samplesheet.view()` line or remove it; we are not going to use it anymore in this training section.
-
-#### 2.6.1 Use the new meta key in the pipeline
-
-We can access this new meta value in the pipeline and use it to, for example, only enable trimming for samples from a particular sequencer. The [branch operator](https://www.nextflow.io/docs/stable/reference/operator.html#branch) lets us split an input channel into several new output channels based on selection criteria. Let's add this within the `if` block:
-
-```groovy title="workflows/myfirstpipeline.nf" linenums="31"
-    if (!params.skip_trim) {
-
-        ch_seqtk_in = ch_samplesheet.branch { meta, reads ->
-            to_trim: meta["sequencer"] == "sequencer2"
-            other: true
-        }
-
-        SEQTK_TRIM (
-            ch_seqtk_in.to_trim
-        )
-        ch_trimmed  = SEQTK_TRIM.out.reads
-        ch_versions = ch_versions.mix(SEQTK_TRIM.out.versions.first())
-    }
-```
-
-If we now rerun our default test, no reads are trimmed (even though we did not specify `--skip_trim`):
-
-```console title="Output"
-nextflow run . -profile docker,test --outdir results
-
-[-        ] process > MYORG_MYFIRSTPIPELINE:MYFIRSTPIPELINE:SEQTK_TRIM -
-[5a/f580bc] process > MYORG_MYFIRSTPIPELINE:MYFIRSTPIPELINE:MULTIQC    [100%] 1 of 1 ✔
-```
-
-If we use the samplesheet with the `sequencer` set, only one sample will be trimmed:
-
-```console
-nextflow run . -profile docker,test --outdir results --input ../../data/sequencer_samplesheet.csv -resume
-```
-
-```console title="Output"
-[47/fdf9de] process > MYORG_MYFIRSTPIPELINE:MYFIRSTPIPELINE:SEQTK_TRIM (SAMPLE2_PE) [100%] 1 of 1 ✔
-[2a/a742ae] process > MYORG_MYFIRSTPIPELINE:MYFIRSTPIPELINE:MULTIQC               [100%] 1 of 1 ✔
-```
-
-If you want to learn more about how to fine-tune and develop the samplesheet schema further, visit [nf-schema](https://nextflow-io.github.io/nf-schema/2.2/nextflow_schema/sample_sheet_schema_specification/).
-
-#### Takeaway
-
-You explored how different samplesheets can provide different sets of additional information to your data files. You know how to adapt the samplesheet validation and how this is reflected in the pipeline in the `meta` map.
-
-#### What's next?
-
-In the next step we will add a module that is not yet in nf-core.
-
----
-
-### 2.7 Create a custom module for your pipeline
-
-nf-core offers a comprehensive set of modules that have been created and curated by the community. However, as a developer, you may be interested in bespoke pieces of software that are not a part of the nf-core repository, or in customizing a module that already exists.
-
-In this instance, we will write a local module for the QC tool [FastQE](https://fastqe.com/), which computes stats for FASTQ files and prints those stats as emoji.
-
-This section should feel similar to the `hello_modules` section.
-
-#### 2.7.1 Create the module
-
-!!! note "New module contributions are always welcome and encouraged!"
-
-    If you have a module that you would like to contribute back to the community, reach out on the nf-core slack or open a pull request to the modules repository.
-
-Start by using the nf-core tooling to create a skeleton local module:
-
-```console
-nf-core modules create
-```
-
-It will ask you to enter the tool name and some configurations for the module. We will use the defaults here:
-
-- Specify the tool name: `Name of tool/subtool: fastqe`
-- Add the author name: `GitHub Username: (@):`
-- Accept the defaults for the remaining prompts by typing `enter`
-
-This will create a new file in `modules/local/fastqe/main.nf` that already contains the container and conda definitions, the general structure of the process, and a number of TODO statements to guide you through the adaptation.
-
-!!! warning
-
-    If the module already exists locally, the command will fail to prevent you from accidentally overwriting existing work:
-
-    ```console
-    INFO     Repository type: pipeline
-    INFO     Press enter to use default values (shown in brackets) or type your own responses. ctrl+click underlined text to open links.
-    CRITICAL Module directory exists: 'modules/local/fastqe'. Use '--force' to overwrite
-    ```
-
-Let's open the module file: `modules/local/fastqe/main.nf`.
-
-You will notice that it still calls `samtools` and that the inputs are `bam` files.
-
-From our sample sheet, we know we have FASTQ files instead, so let's change the input definition accordingly:
-
-```groovy title="modules/local/fastqe/main.nf" linenums="38"
-tuple val(meta), path(reads)
-```
-
-The output of this tool is a TSV file with the emoji annotation, so let's adapt the output as well:
-
-```groovy title="modules/local/fastqe/main.nf" linenums="42"
-tuple val(meta), path("*.tsv"), emit: tsv
-```
-
-The script section still calls `samtools`.
Let's change this to the proper call of the tool: - -```groovy title="modules/local/fastqe/main.nf" linenums="62" - fastqe \\ - $args \\ - $reads \\ - --output ${prefix}.tsv -``` - -And at last, we need to adapt the version retrieval. This tool does not have a version command, so we will add the release number manually: - -```groovy title="modules/local/fastqe/main.nf" linenums="52" - def VERSION = '0.3.3' -``` - -and write it to a file in the script section: - -```groovy title="modules/local/fastqe/main.nf" linenums="70" - fastqe: $VERSION -``` - -We will not cover [`stubs`](https://www.nextflow.io/docs/latest/process.html#stub) in this training. They are not necessary to run a module, so let's remove them for now: - -```groovy title="modules/local/fastqe/main.nf" linenums="74" -stub: - def args = task.ext.args ?: '' - def prefix = task.ext.prefix ?: "${meta.id}" - // TODO nf-core: A stub section should mimic the execution of the original module as best as possible - // Have a look at the following examples: - // Simple example: https://github.com/nf-core/modules/blob/818474a292b4860ae8ff88e149fbcda68814114d/modules/nf-core/bcftools/annotate/main.nf#L47-L63 - // Complex example: https://github.com/nf-core/modules/blob/818474a292b4860ae8ff88e149fbcda68814114d/modules/nf-core/bedtools/split/main.nf#L38-L54 - """ - touch ${prefix}.bam - - cat <<-END_VERSIONS > versions.yml - "${task.process}": - fastqe: \$(samtools --version |& sed '1!d ; s/samtools //') - END_VERSIONS - """ -``` - -If you think this looks a bit messy and just want to add a complete final version, here's one we made earlier and we've removed all the commented out instructions: - -```groovy title="modules/local/fastqe/main.nf" linenums="1" -process FASTQE { - tag "$meta.id" - label 'process_single' - - conda "${moduleDir}/environment.yml" - container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/fastqe:0.3.3--pyhdfd78af_0': - 'biocontainers/fastqe:0.3.3--pyhdfd78af_0' }" - - input: - tuple val(meta), path(reads) - - output: - tuple val(meta), path("*.tsv"), emit: tsv - path "versions.yml" , emit: versions - - when: - task.ext.when == null || task.ext.when - - script: - def args = task.ext.args ?: '' - def prefix = task.ext.prefix ?: "${meta.id}" - def VERSION = '0.3.3' - """ - fastqe \\ - $args \\ - $reads \\ - --output ${prefix}.tsv - - cat <<-END_VERSIONS > versions.yml - "${task.process}": - fastqe: $VERSION - END_VERSIONS - """ -} -``` - -#### 2.7.2 Include the module into the pipeline - -The module is now ready in your `modules/local` folder, but not yet included in your pipeline. 
Similar to `seqtk/trim` we need to add it to `workflows/myfirstpipeline.nf`: - -_Before:_ - -```groovy title="workflows/myfirstpipeline.nf" linenums="1" -/* -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - IMPORT MODULES / SUBWORKFLOWS / FUNCTIONS -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -*/ -include { SEQTK_TRIM } from '../modules/nf-core/seqtk/trim/main' -include { MULTIQC } from '../modules/nf-core/multiqc/main' -``` - -_After:_ - -```groovy title="workflows/myfirstpipeline.nf" linenums="1" -/* -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - IMPORT MODULES / SUBWORKFLOWS / FUNCTIONS -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -*/ -include { FASTQE } from '../modules/local/fastqe' -include { SEQTK_TRIM } from '../modules/nf-core/seqtk/trim/main' -include { MULTIQC } from '../modules/nf-core/multiqc/main' -``` - -and call it on our input data: - -```groovy title="workflows/myfirstpipeline.nf" linenums="45" - FASTQE(ch_samplesheet) - ch_versions = ch_versions.mix(FASTQE.out.versions.first()) -``` - -Let's run the pipeline again: - -```console -nextflow run . -profile docker,test --outdir results -``` - -In the results folder, you should now see a new subdirectory `fastqe/`, with the mean read qualities: - -```console title="SAMPLE1_PE.tsv" -Filename Statistic Qualities -sample1_R1.fastq.gz mean 😝 😝 😝 😝 😝 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😉 😉 😜 😜 😜 😉 😉 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😁 😉 😛 😜 😉 😉 😉 😉 😜 😜 😉 😉 😉 😉 😉 😁 😁 😁 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😜 😉 😉 😉 😉 😉 😜 😜 😜 😜 😜 😜 😜 😜 😜 😜 😜 😜 😜 😜 😛 😜 😜 😛 😛 😛 😚 -sample1_R2.fastq.gz mean 😌 😌 😌 😝 😝 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😜 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😉 😜 😉 😉 😜 😜 😉 😜 😜 😜 😜 😜 😜 😜 😜 😜 😜 😜 😜 😛 😜 😜 😜 😛 😜 😜 😜 😜 😛 😜 😛 😛 😛 😛 😛 😛 😛 😛 😛 😛 😛 😛 😝 😛 😝 😝 😝 😝 😝 😝 😝 😝 😝 😝 😝 😝 😝 😝 😌 😌 😌 😌 😌 😌 😌 😌 😌 😌 😌 😌 😌 😌 😌 😌 😌 😌 😌 😋 😋 😋 😋 😋 😋 😋 😋 😀 -``` - -#### Takeaway - -You added a new local module to the pipeline. We touched on how the module template files in nf-core look like and which aspects you need to adapt to add your own tool. - ---- - -## Summary - -In this side-quest you got an introduction to nf-core. You've learned: - -- Section 1: How to run nf-core pipelines - - 1. Where to find information about nf-core pipelines - 2. How to run a nf-core pipelines - -- Section 2: How to create an nf-core pipelines: - - 1. About nf-core tooling - 2. About the nf-core template: - - - How to create a basic nf-core pipeline - - What files are in the template - - 3. About nf-core/modules: - - - How to find one - - How to install it - - How to configure it in the `modules.config` - - 4. 
---- - -## Summary - -In this side-quest you got an introduction to nf-core. You've learned: - -- Section 1: How to run nf-core pipelines - - 1. Where to find information about nf-core pipelines - 2. How to run an nf-core pipeline - -- Section 2: How to create an nf-core pipeline: - - 1. About nf-core tooling - 2. About the nf-core template: - - - How to create a basic nf-core pipeline - - What files are in the template - - 3. About nf-core/modules: - - - How to find one - - How to install it - - How to configure it in the `modules.config` - - 4. About parameters: - - - Where to add one in the workflow code - - How to set a default in the `nextflow.config` - - How to validate the parameter using the `nextflow_schema.json` - - 5. About `meta` maps: - - - What a `meta` map is - - How to access information from it - - How to add new fields in the `assets/schema_input.json` - - How to add a column in the samplesheet to track additional `meta` information - - 6. About developing a local module: - - - How to create a module skeleton file using nf-core tooling - - How to adapt the skeleton file - - How to include the module in the pipeline - -### What's next? - -Check out the [nf-core documentation](https://nf-co.re) to learn more. You can join the [nf-core community Slack](https://nf-co.re/join#slack), where most of the discussion happens. You might want to: - -- Get involved in the development of an nf-core pipeline -- Contribute nf-core components -- Contribute a pipeline to nf-core (before you do, check their [guidelines](https://nf-co.re/docs/guidelines/pipelines/overview#ask-the-community)) -- Start developing your own nf-core style pipeline diff --git a/hello-nf-core/solutions/core-hello-part2/.nf-core.yml b/hello-nf-core/solutions/core-hello-part2/.nf-core.yml new file mode 100644 index 0000000000..1a638d8288 --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part2/.nf-core.yml @@ -0,0 +1,106 @@ +repository_type: pipeline + +nf_core_version: 3.4.1 + +lint: + files_unchanged: + - .github/CONTRIBUTING.md + - .prettierignore + - .prettierignore + - .prettierignore + - CODE_OF_CONDUCT.md + - assets/nf-core-hello_logo_light.png + - docs/images/nf-core-hello_logo_light.png + - docs/images/nf-core-hello_logo_dark.png + - .github/ISSUE_TEMPLATE/bug_report.yml + - .github/CONTRIBUTING.md + - .github/PULL_REQUEST_TEMPLATE.md + - assets/email_template.txt + - docs/README.md + - .github/ISSUE_TEMPLATE/bug_report.yml + - .github/ISSUE_TEMPLATE/config.yml + - .github/ISSUE_TEMPLATE/feature_request.yml + - .github/PULL_REQUEST_TEMPLATE.md + - .github/workflows/branch.yml + - .github/workflows/linting_comment.yml + - .github/workflows/linting.yml + - .github/CONTRIBUTING.md + - .github/.dockstore.yml + - .github/CONTRIBUTING.md + - assets/sendmail_template.txt + - .prettierignore + - LICENSE + nextflow_config: + - manifest.name + - manifest.homePage + nf_test_content: false + multiqc_config: false + files_exist: + - .github/workflows/nf-test.yml + - .github/actions/get-shards/action.yml + - .github/actions/nf-test/action.yml + - nf-test.config + - tests/default.nf.test + - assets/email_template.html + - assets/sendmail_template.txt + - assets/email_template.txt + - CODE_OF_CONDUCT.md + - assets/nf-core-hello_logo_light.png + - docs/images/nf-core-hello_logo_light.png + - docs/images/nf-core-hello_logo_dark.png + - .github/ISSUE_TEMPLATE/config.yml + - .github/workflows/awstest.yml + - .github/workflows/awsfulltest.yml + - .github/ISSUE_TEMPLATE/bug_report.yml + - .github/ISSUE_TEMPLATE/feature_request.yml + - .github/PULL_REQUEST_TEMPLATE.md + - .github/CONTRIBUTING.md + - .github/.dockstore.yml + - CHANGELOG.md + - assets/multiqc_config.yml + - .github/workflows/branch.yml + - .github/workflows/nf-test.yml + - .github/actions/get-shards/action.yml + - .github/actions/nf-test/action.yml + - .github/workflows/linting_comment.yml + - .github/workflows/linting.yml + - .prettierignore + - .prettierrc.yml + - conf/igenomes.config + - conf/igenomes_ignored.config + - CITATIONS.md + - LICENSE + readme: + - nextflow_badge + - nextflow_badge + - 
nfcore_template_badge + +template: + org: core + name: hello + description: A basic nf-core style version of Hello Nextflow + author: pinin4fjords + version: 1.0.0dev + force: true + outdir: . + skip_features: + - github + - github_badges + - changelog + - license + - ci + - nf-test + - igenomes + - multiqc + - fastqc + - seqera_platform + - gpu + - codespaces + - vscode + - code_linters + - citations + - rocrate + - email + - adaptivecard + - slackreport + is_nfcore: false diff --git a/hello-nf-core/solutions/core-hello-part2/README.md b/hello-nf-core/solutions/core-hello-part2/README.md index 0a533c4c4b..94844f6c5e 100644 --- a/hello-nf-core/solutions/core-hello-part2/README.md +++ b/hello-nf-core/solutions/core-hello-part2/README.md @@ -11,7 +11,7 @@ --> + workflows use the "tube map" design for that. See https://nf-co.re/docs/guidelines/graphic_design/workflow_diagrams#examples for examples. --> ## Usage @@ -51,7 +51,7 @@ nextflow run core/hello \ ## Credits -core/hello was originally written by GG. +core/hello was originally written by pinin4fjords. We thank the following people for their extensive assistance in the development of this pipeline: diff --git a/hello-nf-core/solutions/core-hello-part2/assets/schema_input.json b/hello-nf-core/solutions/core-hello-part2/assets/schema_input.json index bc0261f329..5cb7458161 100644 --- a/hello-nf-core/solutions/core-hello-part2/assets/schema_input.json +++ b/hello-nf-core/solutions/core-hello-part2/assets/schema_input.json @@ -17,14 +17,14 @@ "type": "string", "format": "file-path", "exists": true, - "pattern": "^\\S+\\.f(ast)?q\\.gz$", + "pattern": "^([\\S\\s]*\\/)?[^\\s\\/]+\\.f(ast)?q\\.gz$", "errorMessage": "FastQ file for reads 1 must be provided, cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'" }, "fastq_2": { "type": "string", "format": "file-path", "exists": true, - "pattern": "^\\S+\\.f(ast)?q\\.gz$", + "pattern": "^([\\S\\s]*\\/)?[^\\s\\/]+\\.f(ast)?q\\.gz$", "errorMessage": "FastQ file for reads 2 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'" } }, diff --git a/hello-nf-core/solutions/core-hello-part2/conf/base.config b/hello-nf-core/solutions/core-hello-part2/conf/base.config index 1abcd9876f..e0fe40762f 100644 --- a/hello-nf-core/solutions/core-hello-part2/conf/base.config +++ b/hello-nf-core/solutions/core-hello-part2/conf/base.config @@ -15,7 +15,7 @@ process { memory = { 6.GB * task.attempt } time = { 4.h * task.attempt } - errorStrategy = { task.exitStatus in ((130..145) + 104) ? 'retry' : 'finish' } + errorStrategy = { task.exitStatus in ((130..145) + 104 + 175) ? 'retry' : 'finish' } maxRetries = 1 maxErrors = '-1' @@ -59,4 +59,8 @@ process { errorStrategy = 'retry' maxRetries = 2 } + withLabel: process_gpu { + ext.use_gpu = { workflow.profile.contains('gpu') } + accelerator = { workflow.profile.contains('gpu') ? 
1 : null } + } } diff --git a/hello-nf-core/solutions/core-hello-part2/conf/test.config b/hello-nf-core/solutions/core-hello-part2/conf/test.config index f82761298d..13ecf2ad4b 100644 --- a/hello-nf-core/solutions/core-hello-part2/conf/test.config +++ b/hello-nf-core/solutions/core-hello-part2/conf/test.config @@ -12,8 +12,9 @@ process { resourceLimits = [ - cpus: 1, - memory: '1.GB' + cpus: 2, + memory: '4.GB', + time: '1.h' ] } diff --git a/hello-nf-core/solutions/core-hello-part2/docs/usage.md b/hello-nf-core/solutions/core-hello-part2/docs/usage.md index 78b55f9afe..bfbc37ab42 100644 --- a/hello-nf-core/solutions/core-hello-part2/docs/usage.md +++ b/hello-nf-core/solutions/core-hello-part2/docs/usage.md @@ -146,7 +146,7 @@ If `-profile` is not specified, the pipeline will run locally and expect all sof - `shifter` - A generic configuration profile to be used with [Shifter](https://nersc.gitlab.io/development/shifter/how-to-use/) - `charliecloud` - - A generic configuration profile to be used with [Charliecloud](https://hpc.github.io/charliecloud/) + - A generic configuration profile to be used with [Charliecloud](https://charliecloud.io/) - `apptainer` - A generic configuration profile to be used with [Apptainer](https://apptainer.org/) - `wave` diff --git a/hello-nf-core/solutions/core-hello-part2/main.nf b/hello-nf-core/solutions/core-hello-part2/main.nf index f72a236660..eb8d91361f 100644 --- a/hello-nf-core/solutions/core-hello-part2/main.nf +++ b/hello-nf-core/solutions/core-hello-part2/main.nf @@ -57,7 +57,10 @@ workflow { params.monochrome_logs, args, params.outdir, - params.input + params.input, + params.help, + params.help_full, + params.show_hidden ) // diff --git a/hello-nf-core/solutions/core-hello-part2/modules.json b/hello-nf-core/solutions/core-hello-part2/modules.json index e36947ce00..8ca44d9da5 100644 --- a/hello-nf-core/solutions/core-hello-part2/modules.json +++ b/hello-nf-core/solutions/core-hello-part2/modules.json @@ -10,17 +10,17 @@ "nf-core": { "utils_nextflow_pipeline": { "branch": "master", - "git_sha": "c2b22d85f30a706a3073387f30380704fcae013b", + "git_sha": "05954dab2ff481bcb999f24455da29a5828af08d", "installed_by": ["subworkflows"] }, "utils_nfcore_pipeline": { "branch": "master", - "git_sha": "51ae5406a030d4da1e49e4dab49756844fdd6c7a", + "git_sha": "05954dab2ff481bcb999f24455da29a5828af08d", "installed_by": ["subworkflows"] }, "utils_nfschema_plugin": { "branch": "master", - "git_sha": "2fd2cd6d0e7b273747f32e465fdc6bcc3ae0814e", + "git_sha": "4b406a74dc0449c0401ed87d5bfff4252fd277fd", "installed_by": ["subworkflows"] } } diff --git a/hello-nf-core/solutions/core-hello-part2/modules/local/collectGreetings.nf b/hello-nf-core/solutions/core-hello-part2/modules/local/collectGreetings.nf index 0274fec86f..849bba4b6e 100644 --- a/hello-nf-core/solutions/core-hello-part2/modules/local/collectGreetings.nf +++ b/hello-nf-core/solutions/core-hello-part2/modules/local/collectGreetings.nf @@ -11,8 +11,10 @@ process collectGreetings { output: path "COLLECTED-${batch_name}-output.txt" , emit: outfile + val count_greetings , emit: count script: + count_greetings = input_files.size() """ cat ${input_files} > 'COLLECTED-${batch_name}-output.txt' """ diff --git a/hello-nf-core/solutions/core-hello-part2/nextflow.config b/hello-nf-core/solutions/core-hello-part2/nextflow.config index d633adb989..b59ba2175d 100644 --- a/hello-nf-core/solutions/core-hello-part2/nextflow.config +++ b/hello-nf-core/solutions/core-hello-part2/nextflow.config @@ -22,7 +22,9 @@ params { 
show_hidden = false version = false pipelines_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/' - trace_report_suffix = new java.util.Date().format( 'yyyy-MM-dd_HH-mm-ss')// Config options + trace_report_suffix = new java.util.Date().format( 'yyyy-MM-dd_HH-mm-ss') + + // Config options config_profile_name = null config_profile_description = null @@ -75,7 +77,18 @@ profiles { apptainer.enabled = false docker.runOptions = '-u $(id -u):$(id -g)' } - arm { + arm64 { + process.arch = 'arm64' + // TODO https://github.com/nf-core/modules/issues/6694 + // For now if you're using arm64 you have to use wave for the sake of the maintainers + // wave profile + apptainer.ociAutoPull = true + singularity.ociAutoPull = true + wave.enabled = true + wave.freeze = true + wave.strategy = 'conda,container' + } + emulate_amd64 { docker.runOptions = '-u $(id -u):$(id -g) --platform=linux/amd64' } singularity { @@ -132,16 +145,24 @@ profiles { wave.freeze = true wave.strategy = 'conda,container' } + gpu { + docker.runOptions = '-u $(id -u):$(id -g) --gpus all' + apptainer.runOptions = '--nv' + singularity.runOptions = '--nv' + } test { includeConfig 'conf/test.config' } test_full { includeConfig 'conf/test_full.config' } } +// Load nf-core custom profiles from different institutions + +// If params.custom_config_base is set AND either the NXF_OFFLINE environment variable is not set or params.custom_config_base is a local path, the nfcore_custom.config file from the specified base path is included. +// Load core/hello custom profiles from different institutions. +includeConfig params.custom_config_base && (!System.getenv('NXF_OFFLINE') || !params.custom_config_base.startsWith('http')) ? "${params.custom_config_base}/nfcore_custom.config" : "/dev/null" -// Load nf-core custom profiles from different Institutions -includeConfig !System.getenv('NXF_OFFLINE') && params.custom_config_base ? "${params.custom_config_base}/nfcore_custom.config" : "/dev/null" // Load core/hello custom profiles from different institutions. // TODO nf-core: Optionally, you can add a pipeline-specific nf-core config at https://github.com/nf-core/configs -// includeConfig !System.getenv('NXF_OFFLINE') && params.custom_config_base ? "${params.custom_config_base}/pipeline/hello.config" : "/dev/null" +// includeConfig params.custom_config_base && (!System.getenv('NXF_OFFLINE') || !params.custom_config_base.startsWith('http')) ? "${params.custom_config_base}/pipeline/hello.config" : "/dev/null" // Set default registry for Apptainer, Docker, Podman, Charliecloud and Singularity independent of -profile // Will not be used unless Apptainer / Docker / Podman / Charliecloud / Singularity are enabled @@ -197,11 +218,10 @@ dag { manifest { name = 'core/hello' - author = """GG""" // The author field is deprecated from Nextflow version 24.10.0, use contributors instead contributors = [ // TODO nf-core: Update the field with the details of the contributors to your pipeline. 
New with Nextflow version 24.10.0 [ - name: 'GG', + name: 'pinin4fjords', affiliation: '', email: '', github: '', @@ -210,28 +230,22 @@ manifest { ], ] homePage = 'https://github.com/core/hello' - description = """basic nf-core style version of Hello Nextflow""" + description = """A basic nf-core style version of Hello Nextflow""" mainScript = 'main.nf' defaultBranch = 'main' - nextflowVersion = '!>=24.04.2' + nextflowVersion = '!>=25.04.0' version = '1.0.0dev' doi = '' } // Nextflow plugins plugins { - id 'nf-schema@2.2.0' // Validation of pipeline parameters and creation of an input channel from a sample sheet + id 'nf-schema@2.5.1' // Validation of pipeline parameters and creation of an input channel from a sample sheet } validation { defaultIgnoreParams = ["genomes"] monochromeLogs = params.monochrome_logs - help { - enabled = true - command = "nextflow run core/hello -profile --input samplesheet.csv --outdir " - fullParameter = "help_full" - showHiddenParameter = "show_hidden" - } } // Load modules.config for DSL2 module specific options diff --git a/hello-nf-core/solutions/core-hello-part2/nextflow_schema.json b/hello-nf-core/solutions/core-hello-part2/nextflow_schema.json index 5ee5ec357f..fc18ba7998 100644 --- a/hello-nf-core/solutions/core-hello-part2/nextflow_schema.json +++ b/hello-nf-core/solutions/core-hello-part2/nextflow_schema.json @@ -2,7 +2,7 @@ "$schema": "https://json-schema.org/draft/2020-12/schema", "$id": "https://raw.githubusercontent.com/core/hello/main/nextflow_schema.json", "title": "core/hello pipeline parameters", - "description": "basic nf-core style version of Hello Nextflow", + "description": "A basic nf-core style version of Hello Nextflow", "type": "object", "$defs": { "input_output_options": { @@ -133,6 +133,18 @@ "fa_icon": "far calendar", "description": "Suffix to add to the trace report filename. Default is the date and time in the format yyyy-MM-dd_HH-mm-ss.", "hidden": true + }, + "help": { + "type": ["boolean", "string"], + "description": "Display the help message." + }, + "help_full": { + "type": "boolean", + "description": "Display the full detailed help message." + }, + "show_hidden": { + "type": "boolean", + "description": "Display hidden parameters in the help message (only works when --help or --help_full are provided)." 
} } } diff --git a/hello-nf-core/solutions/core-hello-part2/subworkflows/local/utils_nfcore_hello_pipeline/main.nf b/hello-nf-core/solutions/core-hello-part2/subworkflows/local/utils_nfcore_hello_pipeline/main.nf index 53ba38fae8..93c9f874cc 100644 --- a/hello-nf-core/solutions/core-hello-part2/subworkflows/local/utils_nfcore_hello_pipeline/main.nf +++ b/hello-nf-core/solutions/core-hello-part2/subworkflows/local/utils_nfcore_hello_pipeline/main.nf @@ -11,6 +11,7 @@ include { UTILS_NFSCHEMA_PLUGIN } from '../../nf-core/utils_nfschema_plugin' include { paramsSummaryMap } from 'plugin/nf-schema' include { samplesheetToList } from 'plugin/nf-schema' +include { paramsHelp } from 'plugin/nf-schema' include { completionSummary } from '../../nf-core/utils_nfcore_pipeline' include { UTILS_NFCORE_PIPELINE } from '../../nf-core/utils_nfcore_pipeline' include { UTILS_NEXTFLOW_PIPELINE } from '../../nf-core/utils_nextflow_pipeline' @@ -30,6 +31,9 @@ workflow PIPELINE_INITIALISATION { nextflow_cli_args // array: List of positional nextflow CLI args outdir // string: The output directory where the results will be saved input // string: Path to input samplesheet + help // boolean: Display help message and exit + help_full // boolean: Show the full help message + show_hidden // boolean: Show hidden parameters in the help message main: @@ -48,10 +52,18 @@ workflow PIPELINE_INITIALISATION { // // Validate parameters and generate parameter summary to stdout // + command = "nextflow run ${workflow.manifest.name} -profile --input samplesheet.csv --outdir " + UTILS_NFSCHEMA_PLUGIN ( workflow, validate_params, - null + null, + help, + help_full, + show_hidden, + "", + "", + command ) // @@ -64,9 +76,10 @@ workflow PIPELINE_INITIALISATION { // // Create channel from input file provided through params.input // - ch_samplesheet = Channel.fromPath(params.input) - .splitCsv() - .map { line -> line[0] } + + ch_samplesheet = channel.fromPath(params.input) + .splitCsv() + .map { line -> line[0] } emit: samplesheet = ch_samplesheet diff --git a/hello-nf-core/solutions/core-hello-part2/subworkflows/nf-core/utils_nextflow_pipeline/tests/tags.yml b/hello-nf-core/solutions/core-hello-part2/subworkflows/nf-core/utils_nextflow_pipeline/tests/tags.yml deleted file mode 100644 index f84761125a..0000000000 --- a/hello-nf-core/solutions/core-hello-part2/subworkflows/nf-core/utils_nextflow_pipeline/tests/tags.yml +++ /dev/null @@ -1,2 +0,0 @@ -subworkflows/utils_nextflow_pipeline: - - subworkflows/nf-core/utils_nextflow_pipeline/** diff --git a/hello-nf-core/solutions/core-hello-part2/subworkflows/nf-core/utils_nfcore_pipeline/tests/tags.yml b/hello-nf-core/solutions/core-hello-part2/subworkflows/nf-core/utils_nfcore_pipeline/tests/tags.yml deleted file mode 100644 index ac8523c9a2..0000000000 --- a/hello-nf-core/solutions/core-hello-part2/subworkflows/nf-core/utils_nfcore_pipeline/tests/tags.yml +++ /dev/null @@ -1,2 +0,0 @@ -subworkflows/utils_nfcore_pipeline: - - subworkflows/nf-core/utils_nfcore_pipeline/** diff --git a/hello-nf-core/solutions/core-hello-part2/subworkflows/nf-core/utils_nfschema_plugin/main.nf b/hello-nf-core/solutions/core-hello-part2/subworkflows/nf-core/utils_nfschema_plugin/main.nf index 93de2a5245..acb3972419 100644 --- a/hello-nf-core/solutions/core-hello-part2/subworkflows/nf-core/utils_nfschema_plugin/main.nf +++ b/hello-nf-core/solutions/core-hello-part2/subworkflows/nf-core/utils_nfschema_plugin/main.nf @@ -4,6 +4,7 @@ include { paramsSummaryLog } from 'plugin/nf-schema' include { validateParameters } 
from 'plugin/nf-schema' +include { paramsHelp } from 'plugin/nf-schema' workflow UTILS_NFSCHEMA_PLUGIN { @@ -15,29 +16,56 @@ workflow UTILS_NFSCHEMA_PLUGIN { // when this input is empty it will automatically use the configured schema or // "${projectDir}/nextflow_schema.json" as default. This input should not be empty // for meta pipelines + help // boolean: show help message + help_full // boolean: show full help message + show_hidden // boolean: show hidden parameters in help message + before_text // string: text to show before the help message and parameters summary + after_text // string: text to show after the help message and parameters summary + command // string: an example command of the pipeline main: + if(help || help_full) { + help_options = [ + beforeText: before_text, + afterText: after_text, + command: command, + showHidden: show_hidden, + fullHelp: help_full, + ] + if(parameters_schema) { + help_options << [parametersSchema: parameters_schema] + } + log.info paramsHelp( + help_options, + params.help instanceof String ? params.help : "", + ) + exit 0 + } + // // Print parameter summary to stdout. This will display the parameters // that differ from the default given in the JSON schema // + + summary_options = [:] if(parameters_schema) { - log.info paramsSummaryLog(input_workflow, parameters_schema:parameters_schema) - } else { - log.info paramsSummaryLog(input_workflow) + summary_options << [parametersSchema: parameters_schema] } + log.info before_text + log.info paramsSummaryLog(summary_options, input_workflow) + log.info after_text // // Validate the parameters using nextflow_schema.json or the schema // given via the validation.parametersSchema configuration option // if(validate_params) { + validateOptions = [:] if(parameters_schema) { - validateParameters(parameters_schema:parameters_schema) - } else { - validateParameters() + validateOptions << [parametersSchema: parameters_schema] } + validateParameters(validateOptions) } emit: diff --git a/hello-nf-core/solutions/core-hello-part2/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test b/hello-nf-core/solutions/core-hello-part2/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test index 8fb3016487..c977917aac 100644 --- a/hello-nf-core/solutions/core-hello-part2/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test +++ b/hello-nf-core/solutions/core-hello-part2/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test @@ -25,6 +25,12 @@ nextflow_workflow { input[0] = workflow input[1] = validate_params input[2] = "" + input[3] = false + input[4] = false + input[5] = false + input[6] = "" + input[7] = "" + input[8] = "" """ } } @@ -51,6 +57,12 @@ nextflow_workflow { input[0] = workflow input[1] = validate_params input[2] = "" + input[3] = false + input[4] = false + input[5] = false + input[6] = "" + input[7] = "" + input[8] = "" """ } } @@ -77,6 +89,12 @@ nextflow_workflow { input[0] = workflow input[1] = validate_params input[2] = "${projectDir}/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json" + input[3] = false + input[4] = false + input[5] = false + input[6] = "" + input[7] = "" + input[8] = "" """ } } @@ -103,6 +121,12 @@ nextflow_workflow { input[0] = workflow input[1] = validate_params input[2] = "${projectDir}/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json" + input[3] = false + input[4] = false + input[5] = false + input[6] = "" + input[7] = "" + input[8] = "" """ } } @@ -114,4 +138,36 @@ nextflow_workflow { ) } } + + test("Should create a 
help message") { + + when { + + params { + test_data = '' + outdir = null + } + + workflow { + """ + validate_params = true + input[0] = workflow + input[1] = validate_params + input[2] = "${projectDir}/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json" + input[3] = true + input[4] = false + input[5] = false + input[6] = "Before" + input[7] = "After" + input[8] = "nextflow run test/test" + """ + } + } + + then { + assertAll( + { assert workflow.success } + ) + } + } } diff --git a/hello-nf-core/solutions/core-hello-part2/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow.config b/hello-nf-core/solutions/core-hello-part2/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow.config index 478fb8a05f..8d8c73718a 100644 --- a/hello-nf-core/solutions/core-hello-part2/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow.config +++ b/hello-nf-core/solutions/core-hello-part2/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow.config @@ -1,5 +1,5 @@ plugins { - id "nf-schema@2.1.0" + id "nf-schema@2.5.1" } validation { diff --git a/hello-nf-core/solutions/core-hello-part2/workflows/hello.nf b/hello-nf-core/solutions/core-hello-part2/workflows/hello.nf index 3e3af1c2a7..7810b24ae6 100644 --- a/hello-nf-core/solutions/core-hello-part2/workflows/hello.nf +++ b/hello-nf-core/solutions/core-hello-part2/workflows/hello.nf @@ -20,8 +20,11 @@ workflow HELLO { take: ch_samplesheet // channel: samplesheet read in from --input + main: + ch_versions = Channel.empty() + // emit a greeting sayHello(ch_samplesheet) @@ -34,8 +37,6 @@ workflow HELLO { // generate ASCII art of the greetings with cowpy cowpy(collectGreetings.out.outfile, params.character) - ch_versions = Channel.empty() - // // Collate and save software versions // diff --git a/hello-nf-core/solutions/core-hello-part3/.nf-core.yml b/hello-nf-core/solutions/core-hello-part3/.nf-core.yml new file mode 100644 index 0000000000..1a638d8288 --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part3/.nf-core.yml @@ -0,0 +1,106 @@ +repository_type: pipeline + +nf_core_version: 3.4.1 + +lint: + files_unchanged: + - .github/CONTRIBUTING.md + - .prettierignore + - .prettierignore + - .prettierignore + - CODE_OF_CONDUCT.md + - assets/nf-core-hello_logo_light.png + - docs/images/nf-core-hello_logo_light.png + - docs/images/nf-core-hello_logo_dark.png + - .github/ISSUE_TEMPLATE/bug_report.yml + - .github/CONTRIBUTING.md + - .github/PULL_REQUEST_TEMPLATE.md + - assets/email_template.txt + - docs/README.md + - .github/ISSUE_TEMPLATE/bug_report.yml + - .github/ISSUE_TEMPLATE/config.yml + - .github/ISSUE_TEMPLATE/feature_request.yml + - .github/PULL_REQUEST_TEMPLATE.md + - .github/workflows/branch.yml + - .github/workflows/linting_comment.yml + - .github/workflows/linting.yml + - .github/CONTRIBUTING.md + - .github/.dockstore.yml + - .github/CONTRIBUTING.md + - assets/sendmail_template.txt + - .prettierignore + - LICENSE + nextflow_config: + - manifest.name + - manifest.homePage + nf_test_content: false + multiqc_config: false + files_exist: + - .github/workflows/nf-test.yml + - .github/actions/get-shards/action.yml + - .github/actions/nf-test/action.yml + - nf-test.config + - tests/default.nf.test + - assets/email_template.html + - assets/sendmail_template.txt + - assets/email_template.txt + - CODE_OF_CONDUCT.md + - assets/nf-core-hello_logo_light.png + - docs/images/nf-core-hello_logo_light.png + - docs/images/nf-core-hello_logo_dark.png + - .github/ISSUE_TEMPLATE/config.yml + - .github/workflows/awstest.yml + - 
.github/workflows/awsfulltest.yml + - .github/ISSUE_TEMPLATE/bug_report.yml + - .github/ISSUE_TEMPLATE/feature_request.yml + - .github/PULL_REQUEST_TEMPLATE.md + - .github/CONTRIBUTING.md + - .github/.dockstore.yml + - CHANGELOG.md + - assets/multiqc_config.yml + - .github/workflows/branch.yml + - .github/workflows/nf-test.yml + - .github/actions/get-shards/action.yml + - .github/actions/nf-test/action.yml + - .github/workflows/linting_comment.yml + - .github/workflows/linting.yml + - .prettierignore + - .prettierrc.yml + - conf/igenomes.config + - conf/igenomes_ignored.config + - CITATIONS.md + - LICENSE + readme: + - nextflow_badge + - nextflow_badge + - nfcore_template_badge + +template: + org: core + name: hello + description: A basic nf-core style version of Hello Nextflow + author: pinin4fjords + version: 1.0.0dev + force: true + outdir: . + skip_features: + - github + - github_badges + - changelog + - license + - ci + - nf-test + - igenomes + - multiqc + - fastqc + - seqera_platform + - gpu + - codespaces + - vscode + - code_linters + - citations + - rocrate + - email + - adaptivecard + - slackreport + is_nfcore: false diff --git a/hello-nf-core/solutions/core-hello-part3/README.md b/hello-nf-core/solutions/core-hello-part3/README.md index 0a533c4c4b..94844f6c5e 100644 --- a/hello-nf-core/solutions/core-hello-part3/README.md +++ b/hello-nf-core/solutions/core-hello-part3/README.md @@ -11,7 +11,7 @@ --> + workflows use the "tube map" design for that. See https://nf-co.re/docs/guidelines/graphic_design/workflow_diagrams#examples for examples. --> ## Usage @@ -51,7 +51,7 @@ nextflow run core/hello \ ## Credits -core/hello was originally written by GG. +core/hello was originally written by pinin4fjords. We thank the following people for their extensive assistance in the development of this pipeline: diff --git a/hello-nf-core/solutions/core-hello-part3/assets/schema_input.json b/hello-nf-core/solutions/core-hello-part3/assets/schema_input.json index bc0261f329..5cb7458161 100644 --- a/hello-nf-core/solutions/core-hello-part3/assets/schema_input.json +++ b/hello-nf-core/solutions/core-hello-part3/assets/schema_input.json @@ -17,14 +17,14 @@ "type": "string", "format": "file-path", "exists": true, - "pattern": "^\\S+\\.f(ast)?q\\.gz$", + "pattern": "^([\\S\\s]*\\/)?[^\\s\\/]+\\.f(ast)?q\\.gz$", "errorMessage": "FastQ file for reads 1 must be provided, cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'" }, "fastq_2": { "type": "string", "format": "file-path", "exists": true, - "pattern": "^\\S+\\.f(ast)?q\\.gz$", + "pattern": "^([\\S\\s]*\\/)?[^\\s\\/]+\\.f(ast)?q\\.gz$", "errorMessage": "FastQ file for reads 2 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'" } }, diff --git a/hello-nf-core/solutions/core-hello-part3/conf/base.config b/hello-nf-core/solutions/core-hello-part3/conf/base.config index 1abcd9876f..e0fe40762f 100644 --- a/hello-nf-core/solutions/core-hello-part3/conf/base.config +++ b/hello-nf-core/solutions/core-hello-part3/conf/base.config @@ -15,7 +15,7 @@ process { memory = { 6.GB * task.attempt } time = { 4.h * task.attempt } - errorStrategy = { task.exitStatus in ((130..145) + 104) ? 'retry' : 'finish' } + errorStrategy = { task.exitStatus in ((130..145) + 104 + 175) ? 'retry' : 'finish' } maxRetries = 1 maxErrors = '-1' @@ -59,4 +59,8 @@ process { errorStrategy = 'retry' maxRetries = 2 } + withLabel: process_gpu { + ext.use_gpu = { workflow.profile.contains('gpu') } + accelerator = { workflow.profile.contains('gpu') ? 
1 : null } + } } diff --git a/hello-nf-core/solutions/core-hello-part3/conf/test.config b/hello-nf-core/solutions/core-hello-part3/conf/test.config index f82761298d..13ecf2ad4b 100644 --- a/hello-nf-core/solutions/core-hello-part3/conf/test.config +++ b/hello-nf-core/solutions/core-hello-part3/conf/test.config @@ -12,8 +12,9 @@ process { resourceLimits = [ - cpus: 1, - memory: '1.GB' + cpus: 2, + memory: '4.GB', + time: '1.h' ] } diff --git a/hello-nf-core/solutions/core-hello-part3/conf/test_full.config b/hello-nf-core/solutions/core-hello-part3/conf/test_full.config new file mode 100644 index 0000000000..ceeaf40cac --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part3/conf/test_full.config @@ -0,0 +1,24 @@ +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Nextflow config file for running full-size tests +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Defines input files and everything required to run a full size pipeline test. + + Use as follows: + nextflow run core/hello -profile test_full, --outdir + +---------------------------------------------------------------------------------------- +*/ + +params { + config_profile_name = 'Full test profile' + config_profile_description = 'Full test dataset to check pipeline function' + + // Input data for full size test + // TODO nf-core: Specify the paths to your full test data ( on nf-core/test-datasets or directly in repositories, e.g. SRA) + // TODO nf-core: Give any required params for the test so that command line flags are not needed + input = params.pipelines_testdata_base_path + 'viralrecon/samplesheet/samplesheet_full_illumina_amplicon.csv' + + // Fasta references + fasta = params.pipelines_testdata_base_path + 'viralrecon/genome/NC_045512.2/GCF_009858895.2_ASM985889v3_genomic.200409.fna.gz' +} diff --git a/hello-nf-core/solutions/core-hello-part3/docs/usage.md b/hello-nf-core/solutions/core-hello-part3/docs/usage.md index 78b55f9afe..bfbc37ab42 100644 --- a/hello-nf-core/solutions/core-hello-part3/docs/usage.md +++ b/hello-nf-core/solutions/core-hello-part3/docs/usage.md @@ -146,7 +146,7 @@ If `-profile` is not specified, the pipeline will run locally and expect all sof - `shifter` - A generic configuration profile to be used with [Shifter](https://nersc.gitlab.io/development/shifter/how-to-use/) - `charliecloud` - - A generic configuration profile to be used with [Charliecloud](https://hpc.github.io/charliecloud/) + - A generic configuration profile to be used with [Charliecloud](https://charliecloud.io/) - `apptainer` - A generic configuration profile to be used with [Apptainer](https://apptainer.org/) - `wave` diff --git a/hello-nf-core/solutions/core-hello-part3/main.nf b/hello-nf-core/solutions/core-hello-part3/main.nf index f72a236660..eb8d91361f 100644 --- a/hello-nf-core/solutions/core-hello-part3/main.nf +++ b/hello-nf-core/solutions/core-hello-part3/main.nf @@ -57,7 +57,10 @@ workflow { params.monochrome_logs, args, params.outdir, - params.input + params.input, + params.help, + params.help_full, + params.show_hidden ) // diff --git a/hello-nf-core/solutions/core-hello-part3/modules.json b/hello-nf-core/solutions/core-hello-part3/modules.json index 85169da597..71a7815a6b 100644 --- a/hello-nf-core/solutions/core-hello-part3/modules.json +++ b/hello-nf-core/solutions/core-hello-part3/modules.json @@ -16,17 +16,17 @@ "nf-core": { "utils_nextflow_pipeline": { "branch": "master", - "git_sha": 
"c2b22d85f30a706a3073387f30380704fcae013b", + "git_sha": "05954dab2ff481bcb999f24455da29a5828af08d", "installed_by": ["subworkflows"] }, "utils_nfcore_pipeline": { "branch": "master", - "git_sha": "51ae5406a030d4da1e49e4dab49756844fdd6c7a", + "git_sha": "05954dab2ff481bcb999f24455da29a5828af08d", "installed_by": ["subworkflows"] }, "utils_nfschema_plugin": { "branch": "master", - "git_sha": "2fd2cd6d0e7b273747f32e465fdc6bcc3ae0814e", + "git_sha": "4b406a74dc0449c0401ed87d5bfff4252fd277fd", "installed_by": ["subworkflows"] } } diff --git a/hello-nf-core/solutions/core-hello-part3/modules/local/collectGreetings.nf b/hello-nf-core/solutions/core-hello-part3/modules/local/collectGreetings.nf deleted file mode 100644 index 0274fec86f..0000000000 --- a/hello-nf-core/solutions/core-hello-part3/modules/local/collectGreetings.nf +++ /dev/null @@ -1,19 +0,0 @@ -/* - * Collect uppercase greetings into a single output file - */ -process collectGreetings { - - publishDir 'results', mode: 'copy' - - input: - path input_files - val batch_name - - output: - path "COLLECTED-${batch_name}-output.txt" , emit: outfile - - script: - """ - cat ${input_files} > 'COLLECTED-${batch_name}-output.txt' - """ -} diff --git a/hello-nf-core/solutions/core-hello-part3/nextflow.config b/hello-nf-core/solutions/core-hello-part3/nextflow.config index d633adb989..b59ba2175d 100644 --- a/hello-nf-core/solutions/core-hello-part3/nextflow.config +++ b/hello-nf-core/solutions/core-hello-part3/nextflow.config @@ -22,7 +22,9 @@ params { show_hidden = false version = false pipelines_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/' - trace_report_suffix = new java.util.Date().format( 'yyyy-MM-dd_HH-mm-ss')// Config options + trace_report_suffix = new java.util.Date().format( 'yyyy-MM-dd_HH-mm-ss') + + // Config options config_profile_name = null config_profile_description = null @@ -75,7 +77,18 @@ profiles { apptainer.enabled = false docker.runOptions = '-u $(id -u):$(id -g)' } - arm { + arm64 { + process.arch = 'arm64' + // TODO https://github.com/nf-core/modules/issues/6694 + // For now if you're using arm64 you have to use wave for the sake of the maintainers + // wave profile + apptainer.ociAutoPull = true + singularity.ociAutoPull = true + wave.enabled = true + wave.freeze = true + wave.strategy = 'conda,container' + } + emulate_amd64 { docker.runOptions = '-u $(id -u):$(id -g) --platform=linux/amd64' } singularity { @@ -132,16 +145,24 @@ profiles { wave.freeze = true wave.strategy = 'conda,container' } + gpu { + docker.runOptions = '-u $(id -u):$(id -g) --gpus all' + apptainer.runOptions = '--nv' + singularity.runOptions = '--nv' + } test { includeConfig 'conf/test.config' } test_full { includeConfig 'conf/test_full.config' } } +// Load nf-core custom profiles from different institutions + +// If params.custom_config_base is set AND either the NXF_OFFLINE environment variable is not set or params.custom_config_base is a local path, the nfcore_custom.config file from the specified base path is included. +// Load core/hello custom profiles from different institutions. +includeConfig params.custom_config_base && (!System.getenv('NXF_OFFLINE') || !params.custom_config_base.startsWith('http')) ? "${params.custom_config_base}/nfcore_custom.config" : "/dev/null" -// Load nf-core custom profiles from different Institutions -includeConfig !System.getenv('NXF_OFFLINE') && params.custom_config_base ? 
"${params.custom_config_base}/nfcore_custom.config" : "/dev/null" // Load core/hello custom profiles from different institutions. // TODO nf-core: Optionally, you can add a pipeline-specific nf-core config at https://github.com/nf-core/configs -// includeConfig !System.getenv('NXF_OFFLINE') && params.custom_config_base ? "${params.custom_config_base}/pipeline/hello.config" : "/dev/null" +// includeConfig params.custom_config_base && (!System.getenv('NXF_OFFLINE') || !params.custom_config_base.startsWith('http')) ? "${params.custom_config_base}/pipeline/hello.config" : "/dev/null" // Set default registry for Apptainer, Docker, Podman, Charliecloud and Singularity independent of -profile // Will not be used unless Apptainer / Docker / Podman / Charliecloud / Singularity are enabled @@ -197,11 +218,10 @@ dag { manifest { name = 'core/hello' - author = """GG""" // The author field is deprecated from Nextflow version 24.10.0, use contributors instead contributors = [ // TODO nf-core: Update the field with the details of the contributors to your pipeline. New with Nextflow version 24.10.0 [ - name: 'GG', + name: 'pinin4fjords', affiliation: '', email: '', github: '', @@ -210,28 +230,22 @@ manifest { ], ] homePage = 'https://github.com/core/hello' - description = """basic nf-core style version of Hello Nextflow""" + description = """A basic nf-core style version of Hello Nextflow""" mainScript = 'main.nf' defaultBranch = 'main' - nextflowVersion = '!>=24.04.2' + nextflowVersion = '!>=25.04.0' version = '1.0.0dev' doi = '' } // Nextflow plugins plugins { - id 'nf-schema@2.2.0' // Validation of pipeline parameters and creation of an input channel from a sample sheet + id 'nf-schema@2.5.1' // Validation of pipeline parameters and creation of an input channel from a sample sheet } validation { defaultIgnoreParams = ["genomes"] monochromeLogs = params.monochrome_logs - help { - enabled = true - command = "nextflow run core/hello -profile --input samplesheet.csv --outdir " - fullParameter = "help_full" - showHiddenParameter = "show_hidden" - } } // Load modules.config for DSL2 module specific options diff --git a/hello-nf-core/solutions/core-hello-part3/nextflow_schema.json b/hello-nf-core/solutions/core-hello-part3/nextflow_schema.json index 5ee5ec357f..fc18ba7998 100644 --- a/hello-nf-core/solutions/core-hello-part3/nextflow_schema.json +++ b/hello-nf-core/solutions/core-hello-part3/nextflow_schema.json @@ -2,7 +2,7 @@ "$schema": "https://json-schema.org/draft/2020-12/schema", "$id": "https://raw.githubusercontent.com/core/hello/main/nextflow_schema.json", "title": "core/hello pipeline parameters", - "description": "basic nf-core style version of Hello Nextflow", + "description": "A basic nf-core style version of Hello Nextflow", "type": "object", "$defs": { "input_output_options": { @@ -133,6 +133,18 @@ "fa_icon": "far calendar", "description": "Suffix to add to the trace report filename. Default is the date and time in the format yyyy-MM-dd_HH-mm-ss.", "hidden": true + }, + "help": { + "type": ["boolean", "string"], + "description": "Display the help message." + }, + "help_full": { + "type": "boolean", + "description": "Display the full detailed help message." + }, + "show_hidden": { + "type": "boolean", + "description": "Display hidden parameters in the help message (only works when --help or --help_full are provided)." 
} } } diff --git a/hello-nf-core/solutions/core-hello-part3/subworkflows/local/utils_nfcore_hello_pipeline/main.nf b/hello-nf-core/solutions/core-hello-part3/subworkflows/local/utils_nfcore_hello_pipeline/main.nf index 53ba38fae8..93c9f874cc 100644 --- a/hello-nf-core/solutions/core-hello-part3/subworkflows/local/utils_nfcore_hello_pipeline/main.nf +++ b/hello-nf-core/solutions/core-hello-part3/subworkflows/local/utils_nfcore_hello_pipeline/main.nf @@ -11,6 +11,7 @@ include { UTILS_NFSCHEMA_PLUGIN } from '../../nf-core/utils_nfschema_plugin' include { paramsSummaryMap } from 'plugin/nf-schema' include { samplesheetToList } from 'plugin/nf-schema' +include { paramsHelp } from 'plugin/nf-schema' include { completionSummary } from '../../nf-core/utils_nfcore_pipeline' include { UTILS_NFCORE_PIPELINE } from '../../nf-core/utils_nfcore_pipeline' include { UTILS_NEXTFLOW_PIPELINE } from '../../nf-core/utils_nextflow_pipeline' @@ -30,6 +31,9 @@ workflow PIPELINE_INITIALISATION { nextflow_cli_args // array: List of positional nextflow CLI args outdir // string: The output directory where the results will be saved input // string: Path to input samplesheet + help // boolean: Display help message and exit + help_full // boolean: Show the full help message + show_hidden // boolean: Show hidden parameters in the help message main: @@ -48,10 +52,18 @@ workflow PIPELINE_INITIALISATION { // // Validate parameters and generate parameter summary to stdout // + command = "nextflow run ${workflow.manifest.name} -profile --input samplesheet.csv --outdir " + UTILS_NFSCHEMA_PLUGIN ( workflow, validate_params, - null + null, + help, + help_full, + show_hidden, + "", + "", + command ) // @@ -64,9 +76,10 @@ workflow PIPELINE_INITIALISATION { // // Create channel from input file provided through params.input // - ch_samplesheet = Channel.fromPath(params.input) - .splitCsv() - .map { line -> line[0] } + + ch_samplesheet = channel.fromPath(params.input) + .splitCsv() + .map { line -> line[0] } emit: samplesheet = ch_samplesheet diff --git a/hello-nf-core/solutions/core-hello-part3/subworkflows/nf-core/utils_nextflow_pipeline/tests/tags.yml b/hello-nf-core/solutions/core-hello-part3/subworkflows/nf-core/utils_nextflow_pipeline/tests/tags.yml deleted file mode 100644 index f84761125a..0000000000 --- a/hello-nf-core/solutions/core-hello-part3/subworkflows/nf-core/utils_nextflow_pipeline/tests/tags.yml +++ /dev/null @@ -1,2 +0,0 @@ -subworkflows/utils_nextflow_pipeline: - - subworkflows/nf-core/utils_nextflow_pipeline/** diff --git a/hello-nf-core/solutions/core-hello-part3/subworkflows/nf-core/utils_nfcore_pipeline/tests/tags.yml b/hello-nf-core/solutions/core-hello-part3/subworkflows/nf-core/utils_nfcore_pipeline/tests/tags.yml deleted file mode 100644 index ac8523c9a2..0000000000 --- a/hello-nf-core/solutions/core-hello-part3/subworkflows/nf-core/utils_nfcore_pipeline/tests/tags.yml +++ /dev/null @@ -1,2 +0,0 @@ -subworkflows/utils_nfcore_pipeline: - - subworkflows/nf-core/utils_nfcore_pipeline/** diff --git a/hello-nf-core/solutions/core-hello-part3/subworkflows/nf-core/utils_nfschema_plugin/main.nf b/hello-nf-core/solutions/core-hello-part3/subworkflows/nf-core/utils_nfschema_plugin/main.nf index 93de2a5245..acb3972419 100644 --- a/hello-nf-core/solutions/core-hello-part3/subworkflows/nf-core/utils_nfschema_plugin/main.nf +++ b/hello-nf-core/solutions/core-hello-part3/subworkflows/nf-core/utils_nfschema_plugin/main.nf @@ -4,6 +4,7 @@ include { paramsSummaryLog } from 'plugin/nf-schema' include { validateParameters } 
from 'plugin/nf-schema' +include { paramsHelp } from 'plugin/nf-schema' workflow UTILS_NFSCHEMA_PLUGIN { @@ -15,29 +16,56 @@ workflow UTILS_NFSCHEMA_PLUGIN { // when this input is empty it will automatically use the configured schema or // "${projectDir}/nextflow_schema.json" as default. This input should not be empty // for meta pipelines + help // boolean: show help message + help_full // boolean: show full help message + show_hidden // boolean: show hidden parameters in help message + before_text // string: text to show before the help message and parameters summary + after_text // string: text to show after the help message and parameters summary + command // string: an example command of the pipeline main: + if(help || help_full) { + help_options = [ + beforeText: before_text, + afterText: after_text, + command: command, + showHidden: show_hidden, + fullHelp: help_full, + ] + if(parameters_schema) { + help_options << [parametersSchema: parameters_schema] + } + log.info paramsHelp( + help_options, + params.help instanceof String ? params.help : "", + ) + exit 0 + } + // // Print parameter summary to stdout. This will display the parameters // that differ from the default given in the JSON schema // + + summary_options = [:] if(parameters_schema) { - log.info paramsSummaryLog(input_workflow, parameters_schema:parameters_schema) - } else { - log.info paramsSummaryLog(input_workflow) + summary_options << [parametersSchema: parameters_schema] } + log.info before_text + log.info paramsSummaryLog(summary_options, input_workflow) + log.info after_text // // Validate the parameters using nextflow_schema.json or the schema // given via the validation.parametersSchema configuration option // if(validate_params) { + validateOptions = [:] if(parameters_schema) { - validateParameters(parameters_schema:parameters_schema) - } else { - validateParameters() + validateOptions << [parametersSchema: parameters_schema] } + validateParameters(validateOptions) } emit: diff --git a/hello-nf-core/solutions/core-hello-part3/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test b/hello-nf-core/solutions/core-hello-part3/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test index 8fb3016487..c977917aac 100644 --- a/hello-nf-core/solutions/core-hello-part3/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test +++ b/hello-nf-core/solutions/core-hello-part3/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test @@ -25,6 +25,12 @@ nextflow_workflow { input[0] = workflow input[1] = validate_params input[2] = "" + input[3] = false + input[4] = false + input[5] = false + input[6] = "" + input[7] = "" + input[8] = "" """ } } @@ -51,6 +57,12 @@ nextflow_workflow { input[0] = workflow input[1] = validate_params input[2] = "" + input[3] = false + input[4] = false + input[5] = false + input[6] = "" + input[7] = "" + input[8] = "" """ } } @@ -77,6 +89,12 @@ nextflow_workflow { input[0] = workflow input[1] = validate_params input[2] = "${projectDir}/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json" + input[3] = false + input[4] = false + input[5] = false + input[6] = "" + input[7] = "" + input[8] = "" """ } } @@ -103,6 +121,12 @@ nextflow_workflow { input[0] = workflow input[1] = validate_params input[2] = "${projectDir}/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json" + input[3] = false + input[4] = false + input[5] = false + input[6] = "" + input[7] = "" + input[8] = "" """ } } @@ -114,4 +138,36 @@ nextflow_workflow { ) } } + + test("Should create a 
help message") { + + when { + + params { + test_data = '' + outdir = null + } + + workflow { + """ + validate_params = true + input[0] = workflow + input[1] = validate_params + input[2] = "${projectDir}/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json" + input[3] = true + input[4] = false + input[5] = false + input[6] = "Before" + input[7] = "After" + input[8] = "nextflow run test/test" + """ + } + } + + then { + assertAll( + { assert workflow.success } + ) + } + } } diff --git a/hello-nf-core/solutions/core-hello-part3/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow.config b/hello-nf-core/solutions/core-hello-part3/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow.config index 478fb8a05f..8d8c73718a 100644 --- a/hello-nf-core/solutions/core-hello-part3/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow.config +++ b/hello-nf-core/solutions/core-hello-part3/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow.config @@ -1,5 +1,5 @@ plugins { - id "nf-schema@2.1.0" + id "nf-schema@2.5.1" } validation { diff --git a/hello-nf-core/solutions/core-hello-part3/workflows/hello.nf b/hello-nf-core/solutions/core-hello-part3/workflows/hello.nf index 170a754fa9..1549833cdc 100644 --- a/hello-nf-core/solutions/core-hello-part3/workflows/hello.nf +++ b/hello-nf-core/solutions/core-hello-part3/workflows/hello.nf @@ -20,19 +20,23 @@ workflow HELLO { take: ch_samplesheet // channel: samplesheet read in from --input + main: + ch_versions = Channel.empty() + // emit a greeting sayHello(ch_samplesheet) // convert the greeting to uppercase convertToUpper(sayHello.out) - // collect all the greetings into one file using nf-core cat/cat module // create metadata map with batch name as the ID def cat_meta = [ id: params.batch ] + // create a channel with metadata and files in tuple format ch_for_cat = convertToUpper.out.collect().map { files -> tuple(cat_meta, files) } + // concatenate files using the nf-core cat/cat module CAT_CAT(ch_for_cat) // generate ASCII art of the greetings with cowpy @@ -40,8 +44,6 @@ workflow HELLO { ch_for_cowpy = CAT_CAT.out.file_out.map{ meta, file -> file } cowpy(ch_for_cowpy, params.character) - ch_versions = Channel.empty() - // // Collate and save software versions // diff --git a/hello-nf-core/solutions/core-hello-part4/.nf-core.yml b/hello-nf-core/solutions/core-hello-part4/.nf-core.yml new file mode 100644 index 0000000000..1a638d8288 --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part4/.nf-core.yml @@ -0,0 +1,106 @@ +repository_type: pipeline + +nf_core_version: 3.4.1 + +lint: + files_unchanged: + - .github/CONTRIBUTING.md + - .prettierignore + - .prettierignore + - .prettierignore + - CODE_OF_CONDUCT.md + - assets/nf-core-hello_logo_light.png + - docs/images/nf-core-hello_logo_light.png + - docs/images/nf-core-hello_logo_dark.png + - .github/ISSUE_TEMPLATE/bug_report.yml + - .github/CONTRIBUTING.md + - .github/PULL_REQUEST_TEMPLATE.md + - assets/email_template.txt + - docs/README.md + - .github/ISSUE_TEMPLATE/bug_report.yml + - .github/ISSUE_TEMPLATE/config.yml + - .github/ISSUE_TEMPLATE/feature_request.yml + - .github/PULL_REQUEST_TEMPLATE.md + - .github/workflows/branch.yml + - .github/workflows/linting_comment.yml + - .github/workflows/linting.yml + - .github/CONTRIBUTING.md + - .github/.dockstore.yml + - .github/CONTRIBUTING.md + - assets/sendmail_template.txt + - .prettierignore + - LICENSE + nextflow_config: + - manifest.name + - manifest.homePage + nf_test_content: false + multiqc_config: false + files_exist: + 
- .github/workflows/nf-test.yml + - .github/actions/get-shards/action.yml + - .github/actions/nf-test/action.yml + - nf-test.config + - tests/default.nf.test + - assets/email_template.html + - assets/sendmail_template.txt + - assets/email_template.txt + - CODE_OF_CONDUCT.md + - assets/nf-core-hello_logo_light.png + - docs/images/nf-core-hello_logo_light.png + - docs/images/nf-core-hello_logo_dark.png + - .github/ISSUE_TEMPLATE/config.yml + - .github/workflows/awstest.yml + - .github/workflows/awsfulltest.yml + - .github/ISSUE_TEMPLATE/bug_report.yml + - .github/ISSUE_TEMPLATE/feature_request.yml + - .github/PULL_REQUEST_TEMPLATE.md + - .github/CONTRIBUTING.md + - .github/.dockstore.yml + - CHANGELOG.md + - assets/multiqc_config.yml + - .github/workflows/branch.yml + - .github/workflows/nf-test.yml + - .github/actions/get-shards/action.yml + - .github/actions/nf-test/action.yml + - .github/workflows/linting_comment.yml + - .github/workflows/linting.yml + - .prettierignore + - .prettierrc.yml + - conf/igenomes.config + - conf/igenomes_ignored.config + - CITATIONS.md + - LICENSE + readme: + - nextflow_badge + - nextflow_badge + - nfcore_template_badge + +template: + org: core + name: hello + description: A basic nf-core style version of Hello Nextflow + author: pinin4fjords + version: 1.0.0dev + force: true + outdir: . + skip_features: + - github + - github_badges + - changelog + - license + - ci + - nf-test + - igenomes + - multiqc + - fastqc + - seqera_platform + - gpu + - codespaces + - vscode + - code_linters + - citations + - rocrate + - email + - adaptivecard + - slackreport + is_nfcore: false diff --git a/hello-nf-core/solutions/core-hello-part4/README.md b/hello-nf-core/solutions/core-hello-part4/README.md index 0a533c4c4b..94844f6c5e 100644 --- a/hello-nf-core/solutions/core-hello-part4/README.md +++ b/hello-nf-core/solutions/core-hello-part4/README.md @@ -11,7 +11,7 @@ --> + workflows use the "tube map" design for that. See https://nf-co.re/docs/guidelines/graphic_design/workflow_diagrams#examples for examples. --> ## Usage @@ -51,7 +51,7 @@ nextflow run core/hello \ ## Credits -core/hello was originally written by GG. +core/hello was originally written by pinin4fjords. 
We thank the following people for their extensive assistance in the development of this pipeline: diff --git a/hello-nf-core/solutions/core-hello-part4/assets/greetings.csv b/hello-nf-core/solutions/core-hello-part4/assets/greetings.csv index f5c9849604..c5889e19a7 100644 --- a/hello-nf-core/solutions/core-hello-part4/assets/greetings.csv +++ b/hello-nf-core/solutions/core-hello-part4/assets/greetings.csv @@ -1,4 +1,3 @@ -greeting Hello Bonjour Holà diff --git a/hello-nf-core/solutions/core-hello-part4/assets/schema_input.json b/hello-nf-core/solutions/core-hello-part4/assets/schema_input.json index c7a4df997d..5cb7458161 100644 --- a/hello-nf-core/solutions/core-hello-part4/assets/schema_input.json +++ b/hello-nf-core/solutions/core-hello-part4/assets/schema_input.json @@ -2,17 +2,32 @@ "$schema": "https://json-schema.org/draft/2020-12/schema", "$id": "https://raw.githubusercontent.com/core/hello/main/assets/schema_input.json", "title": "core/hello pipeline - params.input schema", - "description": "Schema for the greetings file provided with params.input", + "description": "Schema for the file provided with params.input", "type": "array", "items": { "type": "object", "properties": { - "greeting": { + "sample": { "type": "string", - "pattern": "^\\S.*$", - "errorMessage": "Greeting must be provided and cannot be empty or start with whitespace" + "pattern": "^\\S+$", + "errorMessage": "Sample name must be provided and cannot contain spaces", + "meta": ["id"] + }, + "fastq_1": { + "type": "string", + "format": "file-path", + "exists": true, + "pattern": "^([\\S\\s]*\\/)?[^\\s\\/]+\\.f(ast)?q\\.gz$", + "errorMessage": "FastQ file for reads 1 must be provided, cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'" + }, + "fastq_2": { + "type": "string", + "format": "file-path", + "exists": true, + "pattern": "^([\\S\\s]*\\/)?[^\\s\\/]+\\.f(ast)?q\\.gz$", + "errorMessage": "FastQ file for reads 2 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'" } }, - "required": ["greeting"] + "required": ["sample", "fastq_1"] } } diff --git a/hello-nf-core/solutions/core-hello-part4/conf/base.config b/hello-nf-core/solutions/core-hello-part4/conf/base.config index 1abcd9876f..e0fe40762f 100644 --- a/hello-nf-core/solutions/core-hello-part4/conf/base.config +++ b/hello-nf-core/solutions/core-hello-part4/conf/base.config @@ -15,7 +15,7 @@ process { memory = { 6.GB * task.attempt } time = { 4.h * task.attempt } - errorStrategy = { task.exitStatus in ((130..145) + 104) ? 'retry' : 'finish' } + errorStrategy = { task.exitStatus in ((130..145) + 104 + 175) ? 'retry' : 'finish' } maxRetries = 1 maxErrors = '-1' @@ -59,4 +59,8 @@ process { errorStrategy = 'retry' maxRetries = 2 } + withLabel: process_gpu { + ext.use_gpu = { workflow.profile.contains('gpu') } + accelerator = { workflow.profile.contains('gpu') ? 
1 : null } + } } diff --git a/hello-nf-core/solutions/core-hello-part4/conf/test.config b/hello-nf-core/solutions/core-hello-part4/conf/test.config index f82761298d..13ecf2ad4b 100644 --- a/hello-nf-core/solutions/core-hello-part4/conf/test.config +++ b/hello-nf-core/solutions/core-hello-part4/conf/test.config @@ -12,8 +12,9 @@ process { resourceLimits = [ - cpus: 1, - memory: '1.GB' + cpus: 2, + memory: '4.GB', + time: '1.h' ] } diff --git a/hello-nf-core/solutions/core-hello-part4/conf/test_full.config b/hello-nf-core/solutions/core-hello-part4/conf/test_full.config new file mode 100644 index 0000000000..ceeaf40cac --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part4/conf/test_full.config @@ -0,0 +1,24 @@ +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Nextflow config file for running full-size tests +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Defines input files and everything required to run a full size pipeline test. + + Use as follows: + nextflow run core/hello -profile test_full, --outdir + +---------------------------------------------------------------------------------------- +*/ + +params { + config_profile_name = 'Full test profile' + config_profile_description = 'Full test dataset to check pipeline function' + + // Input data for full size test + // TODO nf-core: Specify the paths to your full test data ( on nf-core/test-datasets or directly in repositories, e.g. SRA) + // TODO nf-core: Give any required params for the test so that command line flags are not needed + input = params.pipelines_testdata_base_path + 'viralrecon/samplesheet/samplesheet_full_illumina_amplicon.csv' + + // Fasta references + fasta = params.pipelines_testdata_base_path + 'viralrecon/genome/NC_045512.2/GCF_009858895.2_ASM985889v3_genomic.200409.fna.gz' +} diff --git a/hello-nf-core/solutions/core-hello-part4/docs/usage.md b/hello-nf-core/solutions/core-hello-part4/docs/usage.md index 78b55f9afe..bfbc37ab42 100644 --- a/hello-nf-core/solutions/core-hello-part4/docs/usage.md +++ b/hello-nf-core/solutions/core-hello-part4/docs/usage.md @@ -146,7 +146,7 @@ If `-profile` is not specified, the pipeline will run locally and expect all sof - `shifter` - A generic configuration profile to be used with [Shifter](https://nersc.gitlab.io/development/shifter/how-to-use/) - `charliecloud` - - A generic configuration profile to be used with [Charliecloud](https://hpc.github.io/charliecloud/) + - A generic configuration profile to be used with [Charliecloud](https://charliecloud.io/) - `apptainer` - A generic configuration profile to be used with [Apptainer](https://apptainer.org/) - `wave` diff --git a/hello-nf-core/solutions/core-hello-part4/main.nf b/hello-nf-core/solutions/core-hello-part4/main.nf index f72a236660..eb8d91361f 100644 --- a/hello-nf-core/solutions/core-hello-part4/main.nf +++ b/hello-nf-core/solutions/core-hello-part4/main.nf @@ -57,7 +57,10 @@ workflow { params.monochrome_logs, args, params.outdir, - params.input + params.input, + params.help, + params.help_full, + params.show_hidden ) // diff --git a/hello-nf-core/solutions/core-hello-part4/modules.json b/hello-nf-core/solutions/core-hello-part4/modules.json index 85169da597..71a7815a6b 100644 --- a/hello-nf-core/solutions/core-hello-part4/modules.json +++ b/hello-nf-core/solutions/core-hello-part4/modules.json @@ -16,17 +16,17 @@ "nf-core": { "utils_nextflow_pipeline": { "branch": "master", - "git_sha": 
"c2b22d85f30a706a3073387f30380704fcae013b", + "git_sha": "05954dab2ff481bcb999f24455da29a5828af08d", "installed_by": ["subworkflows"] }, "utils_nfcore_pipeline": { "branch": "master", - "git_sha": "51ae5406a030d4da1e49e4dab49756844fdd6c7a", + "git_sha": "05954dab2ff481bcb999f24455da29a5828af08d", "installed_by": ["subworkflows"] }, "utils_nfschema_plugin": { "branch": "master", - "git_sha": "2fd2cd6d0e7b273747f32e465fdc6bcc3ae0814e", + "git_sha": "4b406a74dc0449c0401ed87d5bfff4252fd277fd", "installed_by": ["subworkflows"] } } diff --git a/hello-nf-core/solutions/core-hello-part4/modules/local/collectGreetings.nf b/hello-nf-core/solutions/core-hello-part4/modules/local/collectGreetings.nf deleted file mode 100644 index 0274fec86f..0000000000 --- a/hello-nf-core/solutions/core-hello-part4/modules/local/collectGreetings.nf +++ /dev/null @@ -1,19 +0,0 @@ -/* - * Collect uppercase greetings into a single output file - */ -process collectGreetings { - - publishDir 'results', mode: 'copy' - - input: - path input_files - val batch_name - - output: - path "COLLECTED-${batch_name}-output.txt" , emit: outfile - - script: - """ - cat ${input_files} > 'COLLECTED-${batch_name}-output.txt' - """ -} diff --git a/hello-nf-core/solutions/core-hello-part4/modules/local/cowpy/environment.yml b/hello-nf-core/solutions/core-hello-part4/modules/local/cowpy/environment.yml new file mode 100644 index 0000000000..32bc330d8f --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part4/modules/local/cowpy/environment.yml @@ -0,0 +1,10 @@ +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json +channels: + - conda-forge + - bioconda +dependencies: + # TODO nf-core: List required Conda package(s). + # Software MUST be pinned to channel (i.e. "bioconda"), version (i.e. "1.10"). + # For Conda, the build (i.e. "h9402c20_2") must be EXCLUDED to support installation on different operating systems. + - "YOUR-TOOL-HERE" diff --git a/hello-nf-core/solutions/core-hello-part4/modules/local/cowpy/main.nf b/hello-nf-core/solutions/core-hello-part4/modules/local/cowpy/main.nf new file mode 100644 index 0000000000..1b30633471 --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part4/modules/local/cowpy/main.nf @@ -0,0 +1,47 @@ + + +process COWPY { + tag "$meta.id" + label 'process_single' + + conda "${moduleDir}/environment.yml" + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
+        'https://depot.galaxyproject.org/singularity/YOUR-TOOL-HERE':
+        'biocontainers/YOUR-TOOL-HERE' }"

+    input:
+    tuple val(meta), path(input)
+
+    output:
+    tuple val(meta), path("*"), emit: output
+    path "versions.yml" , emit: versions
+
+    when:
+    task.ext.when == null || task.ext.when
+
+    script:
+    def args = task.ext.args ?: ''
+    def prefix = task.ext.prefix ?: "${meta.id}"
+
+    """
+
+
+    cat <<-END_VERSIONS > versions.yml
+    "${task.process}":
+        cowpy: \$(cowpy --version)
+    END_VERSIONS
+    """
+
+    stub:
+    def args = task.ext.args ?: ''
+    def prefix = task.ext.prefix ?: "${meta.id}"
+
+    """
+    echo $args
+
+
+    cat <<-END_VERSIONS > versions.yml
+    "${task.process}":
+        cowpy: \$(cowpy --version)
+    END_VERSIONS
+    """
+}
diff --git a/hello-nf-core/solutions/core-hello-part4/modules/local/cowpy/meta.yml b/hello-nf-core/solutions/core-hello-part4/modules/local/cowpy/meta.yml
new file mode 100644
index 0000000000..616fdd9422
--- /dev/null
+++ b/hello-nf-core/solutions/core-hello-part4/modules/local/cowpy/meta.yml
@@ -0,0 +1,51 @@
+# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/meta-schema.json
+name: "cowpy"
+description: write your description here
+keywords:
+  - sort
+  - example
+  - genomics
+tools:
+  - "cowpy":
+      description: ""
+      homepage: ""
+      documentation: ""
+      tool_dev_url: ""
+      doi: ""
+      licence: null
+      identifier: null
+
+input:
+  - - meta:
+        type: map
+        description: Groovy Map containing sample information. e.g. `[
+          id:'sample1' ]`
+    - input:
+        type: file
+        description: ""
+        pattern: ""
+        ontologies:
+          - edam: ""
+output:
+  output:
+    - - meta:
+          type: map
+          description: Groovy Map containing sample information. e.g. `[
+            id:'sample1' ]`
+      - "*":
+          type: file
+          description: ""
+          pattern: ""
+          ontologies:
+            - edam: ""
+  versions:
+    - versions.yml:
+        type: file
+        description: File containing software versions
+        pattern: versions.yml
+        ontologies:
+          - edam: http://edamontology.org/format_3750 # YAML
+authors:
+  - "@example"
+maintainers:
+  - "@example"
diff --git a/hello-nf-core/solutions/core-hello-part4/modules/local/cowpy/tests/main.nf.test b/hello-nf-core/solutions/core-hello-part4/modules/local/cowpy/tests/main.nf.test
new file mode 100644
index 0000000000..bd9c6e4767
--- /dev/null
+++ b/hello-nf-core/solutions/core-hello-part4/modules/local/cowpy/tests/main.nf.test
@@ -0,0 +1,80 @@
+// TODO nf-core: Once you have added the required tests, please run the following command to build this file:
+// nf-core modules test cowpy
+nextflow_process {
+
+    name "Test Process COWPY"
+    script "../main.nf"
+    process "COWPY"
+
+    tag "modules"
+    tag "modules_"
+    tag "cowpy"
+
+    // TODO nf-core: Change the test name preferably indicating the test-data and file-format used
+    test("sarscov2 - bam") {
+
+        // TODO nf-core: If you are creating a test for a chained module
+        // (the module requires running more than one process to generate the required output)
+        // add the 'setup' method here.
+        // You can find more information about how to use a 'setup' method in the docs (https://nf-co.re/docs/contributing/modules#steps-for-creating-nf-test-for-chained-modules).
+
+        when {
+            process {
+                """
+                // TODO nf-core: define inputs of the process here.
Example: + + + input[0] = [ + [ id:'test' ], + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + ] + """ + } + } + + then { + assert process.success + assertAll( + { assert snapshot( + process.out, + path(process.out.versions[0]).yaml + ).match() } + //TODO nf-core: Add all required assertions to verify the test output. + // See https://nf-co.re/docs/contributing/tutorials/nf-test_assertions for more information and examples. + ) + } + + } + + // TODO nf-core: Change the test name preferably indicating the test-data and file-format used but keep the " - stub" suffix. + test("sarscov2 - bam - stub") { + + options "-stub" + + when { + process { + """ + // TODO nf-core: define inputs of the process here. Example: + + + input[0] = [ + [ id:'test' ], + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + ] + """ + } + } + + then { + assert process.success + assertAll( + { assert snapshot( + process.out, + path(process.out.versions[0]).yaml + ).match() } + ) + } + + } + +} diff --git a/hello-nf-core/solutions/core-hello-part4/nextflow.config b/hello-nf-core/solutions/core-hello-part4/nextflow.config index d633adb989..b59ba2175d 100644 --- a/hello-nf-core/solutions/core-hello-part4/nextflow.config +++ b/hello-nf-core/solutions/core-hello-part4/nextflow.config @@ -22,7 +22,9 @@ params { show_hidden = false version = false pipelines_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/' - trace_report_suffix = new java.util.Date().format( 'yyyy-MM-dd_HH-mm-ss')// Config options + trace_report_suffix = new java.util.Date().format( 'yyyy-MM-dd_HH-mm-ss') + + // Config options config_profile_name = null config_profile_description = null @@ -75,7 +77,18 @@ profiles { apptainer.enabled = false docker.runOptions = '-u $(id -u):$(id -g)' } - arm { + arm64 { + process.arch = 'arm64' + // TODO https://github.com/nf-core/modules/issues/6694 + // For now if you're using arm64 you have to use wave for the sake of the maintainers + // wave profile + apptainer.ociAutoPull = true + singularity.ociAutoPull = true + wave.enabled = true + wave.freeze = true + wave.strategy = 'conda,container' + } + emulate_amd64 { docker.runOptions = '-u $(id -u):$(id -g) --platform=linux/amd64' } singularity { @@ -132,16 +145,24 @@ profiles { wave.freeze = true wave.strategy = 'conda,container' } + gpu { + docker.runOptions = '-u $(id -u):$(id -g) --gpus all' + apptainer.runOptions = '--nv' + singularity.runOptions = '--nv' + } test { includeConfig 'conf/test.config' } test_full { includeConfig 'conf/test_full.config' } } +// Load nf-core custom profiles from different institutions + +// If params.custom_config_base is set AND either the NXF_OFFLINE environment variable is not set or params.custom_config_base is a local path, the nfcore_custom.config file from the specified base path is included. +// Load core/hello custom profiles from different institutions. +includeConfig params.custom_config_base && (!System.getenv('NXF_OFFLINE') || !params.custom_config_base.startsWith('http')) ? "${params.custom_config_base}/nfcore_custom.config" : "/dev/null" -// Load nf-core custom profiles from different Institutions -includeConfig !System.getenv('NXF_OFFLINE') && params.custom_config_base ? "${params.custom_config_base}/nfcore_custom.config" : "/dev/null" // Load core/hello custom profiles from different institutions. 
// TODO nf-core: Optionally, you can add a pipeline-specific nf-core config at https://github.com/nf-core/configs -// includeConfig !System.getenv('NXF_OFFLINE') && params.custom_config_base ? "${params.custom_config_base}/pipeline/hello.config" : "/dev/null" +// includeConfig params.custom_config_base && (!System.getenv('NXF_OFFLINE') || !params.custom_config_base.startsWith('http')) ? "${params.custom_config_base}/pipeline/hello.config" : "/dev/null" // Set default registry for Apptainer, Docker, Podman, Charliecloud and Singularity independent of -profile // Will not be used unless Apptainer / Docker / Podman / Charliecloud / Singularity are enabled @@ -197,11 +218,10 @@ dag { manifest { name = 'core/hello' - author = """GG""" // The author field is deprecated from Nextflow version 24.10.0, use contributors instead contributors = [ // TODO nf-core: Update the field with the details of the contributors to your pipeline. New with Nextflow version 24.10.0 [ - name: 'GG', + name: 'pinin4fjords', affiliation: '', email: '', github: '', @@ -210,28 +230,22 @@ manifest { ], ] homePage = 'https://github.com/core/hello' - description = """basic nf-core style version of Hello Nextflow""" + description = """A basic nf-core style version of Hello Nextflow""" mainScript = 'main.nf' defaultBranch = 'main' - nextflowVersion = '!>=24.04.2' + nextflowVersion = '!>=25.04.0' version = '1.0.0dev' doi = '' } // Nextflow plugins plugins { - id 'nf-schema@2.2.0' // Validation of pipeline parameters and creation of an input channel from a sample sheet + id 'nf-schema@2.5.1' // Validation of pipeline parameters and creation of an input channel from a sample sheet } validation { defaultIgnoreParams = ["genomes"] monochromeLogs = params.monochrome_logs - help { - enabled = true - command = "nextflow run core/hello -profile --input samplesheet.csv --outdir " - fullParameter = "help_full" - showHiddenParameter = "show_hidden" - } } // Load modules.config for DSL2 module specific options diff --git a/hello-nf-core/solutions/core-hello-part4/nextflow_schema.json b/hello-nf-core/solutions/core-hello-part4/nextflow_schema.json index 93bd733448..fc18ba7998 100644 --- a/hello-nf-core/solutions/core-hello-part4/nextflow_schema.json +++ b/hello-nf-core/solutions/core-hello-part4/nextflow_schema.json @@ -2,7 +2,7 @@ "$schema": "https://json-schema.org/draft/2020-12/schema", "$id": "https://raw.githubusercontent.com/core/hello/main/nextflow_schema.json", "title": "core/hello pipeline parameters", - "description": "basic nf-core style version of Hello Nextflow", + "description": "A basic nf-core style version of Hello Nextflow", "type": "object", "$defs": { "input_output_options": { @@ -23,11 +23,6 @@ "help_text": "You will need to create a design file with information about the samples in your experiment before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row.", "fa_icon": "fas fa-file-csv" }, - "batch": { - "type": "string", - "default": "batch-01", - "description": "Name for this batch of greetings" - }, "outdir": { "type": "string", "format": "directory-path", @@ -138,6 +133,18 @@ "fa_icon": "far calendar", "description": "Suffix to add to the trace report filename. Default is the date and time in the format yyyy-MM-dd_HH-mm-ss.", "hidden": true + }, + "help": { + "type": ["boolean", "string"], + "description": "Display the help message." + }, + "help_full": { + "type": "boolean", + "description": "Display the full detailed help message." 
+    },
+    "show_hidden": {
+      "type": "boolean",
+      "description": "Display hidden parameters in the help message (only works when --help or --help_full are provided)."
+    }
   }
  }
 }
diff --git a/hello-nf-core/solutions/core-hello-part4/subworkflows/local/utils_nfcore_hello_pipeline/main.nf b/hello-nf-core/solutions/core-hello-part4/subworkflows/local/utils_nfcore_hello_pipeline/main.nf
index 8882631372..93c9f874cc 100644
--- a/hello-nf-core/solutions/core-hello-part4/subworkflows/local/utils_nfcore_hello_pipeline/main.nf
+++ b/hello-nf-core/solutions/core-hello-part4/subworkflows/local/utils_nfcore_hello_pipeline/main.nf
@@ -11,6 +11,7 @@
 include { UTILS_NFSCHEMA_PLUGIN } from '../../nf-core/utils_nfschema_plugin'
 include { paramsSummaryMap } from 'plugin/nf-schema'
 include { samplesheetToList } from 'plugin/nf-schema'
+include { paramsHelp } from 'plugin/nf-schema'
 include { completionSummary } from '../../nf-core/utils_nfcore_pipeline'
 include { UTILS_NFCORE_PIPELINE } from '../../nf-core/utils_nfcore_pipeline'
 include { UTILS_NEXTFLOW_PIPELINE } from '../../nf-core/utils_nextflow_pipeline'
@@ -30,6 +31,9 @@ workflow PIPELINE_INITIALISATION {
     nextflow_cli_args // array: List of positional nextflow CLI args
     outdir            // string: The output directory where the results will be saved
     input             // string: Path to input samplesheet
+    help              // boolean: Display help message and exit
+    help_full         // boolean: Show the full help message
+    show_hidden       // boolean: Show hidden parameters in the help message

     main:

@@ -48,10 +52,18 @@
     //
     // Validate parameters and generate parameter summary to stdout
     //
+    command = "nextflow run ${workflow.manifest.name} -profile <docker/singularity/...> --input samplesheet.csv --outdir <OUTDIR>"
+
     UTILS_NFSCHEMA_PLUGIN (
         workflow,
         validate_params,
-        null
+        null,
+        help,
+        help_full,
+        show_hidden,
+        "",
+        "",
+        command
     )

     //
@@ -64,11 +76,10 @@
     //
     // Create channel from input file provided through params.input
     //
-    ch_samplesheet = Channel.fromList(samplesheetToList(params.input, "${projectDir}/assets/schema_input.json"))
-        .map { row ->
-            // Extract just the greeting string from each row
-            row[0]
-        }
+
+    ch_samplesheet = channel.fromPath(params.input)
+        .splitCsv()
+        .map { line -> line[0] }

     emit:
     samplesheet = ch_samplesheet
diff --git a/hello-nf-core/solutions/core-hello-part4/subworkflows/nf-core/utils_nextflow_pipeline/tests/tags.yml b/hello-nf-core/solutions/core-hello-part4/subworkflows/nf-core/utils_nextflow_pipeline/tests/tags.yml
deleted file mode 100644
index f84761125a..0000000000
--- a/hello-nf-core/solutions/core-hello-part4/subworkflows/nf-core/utils_nextflow_pipeline/tests/tags.yml
+++ /dev/null
@@ -1,2 +0,0 @@
-subworkflows/utils_nextflow_pipeline:
-  - subworkflows/nf-core/utils_nextflow_pipeline/**
diff --git a/hello-nf-core/solutions/core-hello-part4/subworkflows/nf-core/utils_nfcore_pipeline/tests/tags.yml b/hello-nf-core/solutions/core-hello-part4/subworkflows/nf-core/utils_nfcore_pipeline/tests/tags.yml
deleted file mode 100644
index ac8523c9a2..0000000000
--- a/hello-nf-core/solutions/core-hello-part4/subworkflows/nf-core/utils_nfcore_pipeline/tests/tags.yml
+++ /dev/null
@@ -1,2 +0,0 @@
-subworkflows/utils_nfcore_pipeline:
-  - subworkflows/nf-core/utils_nfcore_pipeline/**
diff --git a/hello-nf-core/solutions/core-hello-part4/subworkflows/nf-core/utils_nfschema_plugin/main.nf b/hello-nf-core/solutions/core-hello-part4/subworkflows/nf-core/utils_nfschema_plugin/main.nf
index 93de2a5245..acb3972419 100644
---
a/hello-nf-core/solutions/core-hello-part4/subworkflows/nf-core/utils_nfschema_plugin/main.nf +++ b/hello-nf-core/solutions/core-hello-part4/subworkflows/nf-core/utils_nfschema_plugin/main.nf @@ -4,6 +4,7 @@ include { paramsSummaryLog } from 'plugin/nf-schema' include { validateParameters } from 'plugin/nf-schema' +include { paramsHelp } from 'plugin/nf-schema' workflow UTILS_NFSCHEMA_PLUGIN { @@ -15,29 +16,56 @@ workflow UTILS_NFSCHEMA_PLUGIN { // when this input is empty it will automatically use the configured schema or // "${projectDir}/nextflow_schema.json" as default. This input should not be empty // for meta pipelines + help // boolean: show help message + help_full // boolean: show full help message + show_hidden // boolean: show hidden parameters in help message + before_text // string: text to show before the help message and parameters summary + after_text // string: text to show after the help message and parameters summary + command // string: an example command of the pipeline main: + if(help || help_full) { + help_options = [ + beforeText: before_text, + afterText: after_text, + command: command, + showHidden: show_hidden, + fullHelp: help_full, + ] + if(parameters_schema) { + help_options << [parametersSchema: parameters_schema] + } + log.info paramsHelp( + help_options, + params.help instanceof String ? params.help : "", + ) + exit 0 + } + // // Print parameter summary to stdout. This will display the parameters // that differ from the default given in the JSON schema // + + summary_options = [:] if(parameters_schema) { - log.info paramsSummaryLog(input_workflow, parameters_schema:parameters_schema) - } else { - log.info paramsSummaryLog(input_workflow) + summary_options << [parametersSchema: parameters_schema] } + log.info before_text + log.info paramsSummaryLog(summary_options, input_workflow) + log.info after_text // // Validate the parameters using nextflow_schema.json or the schema // given via the validation.parametersSchema configuration option // if(validate_params) { + validateOptions = [:] if(parameters_schema) { - validateParameters(parameters_schema:parameters_schema) - } else { - validateParameters() + validateOptions << [parametersSchema: parameters_schema] } + validateParameters(validateOptions) } emit: diff --git a/hello-nf-core/solutions/core-hello-part4/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test b/hello-nf-core/solutions/core-hello-part4/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test index 8fb3016487..c977917aac 100644 --- a/hello-nf-core/solutions/core-hello-part4/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test +++ b/hello-nf-core/solutions/core-hello-part4/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test @@ -25,6 +25,12 @@ nextflow_workflow { input[0] = workflow input[1] = validate_params input[2] = "" + input[3] = false + input[4] = false + input[5] = false + input[6] = "" + input[7] = "" + input[8] = "" """ } } @@ -51,6 +57,12 @@ nextflow_workflow { input[0] = workflow input[1] = validate_params input[2] = "" + input[3] = false + input[4] = false + input[5] = false + input[6] = "" + input[7] = "" + input[8] = "" """ } } @@ -77,6 +89,12 @@ nextflow_workflow { input[0] = workflow input[1] = validate_params input[2] = "${projectDir}/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json" + input[3] = false + input[4] = false + input[5] = false + input[6] = "" + input[7] = "" + input[8] = "" """ } } @@ -103,6 +121,12 @@ nextflow_workflow { input[0] = workflow input[1] = 
validate_params input[2] = "${projectDir}/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json" + input[3] = false + input[4] = false + input[5] = false + input[6] = "" + input[7] = "" + input[8] = "" """ } } @@ -114,4 +138,36 @@ nextflow_workflow { ) } } + + test("Should create a help message") { + + when { + + params { + test_data = '' + outdir = null + } + + workflow { + """ + validate_params = true + input[0] = workflow + input[1] = validate_params + input[2] = "${projectDir}/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json" + input[3] = true + input[4] = false + input[5] = false + input[6] = "Before" + input[7] = "After" + input[8] = "nextflow run test/test" + """ + } + } + + then { + assertAll( + { assert workflow.success } + ) + } + } } diff --git a/hello-nf-core/solutions/core-hello-part4/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow.config b/hello-nf-core/solutions/core-hello-part4/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow.config index 478fb8a05f..8d8c73718a 100644 --- a/hello-nf-core/solutions/core-hello-part4/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow.config +++ b/hello-nf-core/solutions/core-hello-part4/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow.config @@ -1,5 +1,5 @@ plugins { - id "nf-schema@2.1.0" + id "nf-schema@2.5.1" } validation { diff --git a/hello-nf-core/solutions/core-hello-part4/workflows/hello.nf b/hello-nf-core/solutions/core-hello-part4/workflows/hello.nf index 8d9b97f566..1993686608 100644 --- a/hello-nf-core/solutions/core-hello-part4/workflows/hello.nf +++ b/hello-nf-core/solutions/core-hello-part4/workflows/hello.nf @@ -20,26 +20,28 @@ workflow HELLO { take: ch_samplesheet // channel: samplesheet read in from --input + main: + ch_versions = Channel.empty() + // emit a greeting sayHello(ch_samplesheet) // convert the greeting to uppercase convertToUpper(sayHello.out) - // collect all the greetings into one file using nf-core cat/cat module // create metadata map with batch name as the ID def cat_meta = [ id: params.batch ] + // create a channel with metadata and files in tuple format ch_for_cat = convertToUpper.out.collect().map { files -> tuple(cat_meta, files) } + // concatenate files using the nf-core cat/cat module CAT_CAT(ch_for_cat) // generate ASCII art of the greetings with cowpy cowpy(CAT_CAT.out.file_out) - ch_versions = Channel.empty() - // // Collate and save software versions // diff --git a/hello-nf-core/solutions/core-hello-part5/.nf-core.yml b/hello-nf-core/solutions/core-hello-part5/.nf-core.yml new file mode 100644 index 0000000000..1a638d8288 --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/.nf-core.yml @@ -0,0 +1,106 @@ +repository_type: pipeline + +nf_core_version: 3.4.1 + +lint: + files_unchanged: + - .github/CONTRIBUTING.md + - .prettierignore + - .prettierignore + - .prettierignore + - CODE_OF_CONDUCT.md + - assets/nf-core-hello_logo_light.png + - docs/images/nf-core-hello_logo_light.png + - docs/images/nf-core-hello_logo_dark.png + - .github/ISSUE_TEMPLATE/bug_report.yml + - .github/CONTRIBUTING.md + - .github/PULL_REQUEST_TEMPLATE.md + - assets/email_template.txt + - docs/README.md + - .github/ISSUE_TEMPLATE/bug_report.yml + - .github/ISSUE_TEMPLATE/config.yml + - .github/ISSUE_TEMPLATE/feature_request.yml + - .github/PULL_REQUEST_TEMPLATE.md + - .github/workflows/branch.yml + - .github/workflows/linting_comment.yml + - .github/workflows/linting.yml + - .github/CONTRIBUTING.md + - .github/.dockstore.yml + - 
.github/CONTRIBUTING.md
+    - assets/sendmail_template.txt
+    - .prettierignore
+    - LICENSE
+  nextflow_config:
+    - manifest.name
+    - manifest.homePage
+  nf_test_content: false
+  multiqc_config: false
+  files_exist:
+    - .github/workflows/nf-test.yml
+    - .github/actions/get-shards/action.yml
+    - .github/actions/nf-test/action.yml
+    - nf-test.config
+    - tests/default.nf.test
+    - assets/email_template.html
+    - assets/sendmail_template.txt
+    - assets/email_template.txt
+    - CODE_OF_CONDUCT.md
+    - assets/nf-core-hello_logo_light.png
+    - docs/images/nf-core-hello_logo_light.png
+    - docs/images/nf-core-hello_logo_dark.png
+    - .github/ISSUE_TEMPLATE/config.yml
+    - .github/workflows/awstest.yml
+    - .github/workflows/awsfulltest.yml
+    - .github/ISSUE_TEMPLATE/bug_report.yml
+    - .github/ISSUE_TEMPLATE/feature_request.yml
+    - .github/PULL_REQUEST_TEMPLATE.md
+    - .github/CONTRIBUTING.md
+    - .github/.dockstore.yml
+    - CHANGELOG.md
+    - assets/multiqc_config.yml
+    - .github/workflows/branch.yml
+    - .github/workflows/nf-test.yml
+    - .github/actions/get-shards/action.yml
+    - .github/actions/nf-test/action.yml
+    - .github/workflows/linting_comment.yml
+    - .github/workflows/linting.yml
+    - .prettierignore
+    - .prettierrc.yml
+    - conf/igenomes.config
+    - conf/igenomes_ignored.config
+    - CITATIONS.md
+    - LICENSE
+  readme:
+    - nextflow_badge
+    - nextflow_badge
+    - nfcore_template_badge
+
+template:
+  org: core
+  name: hello
+  description: A basic nf-core style version of Hello Nextflow
+  author: pinin4fjords
+  version: 1.0.0dev
+  force: true
+  outdir: .
+  skip_features:
+    - github
+    - github_badges
+    - changelog
+    - license
+    - ci
+    - nf-test
+    - igenomes
+    - multiqc
+    - fastqc
+    - seqera_platform
+    - gpu
+    - codespaces
+    - vscode
+    - code_linters
+    - citations
+    - rocrate
+    - email
+    - adaptivecard
+    - slackreport
+  is_nfcore: false
diff --git a/hello-nf-core/solutions/core-hello-part5/README.md b/hello-nf-core/solutions/core-hello-part5/README.md
new file mode 100644
index 0000000000..94844f6c5e
--- /dev/null
+++ b/hello-nf-core/solutions/core-hello-part5/README.md
@@ -0,0 +1,75 @@
+# core/hello
+
+## Introduction
+
+**core/hello** is a bioinformatics pipeline that ...
+
+
+
+
+
+## Usage
+
+> [!NOTE]
+> If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/usage/installation) on how to set-up Nextflow. Make sure to [test your setup](https://nf-co.re/docs/usage/introduction#how-to-run-a-pipeline) with `-profile test` before running the workflow on actual data.
+
+
+
+Now, you can run the pipeline using:
+
+
+
+```bash
+nextflow run core/hello \
+   -profile <docker/singularity/.../institute> \
+   --input samplesheet.csv \
+   --outdir <OUTDIR>
+```
+
+> [!WARNING]
+> Please provide pipeline parameters via the CLI or Nextflow `-params-file` option. Custom config files including those provided by the `-c` Nextflow option can be used to provide any configuration _**except for parameters**_; see [docs](https://nf-co.re/docs/usage/getting_started/configuration#custom-configuration-files).
+
+## Credits
+
+core/hello was originally written by pinin4fjords.
+
+We thank the following people for their extensive assistance in the development of this pipeline:
+
+
+
+## Contributions and Support
+
+If you would like to contribute to this pipeline, please see the [contributing guidelines](.github/CONTRIBUTING.md).
+
+## Citations
+
+
+
+
+This pipeline uses code and infrastructure developed and maintained by the [nf-core](https://nf-co.re) community, reused here under the [MIT license](https://github.com/nf-core/tools/blob/main/LICENSE).
+ +> **The nf-core framework for community-curated bioinformatics pipelines.** +> +> Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen. +> +> _Nat Biotechnol._ 2020 Feb 13. doi: [10.1038/s41587-020-0439-x](https://dx.doi.org/10.1038/s41587-020-0439-x). diff --git a/hello-nf-core/solutions/core-hello-part5/assets/greetings.csv b/hello-nf-core/solutions/core-hello-part5/assets/greetings.csv new file mode 100644 index 0000000000..f5c9849604 --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/assets/greetings.csv @@ -0,0 +1,4 @@ +greeting +Hello +Bonjour +Holà diff --git a/hello-nf-core/solutions/core-hello-part5/assets/samplesheet.csv b/hello-nf-core/solutions/core-hello-part5/assets/samplesheet.csv new file mode 100644 index 0000000000..5f653ab7bf --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/assets/samplesheet.csv @@ -0,0 +1,3 @@ +sample,fastq_1,fastq_2 +SAMPLE_PAIRED_END,/path/to/fastq/files/AEG588A1_S1_L002_R1_001.fastq.gz,/path/to/fastq/files/AEG588A1_S1_L002_R2_001.fastq.gz +SAMPLE_SINGLE_END,/path/to/fastq/files/AEG588A4_S4_L003_R1_001.fastq.gz, diff --git a/hello-nf-core/solutions/core-hello-part5/assets/schema_input.json b/hello-nf-core/solutions/core-hello-part5/assets/schema_input.json new file mode 100644 index 0000000000..75a1d06d17 --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/assets/schema_input.json @@ -0,0 +1,18 @@ +{ + "$schema": "https://json-schema.org/draft/2020-12/schema", + "$id": "https://raw.githubusercontent.com/core/hello/main/assets/schema_input.json", + "title": "core/hello pipeline - params.input schema", + "description": "Schema for the file provided with params.input", + "type": "array", + "items": { + "type": "object", + "properties": { + "greeting": { + "type": "string", + "pattern": "^\\S.*$", + "errorMessage": "Greeting must be provided and cannot be empty or start with whitespace" + } + }, + "required": ["greeting"] + } +} diff --git a/hello-nf-core/solutions/core-hello-part5/conf/base.config b/hello-nf-core/solutions/core-hello-part5/conf/base.config new file mode 100644 index 0000000000..e0fe40762f --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/conf/base.config @@ -0,0 +1,66 @@ +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + core/hello Nextflow base config file +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + A 'blank slate' config file, appropriate for general use on most high performance + compute environments. Assumes that all software is installed and available on + the PATH. Runs in `local` mode - all jobs will be run on the logged in environment. +---------------------------------------------------------------------------------------- +*/ + +process { + + // TODO nf-core: Check the defaults for all processes + cpus = { 1 * task.attempt } + memory = { 6.GB * task.attempt } + time = { 4.h * task.attempt } + + errorStrategy = { task.exitStatus in ((130..145) + 104 + 175) ? 'retry' : 'finish' } + maxRetries = 1 + maxErrors = '-1' + + // Process-specific resource requirements + // NOTE - Please try and reuse the labels below as much as possible. + // These labels are used and recognised by default in DSL2 files hosted on nf-core/modules. + // If possible, it would be nice to keep the same label naming convention when + // adding in your local modules too. 
+ // TODO nf-core: Customise requirements for specific processes. + // See https://www.nextflow.io/docs/latest/config.html#config-process-selectors + withLabel:process_single { + cpus = { 1 } + memory = { 6.GB * task.attempt } + time = { 4.h * task.attempt } + } + withLabel:process_low { + cpus = { 2 * task.attempt } + memory = { 12.GB * task.attempt } + time = { 4.h * task.attempt } + } + withLabel:process_medium { + cpus = { 6 * task.attempt } + memory = { 36.GB * task.attempt } + time = { 8.h * task.attempt } + } + withLabel:process_high { + cpus = { 12 * task.attempt } + memory = { 72.GB * task.attempt } + time = { 16.h * task.attempt } + } + withLabel:process_long { + time = { 20.h * task.attempt } + } + withLabel:process_high_memory { + memory = { 200.GB * task.attempt } + } + withLabel:error_ignore { + errorStrategy = 'ignore' + } + withLabel:error_retry { + errorStrategy = 'retry' + maxRetries = 2 + } + withLabel: process_gpu { + ext.use_gpu = { workflow.profile.contains('gpu') } + accelerator = { workflow.profile.contains('gpu') ? 1 : null } + } +} diff --git a/hello-nf-core/solutions/core-hello-part5/conf/modules.config b/hello-nf-core/solutions/core-hello-part5/conf/modules.config new file mode 100644 index 0000000000..51b19b4a1e --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/conf/modules.config @@ -0,0 +1,26 @@ +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Config file for defining DSL2 per module options and publishing paths +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Available keys to override module options: + ext.args = Additional arguments appended to command in module. + ext.args2 = Second set of arguments appended to command in module (multi-tool modules). + ext.args3 = Third set of arguments appended to command in module (multi-tool modules). + ext.prefix = File name prefix for output files. +---------------------------------------------------------------------------------------- +*/ + +process { + + publishDir = [ + path: { "${params.outdir}/${task.process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()}" }, + mode: params.publish_dir_mode, + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + + withName: 'cowpy' { + ext.args = { "-c ${params.character}" } + ext.prefix = { "cowpy-${meta.id}" } + } + +} diff --git a/hello-nf-core/solutions/core-hello-part5/conf/test.config b/hello-nf-core/solutions/core-hello-part5/conf/test.config new file mode 100644 index 0000000000..13ecf2ad4b --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/conf/test.config @@ -0,0 +1,31 @@ +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Nextflow config file for running minimal tests +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Defines input files and everything required to run a fast and simple pipeline test. 
+
+    Use as follows:
+        nextflow run core/hello -profile test,<docker/singularity> --outdir <OUTDIR>
+
+----------------------------------------------------------------------------------------
+*/
+
+process {
+    resourceLimits = [
+        cpus: 2,
+        memory: '4.GB',
+        time: '1.h'
+    ]
+}
+
+params {
+    config_profile_name        = 'Test profile'
+    config_profile_description = 'Minimal test dataset to check pipeline function'
+
+    // Input data
+    input = "${projectDir}/assets/greetings.csv"
+
+    // Other parameters
+    batch     = 'test'
+    character = 'tux'
+}
diff --git a/hello-nf-core/solutions/core-hello-part5/conf/test_full.config b/hello-nf-core/solutions/core-hello-part5/conf/test_full.config
new file mode 100644
index 0000000000..ceeaf40cac
--- /dev/null
+++ b/hello-nf-core/solutions/core-hello-part5/conf/test_full.config
@@ -0,0 +1,24 @@
+/*
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    Nextflow config file for running full-size tests
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    Defines input files and everything required to run a full size pipeline test.
+
+    Use as follows:
+        nextflow run core/hello -profile test_full,<docker/singularity> --outdir <OUTDIR>
+
+----------------------------------------------------------------------------------------
+*/
+
+params {
+    config_profile_name        = 'Full test profile'
+    config_profile_description = 'Full test dataset to check pipeline function'
+
+    // Input data for full size test
+    // TODO nf-core: Specify the paths to your full test data (on nf-core/test-datasets or directly in repositories, e.g. SRA)
+    // TODO nf-core: Give any required params for the test so that command line flags are not needed
+    input = params.pipelines_testdata_base_path + 'viralrecon/samplesheet/samplesheet_full_illumina_amplicon.csv'
+
+    // Fasta references
+    fasta = params.pipelines_testdata_base_path + 'viralrecon/genome/NC_045512.2/GCF_009858895.2_ASM985889v3_genomic.200409.fna.gz'
+}
diff --git a/hello-nf-core/solutions/core-hello-part5/docs/README.md b/hello-nf-core/solutions/core-hello-part5/docs/README.md
new file mode 100644
index 0000000000..593e4a39e8
--- /dev/null
+++ b/hello-nf-core/solutions/core-hello-part5/docs/README.md
@@ -0,0 +1,8 @@
+# core/hello: Documentation
+
+The core/hello documentation is split into the following pages:
+
+- [Usage](usage.md)
+  - An overview of how the pipeline works, how to run it and a description of all of the different command-line flags.
+- [Output](output.md)
+  - An overview of the different results produced by the pipeline and how to interpret them.
diff --git a/hello-nf-core/solutions/core-hello-part5/docs/output.md b/hello-nf-core/solutions/core-hello-part5/docs/output.md
new file mode 100644
index 0000000000..7a49820c81
--- /dev/null
+++ b/hello-nf-core/solutions/core-hello-part5/docs/output.md
@@ -0,0 +1,29 @@
+# core/hello: Output
+
+## Introduction
+
+This document describes the output produced by the pipeline.
+
+The directories listed below will be created in the results directory after the pipeline has finished. All paths are relative to the top-level results directory.
+
+
+
+## Pipeline overview
+
+The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes data using the following steps:
+
+- [Pipeline information](#pipeline-information) - Report metrics generated during the workflow execution
+
+### Pipeline information
+
+<details markdown="1">
+<summary>Output files</summary>
+
+- `pipeline_info/`
+  - Reports generated by Nextflow: `execution_report.html`, `execution_timeline.html`, `execution_trace.txt` and `pipeline_dag.dot`/`pipeline_dag.svg`.
+  - Reformatted samplesheet files used as input to the pipeline: `samplesheet.valid.csv`.
+  - Parameters used by the pipeline run: `params.json`.
+
+</details>
+ +[Nextflow](https://www.nextflow.io/docs/latest/tracing.html) provides excellent functionality for generating various reports relevant to the running and execution of the pipeline. This will allow you to troubleshoot errors with the running of the pipeline, and also provide you with other information such as launch commands, run times and resource usage. diff --git a/hello-nf-core/solutions/core-hello-part5/docs/usage.md b/hello-nf-core/solutions/core-hello-part5/docs/usage.md new file mode 100644 index 0000000000..bfbc37ab42 --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/docs/usage.md @@ -0,0 +1,211 @@ +# core/hello: Usage + +> _Documentation of pipeline parameters is generated automatically from the pipeline schema and can no longer be found in markdown files._ + +## Introduction + + + +## Samplesheet input + +You will need to create a samplesheet with information about the samples you would like to analyse before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row as shown in the examples below. + +```bash +--input '[path to samplesheet file]' +``` + +### Multiple runs of the same sample + +The `sample` identifiers have to be the same when you have re-sequenced the same sample more than once e.g. to increase sequencing depth. The pipeline will concatenate the raw reads before performing any downstream analysis. Below is an example for the same sample sequenced across 3 lanes: + +```csv title="samplesheet.csv" +sample,fastq_1,fastq_2 +CONTROL_REP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz +CONTROL_REP1,AEG588A1_S1_L003_R1_001.fastq.gz,AEG588A1_S1_L003_R2_001.fastq.gz +CONTROL_REP1,AEG588A1_S1_L004_R1_001.fastq.gz,AEG588A1_S1_L004_R2_001.fastq.gz +``` + +### Full samplesheet + +The pipeline will auto-detect whether a sample is single- or paired-end using the information provided in the samplesheet. The samplesheet can have as many columns as you desire, however, there is a strict requirement for the first 3 columns to match those defined in the table below. + +A final samplesheet file consisting of both single- and paired-end data may look something like the one below. This is for 6 samples, where `TREATMENT_REP3` has been sequenced twice. + +```csv title="samplesheet.csv" +sample,fastq_1,fastq_2 +CONTROL_REP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz +CONTROL_REP2,AEG588A2_S2_L002_R1_001.fastq.gz,AEG588A2_S2_L002_R2_001.fastq.gz +CONTROL_REP3,AEG588A3_S3_L002_R1_001.fastq.gz,AEG588A3_S3_L002_R2_001.fastq.gz +TREATMENT_REP1,AEG588A4_S4_L003_R1_001.fastq.gz, +TREATMENT_REP2,AEG588A5_S5_L003_R1_001.fastq.gz, +TREATMENT_REP3,AEG588A6_S6_L003_R1_001.fastq.gz, +TREATMENT_REP3,AEG588A6_S6_L004_R1_001.fastq.gz, +``` + +| Column | Description | +| --------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `sample` | Custom sample name. This entry will be identical for multiple sequencing libraries/runs from the same sample. Spaces in sample names are automatically converted to underscores (`_`). | +| `fastq_1` | Full path to FastQ file for Illumina short reads 1. File has to be gzipped and have the extension ".fastq.gz" or ".fq.gz". | +| `fastq_2` | Full path to FastQ file for Illumina short reads 2. File has to be gzipped and have the extension ".fastq.gz" or ".fq.gz". 
|
+
+An [example samplesheet](../assets/samplesheet.csv) has been provided with the pipeline.
+
+## Running the pipeline
+
+The typical command for running the pipeline is as follows:
+
+```bash
+nextflow run core/hello --input ./samplesheet.csv --outdir ./results -profile docker
+```
+
+This will launch the pipeline with the `docker` configuration profile. See below for more information about profiles.
+
+Note that the pipeline will create the following files in your working directory:
+
+```bash
+work            # Directory containing the nextflow working files
+<OUTDIR>        # Finished results in specified location (defined with --outdir)
+.nextflow.log   # Log file from Nextflow
+# Other nextflow hidden files, eg. history of pipeline runs and old logs.
+```
+
+If you wish to repeatedly use the same parameters for multiple runs, rather than specifying each flag in the command, you can specify these in a params file.
+
+Pipeline settings can be provided in a `yaml` or `json` file via `-params-file <file>`.
+
+> [!WARNING]
+> Do not use `-c <file>` to specify parameters as this will result in errors. Custom config files specified with `-c` must only be used for [tuning process resource specifications](https://nf-co.re/docs/usage/configuration#tuning-workflow-resources), other infrastructural tweaks (such as output directories), or module arguments (args).
+
+The above pipeline run specified with a params file in yaml format:
+
+```bash
+nextflow run core/hello -profile docker -params-file params.yaml
+```
+
+with:
+
+```yaml title="params.yaml"
+input: './samplesheet.csv'
+outdir: './results/'
+<...>
+```
+
+You can also generate such `YAML`/`JSON` files via [nf-core/launch](https://nf-co.re/launch).
+
+### Updating the pipeline
+
+When you run the above command, Nextflow automatically pulls the pipeline code from GitHub and stores it as a cached version. When running the pipeline after this, it will always use the cached version if available - even if the pipeline has been updated since. To make sure that you're running the latest version of the pipeline, make sure that you regularly update the cached version of the pipeline:
+
+```bash
+nextflow pull core/hello
+```
+
+### Reproducibility
+
+It is a good idea to specify the pipeline version when running the pipeline on your data. This ensures that a specific version of the pipeline code and software are used when you run your pipeline. If you keep using the same tag, you'll be running the same version of the pipeline, even if there have been changes to the code since.
+
+First, go to the [core/hello releases page](https://github.com/core/hello/releases) and find the latest pipeline version - numeric only (eg. `1.3.1`). Then specify this when running the pipeline with `-r` (one hyphen) - eg. `-r 1.3.1`. Of course, you can switch to another version by changing the number after the `-r` flag.
+
+This version number will be logged in reports when you run the pipeline, so that you'll know what you used when you look back in the future.
+
+To further assist in reproducibility, you can share and reuse [parameter files](#running-the-pipeline) to repeat pipeline runs with the same settings without having to write out a command with every single parameter.
+
+> [!TIP]
+> If you wish to share such a profile (such as uploading it as supplementary material for academic publications), make sure to NOT include cluster-specific paths to files, nor institution-specific profiles.
+
+## Core Nextflow arguments
+
+> [!NOTE]
+> These options are part of Nextflow and use a _single_ hyphen (pipeline parameters use a double-hyphen)
+
+### `-profile`
+
+Use this parameter to choose a configuration profile. Profiles can give configuration presets for different compute environments.
+
+Several generic profiles are bundled with the pipeline which instruct the pipeline to use software packaged using different methods (Docker, Singularity, Podman, Shifter, Charliecloud, Apptainer, Conda) - see below.
+
+> [!IMPORTANT]
+> We highly recommend the use of Docker or Singularity containers for full pipeline reproducibility, however when this is not possible, Conda is also supported.
+
+The pipeline also dynamically loads configurations from [https://github.com/nf-core/configs](https://github.com/nf-core/configs) when it runs, making multiple config profiles for various institutional clusters available at run time. For more information and to check if your system is supported, please see the [nf-core/configs documentation](https://github.com/nf-core/configs#documentation).
+
+Note that multiple profiles can be loaded, for example: `-profile test,docker` - the order of arguments is important!
+They are loaded in sequence, so later profiles can overwrite earlier profiles.
+
+If `-profile` is not specified, the pipeline will run locally and expect all software to be installed and available on the `PATH`. This is _not_ recommended, since it can lead to different results on different machines dependent on the computer environment.
+
+- `test`
+  - A profile with a complete configuration for automated testing
+  - Includes links to test data so needs no other parameters
+- `docker`
+  - A generic configuration profile to be used with [Docker](https://docker.com/)
+- `singularity`
+  - A generic configuration profile to be used with [Singularity](https://sylabs.io/docs/)
+- `podman`
+  - A generic configuration profile to be used with [Podman](https://podman.io/)
+- `shifter`
+  - A generic configuration profile to be used with [Shifter](https://nersc.gitlab.io/development/shifter/how-to-use/)
+- `charliecloud`
+  - A generic configuration profile to be used with [Charliecloud](https://charliecloud.io/)
+- `apptainer`
+  - A generic configuration profile to be used with [Apptainer](https://apptainer.org/)
+- `wave`
+  - A generic configuration profile to enable [Wave](https://seqera.io/wave/) containers. Use together with one of the above (requires Nextflow `24.03.0-edge` or later).
+- `conda`
+  - A generic configuration profile to be used with [Conda](https://conda.io/docs/). Please only use Conda as a last resort i.e. when it's not possible to run the pipeline with Docker, Singularity, Podman, Shifter, Charliecloud, or Apptainer.
+
+### `-resume`
+
+Specify this when restarting a pipeline. Nextflow will use cached results from any pipeline steps where the inputs are the same, continuing from where it got to previously. For input to be considered the same, not only the names must be identical but the files' contents as well. For more info about this parameter, see [this blog post](https://www.nextflow.io/blog/2019/demystifying-nextflow-resume.html).
+
+You can also supply a run name to resume a specific run: `-resume [run-name]`. Use the `nextflow log` command to show previous run names.
+
+### `-c`
+
+Specify the path to a specific config file (this is a core Nextflow command). See the [nf-core website documentation](https://nf-co.re/usage/configuration) for more information.
+
+## Custom configuration
+
+### Resource requests
+
+Whilst the default requirements set within the pipeline will hopefully work for most people and with most input data, you may find that you want to customise the compute resources that the pipeline requests. Each step in the pipeline has a default set of requirements for number of CPUs, memory and time. For most of the pipeline steps, if the job exits with any of the error codes specified [here](https://github.com/nf-core/rnaseq/blob/4c27ef5610c87db00c3c5a3eed10b1d161abf575/conf/base.config#L18) it will automatically be resubmitted with a higher resource request (2 x original, then 3 x original). If it still fails after the third attempt then the pipeline execution is stopped.
+
+To change the resource requests, please see the [max resources](https://nf-co.re/docs/usage/configuration#max-resources) and [tuning workflow resources](https://nf-co.re/docs/usage/configuration#tuning-workflow-resources) section of the nf-core website.
+
+### Custom Containers
+
+In some cases, you may wish to change the container or conda environment used by a pipeline step for a particular tool. By default, nf-core pipelines use containers and software from the [biocontainers](https://biocontainers.pro/) or [bioconda](https://bioconda.github.io/) projects. However, in some cases the version specified in the pipeline may be out of date.
+
+To use a different container from the default container or conda environment specified in a pipeline, please see the [updating tool versions](https://nf-co.re/docs/usage/configuration#updating-tool-versions) section of the nf-core website.
+
+### Custom Tool Arguments
+
+A pipeline might not always support every possible argument or option of a particular tool used in the pipeline. Fortunately, nf-core pipelines provide some freedom to users to insert additional parameters that the pipeline does not include by default.
+
+To learn how to provide additional arguments to a particular tool of the pipeline, please see the [customising tool arguments](https://nf-co.re/docs/usage/configuration#customising-tool-arguments) section of the nf-core website.
+
+### nf-core/configs
+
+In most cases, you will only need to create a custom config as a one-off, but if you and others within your organisation are likely to be running nf-core pipelines regularly and need to use the same settings regularly, it may be a good idea to request that your custom config file is uploaded to the `nf-core/configs` git repository. Before you do this, please test that the config file works with your pipeline of choice using the `-c` parameter. You can then create a pull request to the `nf-core/configs` repository with the addition of your config file, associated documentation file (see examples in [`nf-core/configs/docs`](https://github.com/nf-core/configs/tree/master/docs)), and amending [`nfcore_custom.config`](https://github.com/nf-core/configs/blob/master/nfcore_custom.config) to include your custom profile.
+
+See the main [Nextflow documentation](https://www.nextflow.io/docs/latest/config.html) for more information about creating your own configuration files.
+
+If you have any questions or issues please send us a message on [Slack](https://nf-co.re/join/slack) on the [`#configs` channel](https://nfcore.slack.com/channels/configs).
+
+## Running in the background
+
+Nextflow handles job submissions and supervises the running jobs. The Nextflow process must run until the pipeline is finished.
+
+The Nextflow `-bg` flag launches Nextflow in the background, detached from your terminal so that the workflow does not stop if you log out of your session. The logs are saved to a file.
+
+Alternatively, you can use `screen` / `tmux` or a similar tool to create a detached session which you can log back into at a later time.
+Some HPC setups also allow you to run Nextflow within a cluster job submitted to your job scheduler (from where it submits more jobs).
+
+## Nextflow memory requirements
+
+In some cases, the Nextflow Java virtual machines can start to request a large amount of memory.
+We recommend adding the following line to your environment to limit this (typically in `~/.bashrc` or `~/.bash_profile`):
+
+```bash
+NXF_OPTS='-Xms1g -Xmx4g'
+```
diff --git a/hello-nf-core/solutions/core-hello-part5/main.nf b/hello-nf-core/solutions/core-hello-part5/main.nf
new file mode 100644
index 0000000000..eb8d91361f
--- /dev/null
+++ b/hello-nf-core/solutions/core-hello-part5/main.nf
@@ -0,0 +1,85 @@
+#!/usr/bin/env nextflow
+/*
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    core/hello
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    Github : https://github.com/core/hello
+----------------------------------------------------------------------------------------
+*/

+/*
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    IMPORT FUNCTIONS / MODULES / SUBWORKFLOWS / WORKFLOWS
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+*/

+include { HELLO                   } from './workflows/hello'
+include { PIPELINE_INITIALISATION } from './subworkflows/local/utils_nfcore_hello_pipeline'
+include { PIPELINE_COMPLETION     } from './subworkflows/local/utils_nfcore_hello_pipeline'
+/*
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    NAMED WORKFLOWS FOR PIPELINE
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+*/

+//
+// WORKFLOW: Run main analysis pipeline depending on type of input
+//
+workflow CORE_HELLO {

+    take:
+    samplesheet // channel: samplesheet read in from --input

+    main:

+    //
+    // WORKFLOW: Run pipeline
+    //
+    HELLO (
+        samplesheet
+    )
+}
+/*
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    RUN MAIN WORKFLOW
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+*/

+workflow {

+    main:
+    //
+    // SUBWORKFLOW: Run initialisation tasks
+    //
+    PIPELINE_INITIALISATION (
+        params.version,
+        params.validate_params,
+        params.monochrome_logs,
+        args,
+        params.outdir,
+        params.input,
+        params.help,
+        params.help_full,
+        params.show_hidden
+    )

+    //
+    // WORKFLOW: Run main workflow
+    //
+    CORE_HELLO (
+        PIPELINE_INITIALISATION.out.samplesheet
+    )
+    //
+    // SUBWORKFLOW: Run completion tasks
+    //
+    PIPELINE_COMPLETION (
+        params.outdir,
+        params.monochrome_logs,
+    )
+}

+/*
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    THE END
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+*/
diff --git a/hello-nf-core/solutions/core-hello-part5/modules.json b/hello-nf-core/solutions/core-hello-part5/modules.json
new file mode 100644
index 0000000000..71a7815a6b
--- /dev/null
+++ b/hello-nf-core/solutions/core-hello-part5/modules.json
@@ -0,0 +1,36 @@
+{
+    "name": "core/hello",
+    "homePage":
"https://github.com/core/hello", + "repos": { + "https://github.com/nf-core/modules.git": { + "modules": { + "nf-core": { + "cat/cat": { + "branch": "master", + "git_sha": "41dfa3f7c0ffabb96a6a813fe321c6d1cc5b6e46", + "installed_by": ["modules"] + } + } + }, + "subworkflows": { + "nf-core": { + "utils_nextflow_pipeline": { + "branch": "master", + "git_sha": "05954dab2ff481bcb999f24455da29a5828af08d", + "installed_by": ["subworkflows"] + }, + "utils_nfcore_pipeline": { + "branch": "master", + "git_sha": "05954dab2ff481bcb999f24455da29a5828af08d", + "installed_by": ["subworkflows"] + }, + "utils_nfschema_plugin": { + "branch": "master", + "git_sha": "4b406a74dc0449c0401ed87d5bfff4252fd277fd", + "installed_by": ["subworkflows"] + } + } + } + } + } +} diff --git a/hello-nf-core/solutions/core-hello-part5/modules/local/convertToUpper.nf b/hello-nf-core/solutions/core-hello-part5/modules/local/convertToUpper.nf new file mode 100644 index 0000000000..b2689e8e9c --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/modules/local/convertToUpper.nf @@ -0,0 +1,20 @@ +#!/usr/bin/env nextflow + +/* + * Use a text replacement tool to convert the greeting to uppercase + */ +process convertToUpper { + + publishDir 'results', mode: 'copy' + + input: + path input_file + + output: + path "UPPER-${input_file}" + + script: + """ + cat '$input_file' | tr '[a-z]' '[A-Z]' > 'UPPER-${input_file}' + """ +} diff --git a/hello-nf-core/solutions/core-hello-part5/modules/local/cowpy.nf b/hello-nf-core/solutions/core-hello-part5/modules/local/cowpy.nf new file mode 100644 index 0000000000..ec00520a66 --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/modules/local/cowpy.nf @@ -0,0 +1,21 @@ +#!/usr/bin/env nextflow + +// Generate ASCII art with cowpy (https://github.com/jeffbuttars/cowpy) +process cowpy { + + container 'community.wave.seqera.io/library/cowpy:1.1.5--3db457ae1977a273' + conda 'conda-forge::cowpy==1.1.5' + + input: + tuple val(meta), path(input_file) + + output: + tuple val(meta), path("${prefix}.txt"), emit: cowpy_output + + script: + def args = task.ext.args ?: '' + prefix = task.ext.prefix ?: "${meta.id}" + """ + cat $input_file | cowpy $args > ${prefix}.txt + """ +} diff --git a/hello-nf-core/solutions/core-hello-part5/modules/local/cowpy/environment.yml b/hello-nf-core/solutions/core-hello-part5/modules/local/cowpy/environment.yml new file mode 100644 index 0000000000..32bc330d8f --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/modules/local/cowpy/environment.yml @@ -0,0 +1,10 @@ +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json +channels: + - conda-forge + - bioconda +dependencies: + # TODO nf-core: List required Conda package(s). + # Software MUST be pinned to channel (i.e. "bioconda"), version (i.e. "1.10"). + # For Conda, the build (i.e. "h9402c20_2") must be EXCLUDED to support installation on different operating systems. + - "YOUR-TOOL-HERE" diff --git a/hello-nf-core/solutions/core-hello-part5/modules/local/cowpy/main.nf b/hello-nf-core/solutions/core-hello-part5/modules/local/cowpy/main.nf new file mode 100644 index 0000000000..1b30633471 --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/modules/local/cowpy/main.nf @@ -0,0 +1,47 @@ + + +process COWPY { + tag "$meta.id" + label 'process_single' + + conda "${moduleDir}/environment.yml" + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
+        'https://depot.galaxyproject.org/singularity/YOUR-TOOL-HERE':
+        'biocontainers/YOUR-TOOL-HERE' }"
+
+    input:
+    tuple val(meta), path(input)
+
+    output:
+    tuple val(meta), path("*"), emit: output
+    path "versions.yml", emit: versions
+
+    when:
+    task.ext.when == null || task.ext.when
+
+    script:
+    def args = task.ext.args ?: ''
+    def prefix = task.ext.prefix ?: "${meta.id}"
+
+    """
+
+
+    cat <<-END_VERSIONS > versions.yml
+    "${task.process}":
+        cowpy: \$(cowpy --version)
+    END_VERSIONS
+    """
+
+    stub:
+    def args = task.ext.args ?: ''
+    def prefix = task.ext.prefix ?: "${meta.id}"
+
+    """
+    echo $args
+
+
+    cat <<-END_VERSIONS > versions.yml
+    "${task.process}":
+        cowpy: \$(cowpy --version)
+    END_VERSIONS
+    """
+}
diff --git a/hello-nf-core/solutions/core-hello-part5/modules/local/cowpy/meta.yml b/hello-nf-core/solutions/core-hello-part5/modules/local/cowpy/meta.yml
new file mode 100644
index 0000000000..616fdd9422
--- /dev/null
+++ b/hello-nf-core/solutions/core-hello-part5/modules/local/cowpy/meta.yml
@@ -0,0 +1,51 @@
+# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/meta-schema.json
+name: "cowpy"
+description: write your description here
+keywords:
+  - sort
+  - example
+  - genomics
+tools:
+  - "cowpy":
+      description: ""
+      homepage: ""
+      documentation: ""
+      tool_dev_url: ""
+      doi: ""
+      licence: null
+      identifier: null
+
+input:
+  - - meta:
+        type: map
+        description: Groovy Map containing sample information. e.g. `[ id:'sample1' ]`
+    - input:
+        type: file
+        description: ""
+        pattern: ""
+        ontologies:
+          - edam: ""
+output:
+  output:
+    - - meta:
+          type: map
+          description: Groovy Map containing sample information. e.g. `[ id:'sample1' ]`
+      - "*":
+          type: file
+          description: ""
+          pattern: ""
+          ontologies:
+            - edam: ""
+  versions:
+    - versions.yml:
+        type: file
+        description: File containing software versions
+        pattern: versions.yml
+        ontologies:
+          - edam: http://edamontology.org/format_3750 # YAML
+authors:
+  - "@example"
+maintainers:
+  - "@example"
diff --git a/hello-nf-core/solutions/core-hello-part5/modules/local/cowpy/tests/main.nf.test b/hello-nf-core/solutions/core-hello-part5/modules/local/cowpy/tests/main.nf.test
new file mode 100644
index 0000000000..e3fcf90a5d
--- /dev/null
+++ b/hello-nf-core/solutions/core-hello-part5/modules/local/cowpy/tests/main.nf.test
@@ -0,0 +1,78 @@
+// TODO nf-core: Once you have added the required tests, please run the following command to build this file:
+// nf-core modules test cowpy
+nextflow_process {
+
+    name "Test Process COWPY"
+    script "../main.nf"
+    process "COWPY"
+
+    tag "modules"
+    tag "modules_"
+    tag "cowpy"
+
+    // TODO nf-core: Change the test name preferably indicating the test-data and file-format used
+    test("sarscov2 - bam") {
+
+        // TODO nf-core: If you are created a test for a chained module
+        // (the module requires running more than one process to generate the required output)
+        // add the 'setup' method here.
+        // You can find more information about how to use a 'setup' method in the docs (https://nf-co.re/docs/contributing/modules#steps-for-creating-nf-test-for-chained-modules).
+
+        when {
+            process {
+                """
+                // TODO nf-core: define inputs of the process here.
Example: + + input[0] = [ + [ id:'test' ], + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + ] + """ + } + } + + then { + assert process.success + assertAll( + { assert snapshot( + process.out, + path(process.out.versions[0]).yaml + ).match() } + //TODO nf-core: Add all required assertions to verify the test output. + // See https://nf-co.re/docs/contributing/tutorials/nf-test_assertions for more information and examples. + ) + } + + } + + // TODO nf-core: Change the test name preferably indicating the test-data and file-format used but keep the " - stub" suffix. + test("sarscov2 - bam - stub") { + + options "-stub" + + when { + process { + """ + // TODO nf-core: define inputs of the process here. Example: + + input[0] = [ + [ id:'test' ], + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + ] + """ + } + } + + then { + assert process.success + assertAll( + { assert snapshot( + process.out, + path(process.out.versions[0]).yaml + ).match() } + ) + } + + } + +} diff --git a/hello-nf-core/solutions/core-hello-part5/modules/local/sayHello.nf b/hello-nf-core/solutions/core-hello-part5/modules/local/sayHello.nf new file mode 100644 index 0000000000..6005ad54c9 --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/modules/local/sayHello.nf @@ -0,0 +1,20 @@ +#!/usr/bin/env nextflow + +/* + * Use echo to print 'Hello World!' to a file + */ +process sayHello { + + publishDir 'results', mode: 'copy' + + input: + val greeting + + output: + path "${greeting}-output.txt" + + script: + """ + echo '$greeting' > '$greeting-output.txt' + """ +} diff --git a/hello-nf-core/solutions/core-hello-part5/modules/nf-core/cat/cat/environment.yml b/hello-nf-core/solutions/core-hello-part5/modules/nf-core/cat/cat/environment.yml new file mode 100644 index 0000000000..50c2059afb --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/modules/nf-core/cat/cat/environment.yml @@ -0,0 +1,7 @@ +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json +channels: + - conda-forge + - bioconda +dependencies: + - conda-forge::pigz=2.3.4 diff --git a/hello-nf-core/solutions/core-hello-part5/modules/nf-core/cat/cat/main.nf b/hello-nf-core/solutions/core-hello-part5/modules/nf-core/cat/cat/main.nf new file mode 100644 index 0000000000..2862c64cd9 --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/modules/nf-core/cat/cat/main.nf @@ -0,0 +1,78 @@ +process CAT_CAT { + tag "$meta.id" + label 'process_low' + + conda "${moduleDir}/environment.yml" + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
+ 'https://depot.galaxyproject.org/singularity/pigz:2.3.4' : + 'biocontainers/pigz:2.3.4' }" + + input: + tuple val(meta), path(files_in) + + output: + tuple val(meta), path("${prefix}"), emit: file_out + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def args2 = task.ext.args2 ?: '' + def file_list = files_in.collect { it.toString() } + + // choose appropriate concatenation tool depending on input and output format + + // | input | output | command1 | command2 | + // |-----------|------------|----------|----------| + // | gzipped | gzipped | cat | | + // | ungzipped | ungzipped | cat | | + // | gzipped | ungzipped | zcat | | + // | ungzipped | gzipped | cat | pigz | + + // Use input file ending as default + prefix = task.ext.prefix ?: "${meta.id}${getFileSuffix(file_list[0])}" + out_zip = prefix.endsWith('.gz') + in_zip = file_list[0].endsWith('.gz') + command1 = (in_zip && !out_zip) ? 'zcat' : 'cat' + command2 = (!in_zip && out_zip) ? "| pigz -c -p $task.cpus $args2" : '' + if(file_list.contains(prefix.trim())) { + error "The name of the input file can't be the same as for the output prefix in the " + + "module CAT_CAT (currently `$prefix`). Please choose a different one." + } + """ + $command1 \\ + $args \\ + ${file_list.join(' ')} \\ + $command2 \\ + > ${prefix} + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + pigz: \$( pigz --version 2>&1 | sed 's/pigz //g' ) + END_VERSIONS + """ + + stub: + def file_list = files_in.collect { it.toString() } + prefix = task.ext.prefix ?: "${meta.id}${file_list[0].substring(file_list[0].lastIndexOf('.'))}" + if(file_list.contains(prefix.trim())) { + error "The name of the input file can't be the same as for the output prefix in the " + + "module CAT_CAT (currently `$prefix`). Please choose a different one." + } + """ + touch $prefix + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + pigz: \$( pigz --version 2>&1 | sed 's/pigz //g' ) + END_VERSIONS + """ +} + +// for .gz files also include the second to last extension if it is present. E.g., .fasta.gz +def getFileSuffix(filename) { + def match = filename =~ /^.*?((\.\w{1,5})?(\.\w{1,5}\.gz$))/ + return match ? match[0][1] : filename.substring(filename.lastIndexOf('.')) +} diff --git a/hello-nf-core/solutions/core-hello-part5/modules/nf-core/cat/cat/meta.yml b/hello-nf-core/solutions/core-hello-part5/modules/nf-core/cat/cat/meta.yml new file mode 100644 index 0000000000..2a9284d7f1 --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/modules/nf-core/cat/cat/meta.yml @@ -0,0 +1,46 @@ +name: cat_cat +description: A module for concatenation of gzipped or uncompressed files +keywords: + - concatenate + - gzip + - cat +tools: + - cat: + description: Just concatenation + documentation: https://man7.org/linux/man-pages/man1/cat.1.html + licence: ["GPL-3.0-or-later"] + identifier: "" +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - files_in: + type: file + description: List of compressed / uncompressed files + pattern: "*" + ontologies: [] +output: + file_out: + - - meta: + type: map + description: Groovy Map containing sample information + - ${prefix}: + type: file + description: Concatenated file. 
Will be gzipped if file_out ends with ".gz" + pattern: "${file_out}" + ontologies: [] + versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" + ontologies: + - edam: http://edamontology.org/format_3750 # YAML +authors: + - "@erikrikarddaniel" + - "@FriederikeHanssen" +maintainers: + - "@erikrikarddaniel" + - "@FriederikeHanssen" diff --git a/hello-nf-core/solutions/core-hello-part5/modules/nf-core/cat/cat/tests/main.nf.test b/hello-nf-core/solutions/core-hello-part5/modules/nf-core/cat/cat/tests/main.nf.test new file mode 100644 index 0000000000..9cb1617883 --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/modules/nf-core/cat/cat/tests/main.nf.test @@ -0,0 +1,191 @@ +nextflow_process { + + name "Test Process CAT_CAT" + script "../main.nf" + process "CAT_CAT" + tag "modules" + tag "modules_nfcore" + tag "cat" + tag "cat/cat" + + test("test_cat_name_conflict") { + when { + params { + outdir = "${outputDir}" + } + process { + """ + input[0] = + [ + [ id:'genome', single_end:true ], + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.sizes', checkIfExists: true) + ] + ] + """ + } + } + then { + assertAll( + { assert !process.success }, + { assert process.stdout.toString().contains("The name of the input file can't be the same as for the output prefix") }, + { assert snapshot(process.out.versions).match() } + ) + } + } + + test("test_cat_unzipped_unzipped") { + when { + params { + outdir = "${outputDir}" + } + process { + """ + input[0] = + [ + [ id:'test', single_end:true ], + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.sizes', checkIfExists: true) + ] + ] + """ + } + } + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + + test("test_cat_zipped_zipped") { + when { + params { + outdir = "${outputDir}" + } + process { + """ + input[0] = + [ + [ id:'test', single_end:true ], + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.gff3.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/alignment/last/contigs.genome.maf.gz', checkIfExists: true) + ] + ] + """ + } + } + then { + def lines = path(process.out.file_out.get(0).get(1)).linesGzip + assertAll( + { assert process.success }, + { assert snapshot( + lines[0..5], + lines.size(), + process.out.versions + ).match() + } + ) + } + } + + test("test_cat_zipped_unzipped") { + config './nextflow_zipped_unzipped.config' + + when { + params { + outdir = "${outputDir}" + } + process { + """ + input[0] = + [ + [ id:'test', single_end:true ], + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.gff3.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/alignment/last/contigs.genome.maf.gz', checkIfExists: true) + ] + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("test_cat_unzipped_zipped") { + config './nextflow_unzipped_zipped.config' + when { + params { + outdir = "${outputDir}" + } + process { + """ + input[0] = + [ + [ id:'test', single_end:true ], + [ + file(params.modules_testdata_base_path + 
'genomics/sarscov2/genome/genome.fasta', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.sizes', checkIfExists: true) + ] + ] + """ + } + } + then { + def lines = path(process.out.file_out.get(0).get(1)).linesGzip + assertAll( + { assert process.success }, + { assert snapshot( + lines[0..5], + lines.size(), + process.out.versions + ).match() + } + ) + } + } + + test("test_cat_one_file_unzipped_zipped") { + config './nextflow_unzipped_zipped.config' + when { + params { + outdir = "${outputDir}" + } + process { + """ + input[0] = + [ + [ id:'test', single_end:true ], + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + ] + """ + } + } + then { + def lines = path(process.out.file_out.get(0).get(1)).linesGzip + assertAll( + { assert process.success }, + { assert snapshot( + lines[0..5], + lines.size(), + process.out.versions + ).match() + } + ) + } + } +} diff --git a/hello-nf-core/solutions/core-hello-part5/modules/nf-core/cat/cat/tests/main.nf.test.snap b/hello-nf-core/solutions/core-hello-part5/modules/nf-core/cat/cat/tests/main.nf.test.snap new file mode 100644 index 0000000000..e2381ca20b --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/modules/nf-core/cat/cat/tests/main.nf.test.snap @@ -0,0 +1,147 @@ +{ + "test_cat_unzipped_unzipped": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fasta:md5,f44b33a0e441ad58b2d3700270e2dbe2" + ] + ], + "1": [ + "versions.yml:md5,115ed6177ebcff24eb99d503fa5ef894" + ], + "file_out": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fasta:md5,f44b33a0e441ad58b2d3700270e2dbe2" + ] + ], + "versions": [ + "versions.yml:md5,115ed6177ebcff24eb99d503fa5ef894" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.3" + }, + "timestamp": "2023-10-16T14:32:18.500464399" + }, + "test_cat_zipped_unzipped": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": true + }, + "cat.txt:md5,c439d3b60e7bc03e8802a451a0d9a5d9" + ] + ], + "1": [ + "versions.yml:md5,115ed6177ebcff24eb99d503fa5ef894" + ], + "file_out": [ + [ + { + "id": "test", + "single_end": true + }, + "cat.txt:md5,c439d3b60e7bc03e8802a451a0d9a5d9" + ] + ], + "versions": [ + "versions.yml:md5,115ed6177ebcff24eb99d503fa5ef894" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.3" + }, + "timestamp": "2023-10-16T14:32:49.642741302" + }, + "test_cat_zipped_zipped": { + "content": [ + [ + "MT192765.1\tGenbank\ttranscript\t259\t29667\t.\t+\t.\tID=unknown_transcript_1;geneID=orf1ab;gene_name=orf1ab", + "MT192765.1\tGenbank\tgene\t259\t21548\t.\t+\t.\tParent=unknown_transcript_1", + "MT192765.1\tGenbank\tCDS\t259\t13461\t.\t+\t0\tParent=unknown_transcript_1;exception=\"ribosomal slippage\";gbkey=CDS;gene=orf1ab;note=\"pp1ab;translated=by -1 ribosomal frameshift\";product=\"orf1ab polyprotein\";protein_id=QIK50426.1", + "MT192765.1\tGenbank\tCDS\t13461\t21548\t.\t+\t0\tParent=unknown_transcript_1;exception=\"ribosomal slippage\";gbkey=CDS;gene=orf1ab;note=\"pp1ab;translated=by -1 ribosomal frameshift\";product=\"orf1ab polyprotein\";protein_id=QIK50426.1", + "MT192765.1\tGenbank\tCDS\t21556\t25377\t.\t+\t0\tParent=unknown_transcript_1;gbkey=CDS;gene=S;note=\"structural protein\";product=\"surface glycoprotein\";protein_id=QIK50427.1", + "MT192765.1\tGenbank\tgene\t21556\t25377\t.\t+\t.\tParent=unknown_transcript_1" + ], + 78, + [ + "versions.yml:md5,115ed6177ebcff24eb99d503fa5ef894" + ] + ], + "meta": { 
+ "nf-test": "0.8.4", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-22T11:51:46.802978" + }, + "test_cat_name_conflict": { + "content": [ + [ + + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-22T11:51:29.45394" + }, + "test_cat_one_file_unzipped_zipped": { + "content": [ + [ + ">MT192765.1 Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/PC00101P/2020, complete genome", + "GTTTATACCTTCCCAGGTAACAAACCAACCAACTTTCGATCTCTTGTAGATCTGTTCTCTAAACGAACTTTAAAATCTGT", + "GTGGCTGTCACTCGGCTGCATGCTTAGTGCACTCACGCAGTATAATTAATAACTAATTACTGTCGTTGACAGGACACGAG", + "TAACTCGTCTATCTTCTGCAGGCTGCTTACGGTTTCGTCCGTGTTGCAGCCGATCATCAGCACATCTAGGTTTTGTCCGG", + "GTGTGACCGAAAGGTAAGATGGAGAGCCTTGTCCCTGGTTTCAACGAGAAAACACACGTCCAACTCAGTTTGCCTGTTTT", + "ACAGGTTCGCGACGTGCTCGTACGTGGCTTTGGAGACTCCGTGGAGGAGGTCTTATCAGAGGCACGTCAACATCTTAAAG" + ], + 374, + [ + "versions.yml:md5,115ed6177ebcff24eb99d503fa5ef894" + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-22T11:52:02.774016" + }, + "test_cat_unzipped_zipped": { + "content": [ + [ + ">MT192765.1 Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/PC00101P/2020, complete genome", + "GTTTATACCTTCCCAGGTAACAAACCAACCAACTTTCGATCTCTTGTAGATCTGTTCTCTAAACGAACTTTAAAATCTGT", + "GTGGCTGTCACTCGGCTGCATGCTTAGTGCACTCACGCAGTATAATTAATAACTAATTACTGTCGTTGACAGGACACGAG", + "TAACTCGTCTATCTTCTGCAGGCTGCTTACGGTTTCGTCCGTGTTGCAGCCGATCATCAGCACATCTAGGTTTTGTCCGG", + "GTGTGACCGAAAGGTAAGATGGAGAGCCTTGTCCCTGGTTTCAACGAGAAAACACACGTCCAACTCAGTTTGCCTGTTTT", + "ACAGGTTCGCGACGTGCTCGTACGTGGCTTTGGAGACTCCGTGGAGGAGGTCTTATCAGAGGCACGTCAACATCTTAAAG" + ], + 375, + [ + "versions.yml:md5,115ed6177ebcff24eb99d503fa5ef894" + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-22T11:51:57.581523" + } +} diff --git a/hello-nf-core/solutions/core-hello-part5/modules/nf-core/cat/cat/tests/nextflow_unzipped_zipped.config b/hello-nf-core/solutions/core-hello-part5/modules/nf-core/cat/cat/tests/nextflow_unzipped_zipped.config new file mode 100644 index 0000000000..ec26b0fdc6 --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/modules/nf-core/cat/cat/tests/nextflow_unzipped_zipped.config @@ -0,0 +1,6 @@ + +process { + withName: CAT_CAT { + ext.prefix = 'cat.txt.gz' + } +} diff --git a/hello-nf-core/solutions/core-hello-part5/modules/nf-core/cat/cat/tests/nextflow_zipped_unzipped.config b/hello-nf-core/solutions/core-hello-part5/modules/nf-core/cat/cat/tests/nextflow_zipped_unzipped.config new file mode 100644 index 0000000000..fbc79783d5 --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/modules/nf-core/cat/cat/tests/nextflow_zipped_unzipped.config @@ -0,0 +1,8 @@ + +process { + + withName: CAT_CAT { + ext.prefix = 'cat.txt' + } + +} diff --git a/hello-nf-core/solutions/core-hello-part5/nextflow.config b/hello-nf-core/solutions/core-hello-part5/nextflow.config new file mode 100644 index 0000000000..b59ba2175d --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/nextflow.config @@ -0,0 +1,252 @@ +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + core/hello Nextflow config file +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Default config options for all compute environments +---------------------------------------------------------------------------------------- +*/ + +// Global default params, used in configs +params { + + // TODO nf-core: 
Specify your pipeline's command line flags + // Input options + input = null + + // Boilerplate options + outdir = null + publish_dir_mode = 'copy' + monochrome_logs = false + help = false + help_full = false + show_hidden = false + version = false + pipelines_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/' + trace_report_suffix = new java.util.Date().format( 'yyyy-MM-dd_HH-mm-ss') + + // Config options + config_profile_name = null + config_profile_description = null + + custom_config_version = 'master' + custom_config_base = "https://raw.githubusercontent.com/nf-core/configs/${params.custom_config_version}" + config_profile_contact = null + config_profile_url = null + + // Schema validation default options + validate_params = true +} + +// Load base.config by default for all pipelines +includeConfig 'conf/base.config' + +profiles { + debug { + dumpHashes = true + process.beforeScript = 'echo $HOSTNAME' + cleanup = false + nextflow.enable.configProcessNamesValidation = true + } + conda { + conda.enabled = true + docker.enabled = false + singularity.enabled = false + podman.enabled = false + shifter.enabled = false + charliecloud.enabled = false + conda.channels = ['conda-forge', 'bioconda'] + apptainer.enabled = false + } + mamba { + conda.enabled = true + conda.useMamba = true + docker.enabled = false + singularity.enabled = false + podman.enabled = false + shifter.enabled = false + charliecloud.enabled = false + apptainer.enabled = false + } + docker { + docker.enabled = true + conda.enabled = false + singularity.enabled = false + podman.enabled = false + shifter.enabled = false + charliecloud.enabled = false + apptainer.enabled = false + docker.runOptions = '-u $(id -u):$(id -g)' + } + arm64 { + process.arch = 'arm64' + // TODO https://github.com/nf-core/modules/issues/6694 + // For now if you're using arm64 you have to use wave for the sake of the maintainers + // wave profile + apptainer.ociAutoPull = true + singularity.ociAutoPull = true + wave.enabled = true + wave.freeze = true + wave.strategy = 'conda,container' + } + emulate_amd64 { + docker.runOptions = '-u $(id -u):$(id -g) --platform=linux/amd64' + } + singularity { + singularity.enabled = true + singularity.autoMounts = true + conda.enabled = false + docker.enabled = false + podman.enabled = false + shifter.enabled = false + charliecloud.enabled = false + apptainer.enabled = false + } + podman { + podman.enabled = true + conda.enabled = false + docker.enabled = false + singularity.enabled = false + shifter.enabled = false + charliecloud.enabled = false + apptainer.enabled = false + } + shifter { + shifter.enabled = true + conda.enabled = false + docker.enabled = false + singularity.enabled = false + podman.enabled = false + charliecloud.enabled = false + apptainer.enabled = false + } + charliecloud { + charliecloud.enabled = true + conda.enabled = false + docker.enabled = false + singularity.enabled = false + podman.enabled = false + shifter.enabled = false + apptainer.enabled = false + } + apptainer { + apptainer.enabled = true + apptainer.autoMounts = true + conda.enabled = false + docker.enabled = false + singularity.enabled = false + podman.enabled = false + shifter.enabled = false + charliecloud.enabled = false + } + wave { + apptainer.ociAutoPull = true + singularity.ociAutoPull = true + wave.enabled = true + wave.freeze = true + wave.strategy = 'conda,container' + } + gpu { + docker.runOptions = '-u $(id -u):$(id -g) --gpus all' + apptainer.runOptions = '--nv' + singularity.runOptions = 
'--nv' + } + test { includeConfig 'conf/test.config' } + test_full { includeConfig 'conf/test_full.config' } +} +// Load nf-core custom profiles from different institutions + +// If params.custom_config_base is set AND either the NXF_OFFLINE environment variable is not set or params.custom_config_base is a local path, the nfcore_custom.config file from the specified base path is included. +// Load core/hello custom profiles from different institutions. +includeConfig params.custom_config_base && (!System.getenv('NXF_OFFLINE') || !params.custom_config_base.startsWith('http')) ? "${params.custom_config_base}/nfcore_custom.config" : "/dev/null" + + +// Load core/hello custom profiles from different institutions. +// TODO nf-core: Optionally, you can add a pipeline-specific nf-core config at https://github.com/nf-core/configs +// includeConfig params.custom_config_base && (!System.getenv('NXF_OFFLINE') || !params.custom_config_base.startsWith('http')) ? "${params.custom_config_base}/pipeline/hello.config" : "/dev/null" + +// Set default registry for Apptainer, Docker, Podman, Charliecloud and Singularity independent of -profile +// Will not be used unless Apptainer / Docker / Podman / Charliecloud / Singularity are enabled +// Set to your registry if you have a mirror of containers +apptainer.registry = 'quay.io' +docker.registry = 'quay.io' +podman.registry = 'quay.io' +singularity.registry = 'quay.io' +charliecloud.registry = 'quay.io' + + + +// Export these variables to prevent local Python/R libraries from conflicting with those in the container +// The JULIA depot path has been adjusted to a fixed path `/usr/local/share/julia` that needs to be used for packages in the container. +// See https://apeltzer.github.io/post/03-julia-lang-nextflow/ for details on that. Once we have a common agreement on where to keep Julia packages, this is adjustable. + +env { + PYTHONNOUSERSITE = 1 + R_PROFILE_USER = "/.Rprofile" + R_ENVIRON_USER = "/.Renviron" + JULIA_DEPOT_PATH = "/usr/local/share/julia" +} + +// Set bash options +process.shell = [ + "bash", + "-C", // No clobber - prevent output redirection from overwriting files. + "-e", // Exit if a tool returns a non-zero status/exit code + "-u", // Treat unset variables and parameters as an error + "-o", // Returns the status of the last command to exit.. + "pipefail" // ..with a non-zero status or zero if all successfully execute +] + +// Disable process selector warnings by default. Use debug profile to enable warnings. +nextflow.enable.configProcessNamesValidation = false + +timeline { + enabled = true + file = "${params.outdir}/pipeline_info/execution_timeline_${params.trace_report_suffix}.html" +} +report { + enabled = true + file = "${params.outdir}/pipeline_info/execution_report_${params.trace_report_suffix}.html" +} +trace { + enabled = true + file = "${params.outdir}/pipeline_info/execution_trace_${params.trace_report_suffix}.txt" +} +dag { + enabled = true + file = "${params.outdir}/pipeline_info/pipeline_dag_${params.trace_report_suffix}.html" +} + +manifest { + name = 'core/hello' + contributors = [ + // TODO nf-core: Update the field with the details of the contributors to your pipeline. 
New with Nextflow version 24.10.0 + [ + name: 'pinin4fjords', + affiliation: '', + email: '', + github: '', + contribution: [], // List of contribution types ('author', 'maintainer' or 'contributor') + orcid: '' + ], + ] + homePage = 'https://github.com/core/hello' + description = """A basic nf-core style version of Hello Nextflow""" + mainScript = 'main.nf' + defaultBranch = 'main' + nextflowVersion = '!>=25.04.0' + version = '1.0.0dev' + doi = '' +} + +// Nextflow plugins +plugins { + id 'nf-schema@2.5.1' // Validation of pipeline parameters and creation of an input channel from a sample sheet +} + +validation { + defaultIgnoreParams = ["genomes"] + monochromeLogs = params.monochrome_logs +} + +// Load modules.config for DSL2 module specific options +includeConfig 'conf/modules.config' diff --git a/hello-nf-core/solutions/core-hello-part5/nextflow_schema.json b/hello-nf-core/solutions/core-hello-part5/nextflow_schema.json new file mode 100644 index 0000000000..5dcce69565 --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/nextflow_schema.json @@ -0,0 +1,168 @@ +{ + "$schema": "https://json-schema.org/draft/2020-12/schema", + "$id": "https://raw.githubusercontent.com/core/hello/main/nextflow_schema.json", + "title": "core/hello pipeline parameters", + "description": "A basic nf-core style version of Hello Nextflow", + "type": "object", + "$defs": { + "input_output_options": { + "title": "Input/output options", + "type": "object", + "fa_icon": "fas fa-terminal", + "description": "Define where the pipeline should find input data and save output data.", + "required": ["input", "outdir", "batch"], + "properties": { + "input": { + "type": "string", + "format": "file-path", + "exists": true, + "schema": "assets/schema_input.json", + "mimetype": "text/csv", + "pattern": "^\\S+\\.csv$", + "description": "Path to comma-separated file containing information about the samples in the experiment.", + "help_text": "You will need to create a design file with information about the samples in your experiment before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row.", + "fa_icon": "fas fa-file-csv" + }, + "batch": { + "type": "string", + "description": "Name for this batch of greetings", + "fa_icon": "fas fa-layer-group" + }, + "outdir": { + "type": "string", + "format": "directory-path", + "description": "The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure.", + "fa_icon": "fas fa-folder-open" + } + } + }, + "institutional_config_options": { + "title": "Institutional config options", + "type": "object", + "fa_icon": "fas fa-university", + "description": "Parameters used to describe centralised config profiles. These should not be edited.", + "help_text": "The centralised nf-core configuration profiles use a handful of pipeline parameters to describe themselves. This information is then printed to the Nextflow log when you run a pipeline. 
You should not need to change these values when you run a pipeline.", + "properties": { + "custom_config_version": { + "type": "string", + "description": "Git commit id for Institutional configs.", + "default": "master", + "hidden": true, + "fa_icon": "fas fa-users-cog" + }, + "custom_config_base": { + "type": "string", + "description": "Base directory for Institutional configs.", + "default": "https://raw.githubusercontent.com/nf-core/configs/master", + "hidden": true, + "help_text": "If you're running offline, Nextflow will not be able to fetch the institutional config files from the internet. If you don't need them, then this is not a problem. If you do need them, you should download the files from the repo and tell Nextflow where to find them with this parameter.", + "fa_icon": "fas fa-users-cog" + }, + "config_profile_name": { + "type": "string", + "description": "Institutional config name.", + "hidden": true, + "fa_icon": "fas fa-users-cog" + }, + "config_profile_description": { + "type": "string", + "description": "Institutional config description.", + "hidden": true, + "fa_icon": "fas fa-users-cog" + }, + "config_profile_contact": { + "type": "string", + "description": "Institutional config contact information.", + "hidden": true, + "fa_icon": "fas fa-users-cog" + }, + "config_profile_url": { + "type": "string", + "description": "Institutional config URL link.", + "hidden": true, + "fa_icon": "fas fa-users-cog" + } + } + }, + "generic_options": { + "title": "Generic options", + "type": "object", + "fa_icon": "fas fa-file-import", + "description": "Less common options for the pipeline, typically set in a config file.", + "help_text": "These options are common to all nf-core pipelines and allow you to customise some of the core preferences for how the pipeline runs.\n\nTypically these options would be set in a Nextflow config file loaded for all pipeline runs, such as `~/.nextflow/config`.", + "properties": { + "version": { + "type": "boolean", + "description": "Display version and exit.", + "fa_icon": "fas fa-question-circle", + "hidden": true + }, + "publish_dir_mode": { + "type": "string", + "default": "copy", + "description": "Method used to save pipeline results to output directory.", + "help_text": "The Nextflow `publishDir` option specifies which intermediate files should be saved to the output directory. This option tells the pipeline what method should be used to move these files. See [Nextflow docs](https://www.nextflow.io/docs/latest/process.html#publishdir) for details.", + "fa_icon": "fas fa-copy", + "enum": [ + "symlink", + "rellink", + "link", + "copy", + "copyNoFollow", + "move" + ], + "hidden": true + }, + "monochrome_logs": { + "type": "boolean", + "description": "Do not use coloured log outputs.", + "fa_icon": "fas fa-palette", + "hidden": true + }, + "validate_params": { + "type": "boolean", + "description": "Boolean whether to validate parameters against the schema at runtime", + "default": true, + "fa_icon": "fas fa-check-square", + "hidden": true + }, + "pipelines_testdata_base_path": { + "type": "string", + "fa_icon": "far fa-check-circle", + "description": "Base URL or local path to location of pipeline test dataset files", + "default": "https://raw.githubusercontent.com/nf-core/test-datasets/", + "hidden": true + }, + "trace_report_suffix": { + "type": "string", + "fa_icon": "far calendar", + "description": "Suffix to add to the trace report filename. 
Default is the date and time in the format yyyy-MM-dd_HH-mm-ss.",
+                    "hidden": true
+                },
+                "help": {
+                    "type": ["boolean", "string"],
+                    "description": "Display the help message."
+                },
+                "help_full": {
+                    "type": "boolean",
+                    "description": "Display the full detailed help message."
+                },
+                "show_hidden": {
+                    "type": "boolean",
+                    "description": "Display hidden parameters in the help message (only works when --help or --help_full are provided)."
+                }
+            }
+        }
+    },
+    "allOf": [
+        {
+            "$ref": "#/$defs/input_output_options"
+        },
+        {
+            "$ref": "#/$defs/institutional_config_options"
+        },
+        {
+            "$ref": "#/$defs/generic_options"
+        }
+    ]
+}
diff --git a/hello-nf-core/solutions/core-hello-part5/subworkflows/local/utils_nfcore_hello_pipeline/main.nf b/hello-nf-core/solutions/core-hello-part5/subworkflows/local/utils_nfcore_hello_pipeline/main.nf
new file mode 100644
index 0000000000..93c9f874cc
--- /dev/null
+++ b/hello-nf-core/solutions/core-hello-part5/subworkflows/local/utils_nfcore_hello_pipeline/main.nf
@@ -0,0 +1,136 @@
+//
+// Subworkflow with functionality specific to the core/hello pipeline
+//
+
+/*
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    IMPORT FUNCTIONS / MODULES / SUBWORKFLOWS
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+*/
+
+include { UTILS_NFSCHEMA_PLUGIN } from '../../nf-core/utils_nfschema_plugin'
+include { paramsSummaryMap } from 'plugin/nf-schema'
+include { samplesheetToList } from 'plugin/nf-schema'
+include { paramsHelp } from 'plugin/nf-schema'
+include { completionSummary } from '../../nf-core/utils_nfcore_pipeline'
+include { UTILS_NFCORE_PIPELINE } from '../../nf-core/utils_nfcore_pipeline'
+include { UTILS_NEXTFLOW_PIPELINE } from '../../nf-core/utils_nextflow_pipeline'
+
+/*
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    SUBWORKFLOW TO INITIALISE PIPELINE
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+*/
+
+workflow PIPELINE_INITIALISATION {
+
+    take:
+    version // boolean: Display version and exit
+    validate_params // boolean: Boolean whether to validate parameters against the schema at runtime
+    monochrome_logs // boolean: Do not use coloured log outputs
+    nextflow_cli_args // array: List of positional nextflow CLI args
+    outdir // string: The output directory where the results will be saved
+    input // string: Path to input samplesheet
+    help // boolean: Display help message and exit
+    help_full // boolean: Show the full help message
+    show_hidden // boolean: Show hidden parameters in the help message
+
+    main:
+
+    ch_versions = Channel.empty()
+
+    //
+    // Print version and exit if required and dump pipeline parameters to JSON file
+    //
+    UTILS_NEXTFLOW_PIPELINE (
+        version,
+        true,
+        outdir,
+        workflow.profile.tokenize(',').intersect(['conda', 'mamba']).size() >= 1
+    )
+
+    //
+    // Validate parameters and generate parameter summary to stdout
+    //
+    command = "nextflow run ${workflow.manifest.name} -profile <docker/singularity/.../institute> --input samplesheet.csv --outdir <OUTDIR>"
+
+    UTILS_NFSCHEMA_PLUGIN (
+        workflow,
+        validate_params,
+        null,
+        help,
+        help_full,
+        show_hidden,
+        "",
+        "",
+        command
+    )
+
+    //
+    // Check config provided to the pipeline
+    //
+    UTILS_NFCORE_PIPELINE (
+        nextflow_cli_args
+    )
+
+    //
+    // Create channel from input file provided through params.input
+    //
+
+    ch_samplesheet = channel.fromPath(params.input)
+        .splitCsv()
+        .map { line -> line[0] }
+
+    emit:
+    samplesheet = ch_samplesheet
+    versions =
ch_versions +} + +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + SUBWORKFLOW FOR PIPELINE COMPLETION +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +*/ + +workflow PIPELINE_COMPLETION { + + take: + outdir // path: Path to output directory where results will be published + monochrome_logs // boolean: Disable ANSI colour codes in log output + + main: + summary_params = paramsSummaryMap(workflow, parameters_schema: "nextflow_schema.json") + + // + // Completion email and summary + // + workflow.onComplete { + + completionSummary(monochrome_logs) + } + + workflow.onError { + log.error "Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting" + } +} + +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + FUNCTIONS +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +*/ + +// +// Validate channels from input samplesheet +// +def validateInputSamplesheet(input) { + def (metas, fastqs) = input[1..2] + + // Check that multiple runs of the same sample are of the same datatype i.e. single-end / paired-end + def endedness_ok = metas.collect{ meta -> meta.single_end }.unique().size == 1 + if (!endedness_ok) { + error("Please check input samplesheet -> Multiple runs of a sample must be of the same datatype i.e. single-end or paired-end: ${metas[0].id}") + } + + return [ metas[0], fastqs ] +} diff --git a/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nextflow_pipeline/main.nf b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nextflow_pipeline/main.nf new file mode 100644 index 0000000000..d6e593e852 --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nextflow_pipeline/main.nf @@ -0,0 +1,126 @@ +// +// Subworkflow with functionality that may be useful for any Nextflow pipeline +// + +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + SUBWORKFLOW DEFINITION +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +*/ + +workflow UTILS_NEXTFLOW_PIPELINE { + take: + print_version // boolean: print version + dump_parameters // boolean: dump parameters + outdir // path: base directory used to publish pipeline results + check_conda_channels // boolean: check conda channels + + main: + + // + // Print workflow version and exit on --version + // + if (print_version) { + log.info("${workflow.manifest.name} ${getWorkflowVersion()}") + System.exit(0) + } + + // + // Dump pipeline parameters to a JSON file + // + if (dump_parameters && outdir) { + dumpParametersToJSON(outdir) + } + + // + // When running with Conda, warn if channels have not been set-up appropriately + // + if (check_conda_channels) { + checkCondaChannels() + } + + emit: + dummy_emit = true +} + +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + FUNCTIONS +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +*/ + +// +// Generate version string +// +def getWorkflowVersion() { + def version_string = "" as String + if (workflow.manifest.version) { + def prefix_v = workflow.manifest.version[0] != 'v' ? 
'v' : '' + version_string += "${prefix_v}${workflow.manifest.version}" + } + + if (workflow.commitId) { + def git_shortsha = workflow.commitId.substring(0, 7) + version_string += "-g${git_shortsha}" + } + + return version_string +} + +// +// Dump pipeline parameters to a JSON file +// +def dumpParametersToJSON(outdir) { + def timestamp = new java.util.Date().format('yyyy-MM-dd_HH-mm-ss') + def filename = "params_${timestamp}.json" + def temp_pf = new File(workflow.launchDir.toString(), ".${filename}") + def jsonStr = groovy.json.JsonOutput.toJson(params) + temp_pf.text = groovy.json.JsonOutput.prettyPrint(jsonStr) + + nextflow.extension.FilesEx.copyTo(temp_pf.toPath(), "${outdir}/pipeline_info/params_${timestamp}.json") + temp_pf.delete() +} + +// +// When running with -profile conda, warn if channels have not been set-up appropriately +// +def checkCondaChannels() { + def parser = new org.yaml.snakeyaml.Yaml() + def channels = [] + try { + def config = parser.load("conda config --show channels".execute().text) + channels = config.channels + } + catch (NullPointerException e) { + log.debug(e) + log.warn("Could not verify conda channel configuration.") + return null + } + catch (IOException e) { + log.debug(e) + log.warn("Could not verify conda channel configuration.") + return null + } + + // Check that all channels are present + // This channel list is ordered by required channel priority. + def required_channels_in_order = ['conda-forge', 'bioconda'] + def channels_missing = ((required_channels_in_order as Set) - (channels as Set)) as Boolean + + // Check that they are in the right order + def channel_priority_violation = required_channels_in_order != channels.findAll { ch -> ch in required_channels_in_order } + + if (channels_missing | channel_priority_violation) { + log.warn """\ + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + There is a problem with your Conda configuration! + You will need to set-up the conda-forge and bioconda channels correctly. + Please refer to https://bioconda.github.io/ + The observed channel order is + ${channels} + but the following channel order is required: + ${required_channels_in_order} + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~" + """.stripIndent(true) + } +} diff --git a/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nextflow_pipeline/meta.yml b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nextflow_pipeline/meta.yml new file mode 100644 index 0000000000..e5c3a0a828 --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nextflow_pipeline/meta.yml @@ -0,0 +1,38 @@ +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/subworkflows/yaml-schema.json +name: "UTILS_NEXTFLOW_PIPELINE" +description: Subworkflow with functionality that may be useful for any Nextflow pipeline +keywords: + - utility + - pipeline + - initialise + - version +components: [] +input: + - print_version: + type: boolean + description: | + Print the version of the pipeline and exit + - dump_parameters: + type: boolean + description: | + Dump the parameters of the pipeline to a JSON file + - output_directory: + type: directory + description: Path to output dir to write JSON file to. + pattern: "results/" + - check_conda_channel: + type: boolean + description: | + Check if the conda channel priority is correct. 
+output: + - dummy_emit: + type: boolean + description: | + Dummy emit to make nf-core subworkflows lint happy +authors: + - "@adamrtalbot" + - "@drpatelh" +maintainers: + - "@adamrtalbot" + - "@drpatelh" + - "@maxulysse" diff --git a/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nextflow_pipeline/tests/main.function.nf.test b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nextflow_pipeline/tests/main.function.nf.test new file mode 100644 index 0000000000..68718e4f59 --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nextflow_pipeline/tests/main.function.nf.test @@ -0,0 +1,54 @@ + +nextflow_function { + + name "Test Functions" + script "subworkflows/nf-core/utils_nextflow_pipeline/main.nf" + config "subworkflows/nf-core/utils_nextflow_pipeline/tests/nextflow.config" + tag 'subworkflows' + tag 'utils_nextflow_pipeline' + tag 'subworkflows/utils_nextflow_pipeline' + + test("Test Function getWorkflowVersion") { + + function "getWorkflowVersion" + + then { + assertAll( + { assert function.success }, + { assert snapshot(function.result).match() } + ) + } + } + + test("Test Function dumpParametersToJSON") { + + function "dumpParametersToJSON" + + when { + function { + """ + // define inputs of the function here. Example: + input[0] = "$outputDir" + """.stripIndent() + } + } + + then { + assertAll( + { assert function.success } + ) + } + } + + test("Test Function checkCondaChannels") { + + function "checkCondaChannels" + + then { + assertAll( + { assert function.success }, + { assert snapshot(function.result).match() } + ) + } + } +} diff --git a/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nextflow_pipeline/tests/main.function.nf.test.snap b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nextflow_pipeline/tests/main.function.nf.test.snap new file mode 100644 index 0000000000..846287c417 --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nextflow_pipeline/tests/main.function.nf.test.snap @@ -0,0 +1,20 @@ +{ + "Test Function getWorkflowVersion": { + "content": [ + "v9.9.9" + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "23.10.1" + }, + "timestamp": "2024-02-28T12:02:05.308243" + }, + "Test Function checkCondaChannels": { + "content": null, + "meta": { + "nf-test": "0.8.4", + "nextflow": "23.10.1" + }, + "timestamp": "2024-02-28T12:02:12.425833" + } +} diff --git a/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nextflow_pipeline/tests/main.workflow.nf.test b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nextflow_pipeline/tests/main.workflow.nf.test new file mode 100644 index 0000000000..02dbf094cd --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nextflow_pipeline/tests/main.workflow.nf.test @@ -0,0 +1,113 @@ +nextflow_workflow { + + name "Test Workflow UTILS_NEXTFLOW_PIPELINE" + script "../main.nf" + config "subworkflows/nf-core/utils_nextflow_pipeline/tests/nextflow.config" + workflow "UTILS_NEXTFLOW_PIPELINE" + tag 'subworkflows' + tag 'utils_nextflow_pipeline' + tag 'subworkflows/utils_nextflow_pipeline' + + test("Should run no inputs") { + + when { + workflow { + """ + print_version = false + dump_parameters = false + outdir = null + check_conda_channels = false + + input[0] = print_version + input[1] = dump_parameters + input[2] = outdir + input[3] = check_conda_channels + """ + } + } + + then { + assertAll( + { assert workflow.success } + ) + } + } 
+ + test("Should print version") { + + when { + workflow { + """ + print_version = true + dump_parameters = false + outdir = null + check_conda_channels = false + + input[0] = print_version + input[1] = dump_parameters + input[2] = outdir + input[3] = check_conda_channels + """ + } + } + + then { + expect { + with(workflow) { + assert success + assert "nextflow_workflow v9.9.9" in stdout + } + } + } + } + + test("Should dump params") { + + when { + workflow { + """ + print_version = false + dump_parameters = true + outdir = 'results' + check_conda_channels = false + + input[0] = false + input[1] = true + input[2] = outdir + input[3] = false + """ + } + } + + then { + assertAll( + { assert workflow.success } + ) + } + } + + test("Should not create params JSON if no output directory") { + + when { + workflow { + """ + print_version = false + dump_parameters = true + outdir = null + check_conda_channels = false + + input[0] = false + input[1] = true + input[2] = outdir + input[3] = false + """ + } + } + + then { + assertAll( + { assert workflow.success } + ) + } + } +} diff --git a/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nextflow_pipeline/tests/nextflow.config b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nextflow_pipeline/tests/nextflow.config new file mode 100644 index 0000000000..a09572e5bb --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nextflow_pipeline/tests/nextflow.config @@ -0,0 +1,9 @@ +manifest { + name = 'nextflow_workflow' + author = """nf-core""" + homePage = 'https://127.0.0.1' + description = """Dummy pipeline""" + nextflowVersion = '!>=23.04.0' + version = '9.9.9' + doi = 'https://doi.org/10.5281/zenodo.5070524' +} diff --git a/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfcore_pipeline/main.nf b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfcore_pipeline/main.nf new file mode 100644 index 0000000000..bfd258760d --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfcore_pipeline/main.nf @@ -0,0 +1,419 @@ +// +// Subworkflow with utility functions specific to the nf-core pipeline template +// + +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + SUBWORKFLOW DEFINITION +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +*/ + +workflow UTILS_NFCORE_PIPELINE { + take: + nextflow_cli_args + + main: + valid_config = checkConfigProvided() + checkProfileProvided(nextflow_cli_args) + + emit: + valid_config +} + +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + FUNCTIONS +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +*/ + +// +// Warn if a -profile or Nextflow config has not been provided to run the pipeline +// +def checkConfigProvided() { + def valid_config = true as Boolean + if (workflow.profile == 'standard' && workflow.configFiles.size() <= 1) { + log.warn( + "[${workflow.manifest.name}] You are attempting to run the pipeline without any custom configuration!\n\n" + "This will be dependent on your local compute environment but can be achieved via one or more of the following:\n" + " (1) Using an existing pipeline profile e.g. `-profile docker` or `-profile singularity`\n" + " (2) Using an existing nf-core/configs for your Institution e.g. `-profile crick` or `-profile uppmax`\n" + " (3) Using your own local custom config e.g. 
`-c /path/to/your/custom.config`\n\n" + "Please refer to the quick start section and usage docs for the pipeline.\n "
+        )
+        valid_config = false
+    }
+    return valid_config
+}
+
+//
+// Exit pipeline if --profile contains spaces
+//
+def checkProfileProvided(nextflow_cli_args) {
+    if (workflow.profile.endsWith(',')) {
+        error(
+            "The `-profile` option cannot end with a trailing comma, please remove it and re-run the pipeline!\n" + "HINT: A common mistake is to provide multiple values separated by spaces e.g. `-profile test, docker`.\n"
+        )
+    }
+    if (nextflow_cli_args[0]) {
+        log.warn(
+            "nf-core pipelines do not accept positional arguments. The positional argument `${nextflow_cli_args[0]}` has been detected.\n" + "HINT: A common mistake is to provide multiple values separated by spaces e.g. `-profile test, docker`.\n"
+        )
+    }
+}
+
+//
+// Generate workflow version string
+//
+def getWorkflowVersion() {
+    def version_string = "" as String
+    if (workflow.manifest.version) {
+        def prefix_v = workflow.manifest.version[0] != 'v' ? 'v' : ''
+        version_string += "${prefix_v}${workflow.manifest.version}"
+    }
+
+    if (workflow.commitId) {
+        def git_shortsha = workflow.commitId.substring(0, 7)
+        version_string += "-g${git_shortsha}"
+    }
+
+    return version_string
+}
+
+//
+// Get software versions for pipeline
+//
+def processVersionsFromYAML(yaml_file) {
+    def yaml = new org.yaml.snakeyaml.Yaml()
+    def versions = yaml.load(yaml_file).collectEntries { k, v -> [k.tokenize(':')[-1], v] }
+    return yaml.dumpAsMap(versions).trim()
+}
+
+//
+// Get workflow version for pipeline
+//
+def workflowVersionToYAML() {
+    return """
+    Workflow:
+        ${workflow.manifest.name}: ${getWorkflowVersion()}
+        Nextflow: ${workflow.nextflow.version}
+    """.stripIndent().trim()
+}
+
+//
+// Get channel of software versions used in pipeline in YAML format
+//
+def softwareVersionsToYAML(ch_versions) {
+    return ch_versions.unique().map { version -> processVersionsFromYAML(version) }.unique().mix(Channel.of(workflowVersionToYAML()))
+}
+
+//
+// Get workflow summary for MultiQC
+//
+def paramsSummaryMultiqc(summary_params) {
+    def summary_section = ''
+    summary_params
+        .keySet()
+        .each { group ->
+            def group_params = summary_params.get(group)
+            // This gets the parameters of that particular group
+            if (group_params) {
+                summary_section += "    <p style=\"font-size:110%\"><b>${group}</b></p>\n"
+                summary_section += "    <dl class=\"dl-horizontal\">\n"
+                group_params
+                    .keySet()
+                    .sort()
+                    .each { param ->
+                        summary_section += "        <dt>${param}</dt><dd><samp>${group_params.get(param) ?: '<span style=\"color:#999999;\">N/A</span>'}</samp></dd>\n"
+                    }
+                summary_section += "    </dl>\n"
+            }
+        }
+
+    def yaml_file_text = "id: '${workflow.manifest.name.replace('/', '-')}-summary'\n" as String
+    yaml_file_text += "description: ' - this information is collected when the pipeline is started.'\n"
+    yaml_file_text += "section_name: '${workflow.manifest.name} Workflow Summary'\n"
+    yaml_file_text += "section_href: 'https://github.com/${workflow.manifest.name}'\n"
+    yaml_file_text += "plot_type: 'html'\n"
+    yaml_file_text += "data: |\n"
+    yaml_file_text += "${summary_section}"
+
+    return yaml_file_text
+}
+
+//
+// ANSII colours used for terminal logging
+//
+def logColours(monochrome_logs=true) {
+    def colorcodes = [:] as Map
+
+    // Reset / Meta
+    colorcodes['reset'] = monochrome_logs ? '' : "\033[0m"
+    colorcodes['bold'] = monochrome_logs ? '' : "\033[1m"
+    colorcodes['dim'] = monochrome_logs ? '' : "\033[2m"
+    colorcodes['underlined'] = monochrome_logs ? '' : "\033[4m"
+    colorcodes['blink'] = monochrome_logs ? '' : "\033[5m"
+    colorcodes['reverse'] = monochrome_logs ? '' : "\033[7m"
+    colorcodes['hidden'] = monochrome_logs ? '' : "\033[8m"
+
+    // Regular Colors
+    colorcodes['black'] = monochrome_logs ? '' : "\033[0;30m"
+    colorcodes['red'] = monochrome_logs ? '' : "\033[0;31m"
+    colorcodes['green'] = monochrome_logs ? '' : "\033[0;32m"
+    colorcodes['yellow'] = monochrome_logs ? '' : "\033[0;33m"
+    colorcodes['blue'] = monochrome_logs ? '' : "\033[0;34m"
+    colorcodes['purple'] = monochrome_logs ? '' : "\033[0;35m"
+    colorcodes['cyan'] = monochrome_logs ? '' : "\033[0;36m"
+    colorcodes['white'] = monochrome_logs ? '' : "\033[0;37m"
+
+    // Bold
+    colorcodes['bblack'] = monochrome_logs ? '' : "\033[1;30m"
+    colorcodes['bred'] = monochrome_logs ? '' : "\033[1;31m"
+    colorcodes['bgreen'] = monochrome_logs ? '' : "\033[1;32m"
+    colorcodes['byellow'] = monochrome_logs ? '' : "\033[1;33m"
+    colorcodes['bblue'] = monochrome_logs ? '' : "\033[1;34m"
+    colorcodes['bpurple'] = monochrome_logs ? '' : "\033[1;35m"
+    colorcodes['bcyan'] = monochrome_logs ? '' : "\033[1;36m"
+    colorcodes['bwhite'] = monochrome_logs ? '' : "\033[1;37m"
+
+    // Underline
+    colorcodes['ublack'] = monochrome_logs ? '' : "\033[4;30m"
+    colorcodes['ured'] = monochrome_logs ? '' : "\033[4;31m"
+    colorcodes['ugreen'] = monochrome_logs ? '' : "\033[4;32m"
+    colorcodes['uyellow'] = monochrome_logs ? '' : "\033[4;33m"
+    colorcodes['ublue'] = monochrome_logs ? '' : "\033[4;34m"
+    colorcodes['upurple'] = monochrome_logs ? '' : "\033[4;35m"
+    colorcodes['ucyan'] = monochrome_logs ? '' : "\033[4;36m"
+    colorcodes['uwhite'] = monochrome_logs ? '' : "\033[4;37m"
+
+    // High Intensity
+    colorcodes['iblack'] = monochrome_logs ? '' : "\033[0;90m"
+    colorcodes['ired'] = monochrome_logs ? '' : "\033[0;91m"
+    colorcodes['igreen'] = monochrome_logs ? '' : "\033[0;92m"
+    colorcodes['iyellow'] = monochrome_logs ? '' : "\033[0;93m"
+    colorcodes['iblue'] = monochrome_logs ? '' : "\033[0;94m"
+    colorcodes['ipurple'] = monochrome_logs ? '' : "\033[0;95m"
+    colorcodes['icyan'] = monochrome_logs ? '' : "\033[0;96m"
+    colorcodes['iwhite'] = monochrome_logs ? '' : "\033[0;97m"
+
+    // Bold High Intensity
+    colorcodes['biblack'] = monochrome_logs ? '' : "\033[1;90m"
+    colorcodes['bired'] = monochrome_logs ? '' : "\033[1;91m"
+    colorcodes['bigreen'] = monochrome_logs ? '' : "\033[1;92m"
+    colorcodes['biyellow'] = monochrome_logs ? '' : "\033[1;93m"
+    colorcodes['biblue'] = monochrome_logs ? '' : "\033[1;94m"
+    colorcodes['bipurple'] = monochrome_logs ? '' : "\033[1;95m"
+    colorcodes['bicyan'] = monochrome_logs ?
'' : "\033[1;96m" + colorcodes['biwhite'] = monochrome_logs ? '' : "\033[1;97m" + + return colorcodes +} + +// Return a single report from an object that may be a Path or List +// +def getSingleReport(multiqc_reports) { + if (multiqc_reports instanceof Path) { + return multiqc_reports + } else if (multiqc_reports instanceof List) { + if (multiqc_reports.size() == 0) { + log.warn("[${workflow.manifest.name}] No reports found from process 'MULTIQC'") + return null + } else if (multiqc_reports.size() == 1) { + return multiqc_reports.first() + } else { + log.warn("[${workflow.manifest.name}] Found multiple reports from process 'MULTIQC', will use only one") + return multiqc_reports.first() + } + } else { + return null + } +} + +// +// Construct and send completion email +// +def completionEmail(summary_params, email, email_on_fail, plaintext_email, outdir, monochrome_logs=true, multiqc_report=null) { + + // Set up the e-mail variables + def subject = "[${workflow.manifest.name}] Successful: ${workflow.runName}" + if (!workflow.success) { + subject = "[${workflow.manifest.name}] FAILED: ${workflow.runName}" + } + + def summary = [:] + summary_params + .keySet() + .sort() + .each { group -> + summary << summary_params[group] + } + + def misc_fields = [:] + misc_fields['Date Started'] = workflow.start + misc_fields['Date Completed'] = workflow.complete + misc_fields['Pipeline script file path'] = workflow.scriptFile + misc_fields['Pipeline script hash ID'] = workflow.scriptId + if (workflow.repository) { + misc_fields['Pipeline repository Git URL'] = workflow.repository + } + if (workflow.commitId) { + misc_fields['Pipeline repository Git Commit'] = workflow.commitId + } + if (workflow.revision) { + misc_fields['Pipeline Git branch/tag'] = workflow.revision + } + misc_fields['Nextflow Version'] = workflow.nextflow.version + misc_fields['Nextflow Build'] = workflow.nextflow.build + misc_fields['Nextflow Compile Timestamp'] = workflow.nextflow.timestamp + + def email_fields = [:] + email_fields['version'] = getWorkflowVersion() + email_fields['runName'] = workflow.runName + email_fields['success'] = workflow.success + email_fields['dateComplete'] = workflow.complete + email_fields['duration'] = workflow.duration + email_fields['exitStatus'] = workflow.exitStatus + email_fields['errorMessage'] = (workflow.errorMessage ?: 'None') + email_fields['errorReport'] = (workflow.errorReport ?: 'None') + email_fields['commandLine'] = workflow.commandLine + email_fields['projectDir'] = workflow.projectDir + email_fields['summary'] = summary << misc_fields + + // On success try attach the multiqc report + def mqc_report = getSingleReport(multiqc_report) + + // Check if we are only sending emails on failure + def email_address = email + if (!email && email_on_fail && !workflow.success) { + email_address = email_on_fail + } + + // Render the TXT template + def engine = new groovy.text.GStringTemplateEngine() + def tf = new File("${workflow.projectDir}/assets/email_template.txt") + def txt_template = engine.createTemplate(tf).make(email_fields) + def email_txt = txt_template.toString() + + // Render the HTML template + def hf = new File("${workflow.projectDir}/assets/email_template.html") + def html_template = engine.createTemplate(hf).make(email_fields) + def email_html = html_template.toString() + + // Render the sendmail template + def max_multiqc_email_size = (params.containsKey('max_multiqc_email_size') ? 
params.max_multiqc_email_size : 0) as MemoryUnit + def smail_fields = [email: email_address, subject: subject, email_txt: email_txt, email_html: email_html, projectDir: "${workflow.projectDir}", mqcFile: mqc_report, mqcMaxSize: max_multiqc_email_size.toBytes()] + def sf = new File("${workflow.projectDir}/assets/sendmail_template.txt") + def sendmail_template = engine.createTemplate(sf).make(smail_fields) + def sendmail_html = sendmail_template.toString() + + // Send the HTML e-mail + def colors = logColours(monochrome_logs) as Map + if (email_address) { + try { + if (plaintext_email) { + // Throw so the catch block below falls back to sending via 'mail' + throw new org.codehaus.groovy.GroovyException('Send plaintext e-mail, not HTML') + } + // Try to send HTML e-mail using sendmail + def sendmail_tf = new File(workflow.launchDir.toString(), ".sendmail_tmp.html") + sendmail_tf.withWriter { w -> w << sendmail_html } + ['sendmail', '-t'].execute() << sendmail_html + log.info("-${colors.purple}[${workflow.manifest.name}]${colors.green} Sent summary e-mail to ${email_address} (sendmail)-") + } + catch (Exception msg) { + log.debug(msg.toString()) + log.debug("Trying with mail instead of sendmail") + // Catch failures and try with plaintext + def mail_cmd = ['mail', '-s', subject, '--content-type=text/html', email_address] + mail_cmd.execute() << email_html + log.info("-${colors.purple}[${workflow.manifest.name}]${colors.green} Sent summary e-mail to ${email_address} (mail)-") + } + } + + // Write summary e-mail HTML to a file + def output_hf = new File(workflow.launchDir.toString(), ".pipeline_report.html") + output_hf.withWriter { w -> w << email_html } + nextflow.extension.FilesEx.copyTo(output_hf.toPath(), "${outdir}/pipeline_info/pipeline_report.html") + output_hf.delete() + + // Write summary e-mail TXT to a file + def output_tf = new File(workflow.launchDir.toString(), ".pipeline_report.txt") + output_tf.withWriter { w -> w << email_txt } + nextflow.extension.FilesEx.copyTo(output_tf.toPath(), "${outdir}/pipeline_info/pipeline_report.txt") + output_tf.delete() +} + +// +// Print pipeline summary on completion +// +def completionSummary(monochrome_logs=true) { + def colors = logColours(monochrome_logs) as Map + if (workflow.success) { + if (workflow.stats.ignoredCount == 0) { + log.info("-${colors.purple}[${workflow.manifest.name}]${colors.green} Pipeline completed successfully${colors.reset}-") + } + else { + log.info("-${colors.purple}[${workflow.manifest.name}]${colors.yellow} Pipeline completed successfully, but with errored process(es) ${colors.reset}-") + } + } + else { + log.info("-${colors.purple}[${workflow.manifest.name}]${colors.red} Pipeline completed with errors${colors.reset}-") + } +} + +// +// Construct and send a notification to a web server as JSON e.g. 
Microsoft Teams and Slack +// +def imNotification(summary_params, hook_url) { + def summary = [:] + summary_params + .keySet() + .sort() + .each { group -> + summary << summary_params[group] + } + + def misc_fields = [:] + misc_fields['start'] = workflow.start + misc_fields['complete'] = workflow.complete + misc_fields['scriptfile'] = workflow.scriptFile + misc_fields['scriptid'] = workflow.scriptId + if (workflow.repository) { + misc_fields['repository'] = workflow.repository + } + if (workflow.commitId) { + misc_fields['commitid'] = workflow.commitId + } + if (workflow.revision) { + misc_fields['revision'] = workflow.revision + } + misc_fields['nxf_version'] = workflow.nextflow.version + misc_fields['nxf_build'] = workflow.nextflow.build + misc_fields['nxf_timestamp'] = workflow.nextflow.timestamp + + def msg_fields = [:] + msg_fields['version'] = getWorkflowVersion() + msg_fields['runName'] = workflow.runName + msg_fields['success'] = workflow.success + msg_fields['dateComplete'] = workflow.complete + msg_fields['duration'] = workflow.duration + msg_fields['exitStatus'] = workflow.exitStatus + msg_fields['errorMessage'] = (workflow.errorMessage ?: 'None') + msg_fields['errorReport'] = (workflow.errorReport ?: 'None') + msg_fields['commandLine'] = workflow.commandLine.replaceFirst(/ +--hook_url +[^ ]+/, "") + msg_fields['projectDir'] = workflow.projectDir + msg_fields['summary'] = summary << misc_fields + + // Render the JSON template + def engine = new groovy.text.GStringTemplateEngine() + // Different JSON depending on the service provider + // Defaults to "Adaptive Cards" (https://adaptivecards.io), except Slack which has its own format + def json_path = hook_url.contains("hooks.slack.com") ? "slackreport.json" : "adaptivecard.json" + def hf = new File("${workflow.projectDir}/assets/${json_path}") + def json_template = engine.createTemplate(hf).make(msg_fields) + def json_message = json_template.toString() + + // POST + def post = new URL(hook_url).openConnection() + post.setRequestMethod("POST") + post.setDoOutput(true) + post.setRequestProperty("Content-Type", "application/json") + post.getOutputStream().write(json_message.getBytes("UTF-8")) + def postRC = post.getResponseCode() + if (!postRC.equals(200)) { + log.warn(post.getErrorStream().getText()) + } +} diff --git a/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfcore_pipeline/meta.yml b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfcore_pipeline/meta.yml new file mode 100644 index 0000000000..d08d24342d --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfcore_pipeline/meta.yml @@ -0,0 +1,24 @@ +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/subworkflows/yaml-schema.json +name: "UTILS_NFCORE_PIPELINE" +description: Subworkflow with utility functions specific to the nf-core pipeline template +keywords: + - utility + - pipeline + - initialise + - version +components: [] +input: + - nextflow_cli_args: + type: list + description: | + Nextflow CLI positional arguments +output: + - success: + type: boolean + description: | + Dummy output to indicate success +authors: + - "@adamrtalbot" +maintainers: + - "@adamrtalbot" + - "@maxulysse" diff --git a/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfcore_pipeline/tests/main.function.nf.test b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfcore_pipeline/tests/main.function.nf.test new file mode 100644 index 
0000000000..f117040cbd --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfcore_pipeline/tests/main.function.nf.test @@ -0,0 +1,126 @@ + +nextflow_function { + + name "Test Functions" + script "../main.nf" + config "subworkflows/nf-core/utils_nfcore_pipeline/tests/nextflow.config" + tag "subworkflows" + tag "subworkflows_nfcore" + tag "utils_nfcore_pipeline" + tag "subworkflows/utils_nfcore_pipeline" + + test("Test Function checkConfigProvided") { + + function "checkConfigProvided" + + then { + assertAll( + { assert function.success }, + { assert snapshot(function.result).match() } + ) + } + } + + test("Test Function checkProfileProvided") { + + function "checkProfileProvided" + + when { + function { + """ + input[0] = [] + """ + } + } + + then { + assertAll( + { assert function.success }, + { assert snapshot(function.result).match() } + ) + } + } + + test("Test Function without logColours") { + + function "logColours" + + when { + function { + """ + input[0] = true + """ + } + } + + then { + assertAll( + { assert function.success }, + { assert snapshot(function.result).match() } + ) + } + } + + test("Test Function with logColours") { + function "logColours" + + when { + function { + """ + input[0] = false + """ + } + } + + then { + assertAll( + { assert function.success }, + { assert snapshot(function.result).match() } + ) + } + } + + test("Test Function getSingleReport with a single file") { + function "getSingleReport" + + when { + function { + """ + input[0] = file(params.modules_testdata_base_path + '/generic/tsv/test.tsv', checkIfExists: true) + """ + } + } + + then { + assertAll( + { assert function.success }, + { assert function.result.contains("test.tsv") } + ) + } + } + + test("Test Function getSingleReport with multiple files") { + function "getSingleReport" + + when { + function { + """ + input[0] = [ + file(params.modules_testdata_base_path + '/generic/tsv/test.tsv', checkIfExists: true), + file(params.modules_testdata_base_path + '/generic/tsv/network.tsv', checkIfExists: true), + file(params.modules_testdata_base_path + '/generic/tsv/expression.tsv', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert function.success }, + { assert function.result.contains("test.tsv") }, + { assert !function.result.contains("network.tsv") }, + { assert !function.result.contains("expression.tsv") } + ) + } + } +} diff --git a/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfcore_pipeline/tests/main.function.nf.test.snap b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfcore_pipeline/tests/main.function.nf.test.snap new file mode 100644 index 0000000000..b13b311213 --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfcore_pipeline/tests/main.function.nf.test.snap @@ -0,0 +1,136 @@ +{ + "Test Function checkProfileProvided": { + "content": null, + "meta": { + "nf-test": "0.8.4", + "nextflow": "23.10.1" + }, + "timestamp": "2024-02-28T12:03:03.360873" + }, + "Test Function checkConfigProvided": { + "content": [ + true + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "23.10.1" + }, + "timestamp": "2024-02-28T12:02:59.729647" + }, + "Test Function without logColours": { + "content": [ + { + "reset": "", + "bold": "", + "dim": "", + "underlined": "", + "blink": "", + "reverse": "", + "hidden": "", + "black": "", + "red": "", + "green": "", + "yellow": "", + "blue": "", + "purple": "", + "cyan": "", + "white": "", + "bblack": "", + "bred": "", + "bgreen": "", 
+ "byellow": "", + "bblue": "", + "bpurple": "", + "bcyan": "", + "bwhite": "", + "ublack": "", + "ured": "", + "ugreen": "", + "uyellow": "", + "ublue": "", + "upurple": "", + "ucyan": "", + "uwhite": "", + "iblack": "", + "ired": "", + "igreen": "", + "iyellow": "", + "iblue": "", + "ipurple": "", + "icyan": "", + "iwhite": "", + "biblack": "", + "bired": "", + "bigreen": "", + "biyellow": "", + "biblue": "", + "bipurple": "", + "bicyan": "", + "biwhite": "" + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "23.10.1" + }, + "timestamp": "2024-02-28T12:03:17.969323" + }, + "Test Function with logColours": { + "content": [ + { + "reset": "\u001b[0m", + "bold": "\u001b[1m", + "dim": "\u001b[2m", + "underlined": "\u001b[4m", + "blink": "\u001b[5m", + "reverse": "\u001b[7m", + "hidden": "\u001b[8m", + "black": "\u001b[0;30m", + "red": "\u001b[0;31m", + "green": "\u001b[0;32m", + "yellow": "\u001b[0;33m", + "blue": "\u001b[0;34m", + "purple": "\u001b[0;35m", + "cyan": "\u001b[0;36m", + "white": "\u001b[0;37m", + "bblack": "\u001b[1;30m", + "bred": "\u001b[1;31m", + "bgreen": "\u001b[1;32m", + "byellow": "\u001b[1;33m", + "bblue": "\u001b[1;34m", + "bpurple": "\u001b[1;35m", + "bcyan": "\u001b[1;36m", + "bwhite": "\u001b[1;37m", + "ublack": "\u001b[4;30m", + "ured": "\u001b[4;31m", + "ugreen": "\u001b[4;32m", + "uyellow": "\u001b[4;33m", + "ublue": "\u001b[4;34m", + "upurple": "\u001b[4;35m", + "ucyan": "\u001b[4;36m", + "uwhite": "\u001b[4;37m", + "iblack": "\u001b[0;90m", + "ired": "\u001b[0;91m", + "igreen": "\u001b[0;92m", + "iyellow": "\u001b[0;93m", + "iblue": "\u001b[0;94m", + "ipurple": "\u001b[0;95m", + "icyan": "\u001b[0;96m", + "iwhite": "\u001b[0;97m", + "biblack": "\u001b[1;90m", + "bired": "\u001b[1;91m", + "bigreen": "\u001b[1;92m", + "biyellow": "\u001b[1;93m", + "biblue": "\u001b[1;94m", + "bipurple": "\u001b[1;95m", + "bicyan": "\u001b[1;96m", + "biwhite": "\u001b[1;97m" + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "23.10.1" + }, + "timestamp": "2024-02-28T12:03:21.714424" + } +} diff --git a/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfcore_pipeline/tests/main.workflow.nf.test b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfcore_pipeline/tests/main.workflow.nf.test new file mode 100644 index 0000000000..8940d32d1e --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfcore_pipeline/tests/main.workflow.nf.test @@ -0,0 +1,29 @@ +nextflow_workflow { + + name "Test Workflow UTILS_NFCORE_PIPELINE" + script "../main.nf" + config "subworkflows/nf-core/utils_nfcore_pipeline/tests/nextflow.config" + workflow "UTILS_NFCORE_PIPELINE" + tag "subworkflows" + tag "subworkflows_nfcore" + tag "utils_nfcore_pipeline" + tag "subworkflows/utils_nfcore_pipeline" + + test("Should run without failures") { + + when { + workflow { + """ + input[0] = [] + """ + } + } + + then { + assertAll( + { assert workflow.success }, + { assert snapshot(workflow.out).match() } + ) + } + } +} diff --git a/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfcore_pipeline/tests/main.workflow.nf.test.snap b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfcore_pipeline/tests/main.workflow.nf.test.snap new file mode 100644 index 0000000000..84ee1e1d1e --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfcore_pipeline/tests/main.workflow.nf.test.snap @@ -0,0 +1,19 @@ +{ + "Should run without failures": { + "content": [ + { + "0": [ + true + ], 
+ "valid_config": [ + true + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "23.10.1" + }, + "timestamp": "2024-02-28T12:03:25.726491" + } +} diff --git a/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfcore_pipeline/tests/nextflow.config b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfcore_pipeline/tests/nextflow.config new file mode 100644 index 0000000000..d0a926bf6d --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfcore_pipeline/tests/nextflow.config @@ -0,0 +1,9 @@ +manifest { + name = 'nextflow_workflow' + author = """nf-core""" + homePage = 'https://127.0.0.1' + description = """Dummy pipeline""" + nextflowVersion = '!>=23.04.0' + version = '9.9.9' + doi = 'https://doi.org/10.5281/zenodo.5070524' +} diff --git a/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfschema_plugin/main.nf b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfschema_plugin/main.nf new file mode 100644 index 0000000000..acb3972419 --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfschema_plugin/main.nf @@ -0,0 +1,73 @@ +// +// Subworkflow that uses the nf-schema plugin to validate parameters and render the parameter summary +// + +include { paramsSummaryLog } from 'plugin/nf-schema' +include { validateParameters } from 'plugin/nf-schema' +include { paramsHelp } from 'plugin/nf-schema' + +workflow UTILS_NFSCHEMA_PLUGIN { + + take: + input_workflow // workflow: the workflow object used by nf-schema to get metadata from the workflow + validate_params // boolean: validate the parameters + parameters_schema // string: path to the parameters JSON schema. + // this has to be the same as the schema given to `validation.parametersSchema` + // when this input is empty it will automatically use the configured schema or + // "${projectDir}/nextflow_schema.json" as default. This input should not be empty + // for meta pipelines + help // boolean: show help message + help_full // boolean: show full help message + show_hidden // boolean: show hidden parameters in help message + before_text // string: text to show before the help message and parameters summary + after_text // string: text to show after the help message and parameters summary + command // string: an example command of the pipeline + + main: + + if(help || help_full) { + help_options = [ + beforeText: before_text, + afterText: after_text, + command: command, + showHidden: show_hidden, + fullHelp: help_full, + ] + if(parameters_schema) { + help_options << [parametersSchema: parameters_schema] + } + log.info paramsHelp( + help_options, + params.help instanceof String ? params.help : "", + ) + exit 0 + } + + // + // Print parameter summary to stdout. 
This will display the parameters + // that differ from the default given in the JSON schema + // + + summary_options = [:] + if(parameters_schema) { + summary_options << [parametersSchema: parameters_schema] + } + log.info before_text + log.info paramsSummaryLog(summary_options, input_workflow) + log.info after_text + + // + // Validate the parameters using nextflow_schema.json or the schema + // given via the validation.parametersSchema configuration option + // + if(validate_params) { + validateOptions = [:] + if(parameters_schema) { + validateOptions << [parametersSchema: parameters_schema] + } + validateParameters(validateOptions) + } + + emit: + dummy_emit = true +} diff --git a/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfschema_plugin/meta.yml b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfschema_plugin/meta.yml new file mode 100644 index 0000000000..f7d9f02885 --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfschema_plugin/meta.yml @@ -0,0 +1,35 @@ +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/subworkflows/yaml-schema.json +name: "utils_nfschema_plugin" +description: Run nf-schema to validate parameters and create a summary of changed parameters +keywords: + - validation + - JSON schema + - plugin + - parameters + - summary +components: [] +input: + - input_workflow: + type: object + description: | + The workflow object of the used pipeline. + This object contains meta data used to create the params summary log + - validate_params: + type: boolean + description: Validate the parameters and error if invalid. + - parameters_schema: + type: string + description: | + Path to the parameters JSON schema. + This has to be the same as the schema given to the `validation.parametersSchema` config + option. When this input is empty it will automatically use the configured schema or + "${projectDir}/nextflow_schema.json" as default. The schema should not be given in this way + for meta pipelines. 
+output: + - dummy_emit: + type: boolean + description: Dummy emit to make nf-core subworkflows lint happy +authors: + - "@nvnieuwk" +maintainers: + - "@nvnieuwk" diff --git a/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test new file mode 100644 index 0000000000..c977917aac --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test @@ -0,0 +1,173 @@ +nextflow_workflow { + + name "Test Subworkflow UTILS_NFSCHEMA_PLUGIN" + script "../main.nf" + workflow "UTILS_NFSCHEMA_PLUGIN" + + tag "subworkflows" + tag "subworkflows_nfcore" + tag "subworkflows/utils_nfschema_plugin" + tag "plugin/nf-schema" + + config "./nextflow.config" + + test("Should run nothing") { + + when { + + params { + test_data = '' + } + + workflow { + """ + validate_params = false + input[0] = workflow + input[1] = validate_params + input[2] = "" + input[3] = false + input[4] = false + input[5] = false + input[6] = "" + input[7] = "" + input[8] = "" + """ + } + } + + then { + assertAll( + { assert workflow.success } + ) + } + } + + test("Should validate params") { + + when { + + params { + test_data = '' + outdir = null + } + + workflow { + """ + validate_params = true + input[0] = workflow + input[1] = validate_params + input[2] = "" + input[3] = false + input[4] = false + input[5] = false + input[6] = "" + input[7] = "" + input[8] = "" + """ + } + } + + then { + assertAll( + { assert workflow.failed }, + { assert workflow.stdout.any { it.contains('ERROR ~ Validation of pipeline parameters failed!') } } + ) + } + } + + test("Should run nothing - custom schema") { + + when { + + params { + test_data = '' + } + + workflow { + """ + validate_params = false + input[0] = workflow + input[1] = validate_params + input[2] = "${projectDir}/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json" + input[3] = false + input[4] = false + input[5] = false + input[6] = "" + input[7] = "" + input[8] = "" + """ + } + } + + then { + assertAll( + { assert workflow.success } + ) + } + } + + test("Should validate params - custom schema") { + + when { + + params { + test_data = '' + outdir = null + } + + workflow { + """ + validate_params = true + input[0] = workflow + input[1] = validate_params + input[2] = "${projectDir}/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json" + input[3] = false + input[4] = false + input[5] = false + input[6] = "" + input[7] = "" + input[8] = "" + """ + } + } + + then { + assertAll( + { assert workflow.failed }, + { assert workflow.stdout.any { it.contains('ERROR ~ Validation of pipeline parameters failed!') } } + ) + } + } + + test("Should create a help message") { + + when { + + params { + test_data = '' + outdir = null + } + + workflow { + """ + validate_params = true + input[0] = workflow + input[1] = validate_params + input[2] = "${projectDir}/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json" + input[3] = true + input[4] = false + input[5] = false + input[6] = "Before" + input[7] = "After" + input[8] = "nextflow run test/test" + """ + } + } + + then { + assertAll( + { assert workflow.success } + ) + } + } +} diff --git a/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow.config b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow.config new 
file mode 100644 index 0000000000..8d8c73718a --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow.config @@ -0,0 +1,8 @@ +plugins { + id "nf-schema@2.5.1" +} + +validation { + parametersSchema = "${projectDir}/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json" + monochromeLogs = true +} diff --git a/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json new file mode 100644 index 0000000000..91e26fc4a7 --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json @@ -0,0 +1,103 @@ +{ + "$schema": "https://json-schema.org/draft/2020-12/schema", + "$id": "https://raw.githubusercontent.com/./master/nextflow_schema.json", + "title": ". pipeline parameters", + "description": "", + "type": "object", + "$defs": { + "input_output_options": { + "title": "Input/output options", + "type": "object", + "fa_icon": "fas fa-terminal", + "description": "Define where the pipeline should find input data and save output data.", + "required": ["outdir"], + "properties": { + "validate_params": { + "type": "boolean", + "description": "Validate parameters?", + "default": true, + "hidden": true + }, + "outdir": { + "type": "string", + "format": "directory-path", + "description": "The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure.", + "fa_icon": "fas fa-folder-open" + }, + "test_data_base": { + "type": "string", + "default": "https://raw.githubusercontent.com/nf-core/test-datasets/modules", + "description": "Base for test data directory", + "hidden": true + }, + "test_data": { + "type": "string", + "description": "Fake test data param", + "hidden": true + } + } + }, + "generic_options": { + "title": "Generic options", + "type": "object", + "fa_icon": "fas fa-file-import", + "description": "Less common options for the pipeline, typically set in a config file.", + "help_text": "These options are common to all nf-core pipelines and allow you to customise some of the core preferences for how the pipeline runs.\n\nTypically these options would be set in a Nextflow config file loaded for all pipeline runs, such as `~/.nextflow/config`.", + "properties": { + "help": { + "type": "boolean", + "description": "Display help text.", + "fa_icon": "fas fa-question-circle", + "hidden": true + }, + "version": { + "type": "boolean", + "description": "Display version and exit.", + "fa_icon": "fas fa-question-circle", + "hidden": true + }, + "logo": { + "type": "boolean", + "default": true, + "description": "Display nf-core logo in console output.", + "fa_icon": "fas fa-image", + "hidden": true + }, + "singularity_pull_docker_container": { + "type": "boolean", + "description": "Pull Singularity container from Docker?", + "hidden": true + }, + "publish_dir_mode": { + "type": "string", + "default": "copy", + "description": "Method used to save pipeline results to output directory.", + "help_text": "The Nextflow `publishDir` option specifies which intermediate files should be saved to the output directory. This option tells the pipeline what method should be used to move these files. 
See [Nextflow docs](https://www.nextflow.io/docs/latest/process.html#publishdir) for details.", + "fa_icon": "fas fa-copy", + "enum": [ + "symlink", + "rellink", + "link", + "copy", + "copyNoFollow", + "move" + ], + "hidden": true + }, + "monochrome_logs": { + "type": "boolean", + "description": "Use monochrome_logs", + "hidden": true + } + } + } + }, + "allOf": [ + { + "$ref": "#/$defs/input_output_options" + }, + { + "$ref": "#/$defs/generic_options" + } + ] +} diff --git a/hello-nf-core/solutions/core-hello-part5/workflows/hello.nf b/hello-nf-core/solutions/core-hello-part5/workflows/hello.nf new file mode 100644 index 0000000000..1993686608 --- /dev/null +++ b/hello-nf-core/solutions/core-hello-part5/workflows/hello.nf @@ -0,0 +1,67 @@ +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + IMPORT MODULES / SUBWORKFLOWS / FUNCTIONS +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +*/ +include { paramsSummaryMap } from 'plugin/nf-schema' +include { softwareVersionsToYAML } from '../subworkflows/nf-core/utils_nfcore_pipeline' +include { sayHello } from '../modules/local/sayHello.nf' +include { convertToUpper } from '../modules/local/convertToUpper.nf' +include { cowpy } from '../modules/local/cowpy.nf' +include { CAT_CAT } from '../modules/nf-core/cat/cat/main' + +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + RUN MAIN WORKFLOW +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +*/ + +workflow HELLO { + + take: + ch_samplesheet // channel: samplesheet read in from --input + + main: + + ch_versions = Channel.empty() + + // emit a greeting + sayHello(ch_samplesheet) + + // convert the greeting to uppercase + convertToUpper(sayHello.out) + + // create metadata map with batch name as the ID + def cat_meta = [ id: params.batch ] + // create a channel with metadata and files in tuple format + ch_for_cat = convertToUpper.out.collect().map { files -> tuple(cat_meta, files) } + + // concatenate files using the nf-core cat/cat module + CAT_CAT(ch_for_cat) + + // generate ASCII art of the greetings with cowpy + cowpy(CAT_CAT.out.file_out) + + // + // Collate and save software versions + // + softwareVersionsToYAML(ch_versions) + .collectFile( + storeDir: "${params.outdir}/pipeline_info", + name: 'hello_software_' + 'versions.yml', + sort: true, + newLine: true + ).set { ch_collated_versions } + + + emit: + cowpy_hellos = cowpy.out.cowpy_output + versions = ch_versions // channel: [ path(versions.yml) ] + +} + +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + THE END +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +*/ diff --git a/mkdocs.yml b/mkdocs.yml index 3f27c7c0cf..4ea78ffc5d 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -33,8 +33,9 @@ nav: - hello_nf-core/01_run_demo.md - hello_nf-core/02_rewrite_hello.md - hello_nf-core/03_use_module.md - - hello_nf-core/04_adapt_module.md + - hello_nf-core/04_make_module.md - hello_nf-core/05_input_validation.md + - hello_nf-core/summary.md - hello_nf-core/survey.md - hello_nf-core/next_steps.md - Nextflow for Science: @@ -79,7 +80,6 @@ nav: - side_quests/debugging.md - side_quests/essential_scripting_patterns.md - side_quests/nf-test.md - - side_quests/nf-core.md - Archive: - Fundamentals Training: - archive/basic_training/index.md diff --git a/side-quests/nf-core/data/sequencer_samplesheet.csv 
b/side-quests/nf-core/data/sequencer_samplesheet.csv deleted file mode 100644 index f2c3b78051..0000000000 --- a/side-quests/nf-core/data/sequencer_samplesheet.csv +++ /dev/null @@ -1,5 +0,0 @@ -sample,sequencer,fastq_1,fastq_2 -SAMPLE1_PE,sequencer1,https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample1_R1.fastq.gz,https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample1_R2.fastq.gz -SAMPLE2_PE,sequencer2,https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample2_R1.fastq.gz,https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample2_R2.fastq.gz -SAMPLE3_SE,sequencer3,https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample1_R1.fastq.gz, -SAMPLE3_SE,sequencer3,https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample2_R1.fastq.gz,
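
One point worth flagging about the `workflows/hello.nf` added in this diff: `ch_versions` is created empty and never populated, so the collated `hello_software_versions.yml` will carry only whatever `softwareVersionsToYAML` adds on its own (typically the workflow and Nextflow entries). In nf-core practice, each module emits its own `versions.yml` that the workflow mixes into `ch_versions`. A minimal sketch of that pattern, assuming a hypothetical `versions` emit on the course's local `sayHello` module (this emit is not part of the diff above):

```groovy
// modules/local/sayHello.nf -- sketch: emit a versions.yml alongside the main output
process sayHello {
    input:
    val greeting

    output:
    path "${greeting}-output.txt", emit: txt
    path "versions.yml", emit: versions

    script:
    """
    echo '${greeting}' > '${greeting}-output.txt'
    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        bash: \$(bash --version | head -n 1 | sed 's/.*version //; s/ .*//')
    END_VERSIONS
    """
}
```

The `HELLO` workflow body would then collect the per-process reports before the existing collation step:

```groovy
    // mix each module's versions report into the versions channel
    ch_versions = ch_versions.mix(sayHello.out.versions)
```

With that wiring in place, the `softwareVersionsToYAML(ch_versions).collectFile(...)` call already present in `workflows/hello.nf` would publish real tool versions rather than collating an empty channel.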