docs/module.md
Lines changed: 14 additions & 15 deletions
@@ -186,7 +186,7 @@ Ciao world!
186
186
187
187
Process script {ref}`templates <process-template>` can be included alongside a module in the `templates` directory.
188
188
189
-
For example, suppose we have a project L with a module that defines two processes, P1 and P2, both of which use templates. The template files can be made available in the local `templates` directory:
189
+
For example, Project L contains a module (`myModules.nf`) that defines two processes, P1 and P2. Both processes use templates that are available in the local `templates` directory:
190
190
191
191
```
192
192
Project L
@@ -196,29 +196,29 @@ Project L
196
196
└── P2-template.sh
197
197
```
198
198
199
-
Then, we have a second project A with a workflow that includes P1 and P2:
199
+
Project A contains a workflow that includes processes P1 and P2:
200
200
201
201
```
202
-
Pipeline A
202
+
Project A
203
203
└── main.nf
204
204
```
205
205
206
-
Finally, we have a third project B with a workflow that also includes P1 and P2:
206
+
Project B contains a workflow that also includes processes P1 and P2:
207
207
208
208
```
209
-
Pipeline B
209
+
Project B
210
210
└── main.nf
211
211
```
212
212
213
-
With the possibility to keep the template files inside the project L, A and B can use the modules defined in L without any changes. A future project C would do the same, just cloning L (if not available on the system) and including its module.
213
+
Because the template files are stored with the modules inside Project L, Projects A and B can include them without changing any code. Future projects can also include these modules by cloning Project L (if it is not available on the system) and including its module.
214
214
215
-
Beside promoting the sharing of modules across pipelines, there are several advantages to keeping the module template under the script path:
215
+
Keeping the module template within the script path has several advantages beyond facilitating module sharing across pipelines:
216
216
217
217
1. Modules are self-contained
218
218
2. Modules can be tested independently from the pipeline(s) that import them
219
219
3. Modules can be made into libraries
220
220
221
-
Having multiple template locations enables a structured project organization. If a project has several modules, and they all use templates, the project could group module scripts and their templates as needed. For example:
221
+
Organizing template locations allows for a well-structured project. In projects with multiple modules that rely on templates, you can organize module scripts and their corresponding templates into logical groups. For example:
222
222
223
223
```
224
224
baseDir
@@ -240,10 +240,11 @@ baseDir
240
240
|── mymodules6.nf
241
241
└── templates
242
242
|── P5-template.sh
243
-
|── P6-template.sh
244
-
└── P7-template.sh
243
+
└── P6-template.sh
245
244
```
246
245
246
+
See {ref}`process-template` for more information about how to externalize process scripts to template files.
247
+
247
248
(module-binaries)=
248
249
249
250
## Module binaries
@@ -253,13 +254,13 @@ baseDir
253
254
254
255
Modules can define binary scripts that are locally scoped to the processes defined by the tasks.
255
256
256
-
To enable this feature, set the following flag in your pipeline script or configuration file:
257
+
To use this feature, the module binaries must be enabled in your pipeline script or configuration file:
257
258
258
259
```nextflow
259
260
nextflow.enable.moduleBinaries = true
260
261
```
261
262
262
-
The binary scripts must be placed in the module directory names `<module-dir>/resources/usr/bin`:
263
+
Binary scripts must be placed in the module directory named `<module-dir>/resources/usr/bin` and granted execution permissions:
263
264
264
265
```
265
266
<module-dir>
@@ -271,10 +272,8 @@ The binary scripts must be placed in the module directory names `<module-dir>/re
271
272
└── another-module-script2.py
272
273
```
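The layout and permission requirement above can be sketched in shell. This is a minimal illustration, not part of the original change: the module directory `my-module` and the script name `hello-module.sh` are hypothetical, and the directory is created in a temporary location so that the commands are safe to run anywhere.

```bash
# Hypothetical module directory, created under a temporary path for illustration.
module_dir="$(mktemp -d)/my-module"
mkdir -p "$module_dir/resources/usr/bin"

# Hypothetical module binary script.
cat > "$module_dir/resources/usr/bin/hello-module.sh" <<'EOF'
#!/usr/bin/env bash
echo "hello from a module binary"
EOF

# Grant execute permission so the script can be invoked like any other command.
chmod +x "$module_dir/resources/usr/bin/hello-module.sh"

# Invoke the script directly to confirm that it is executable.
output="$("$module_dir/resources/usr/bin/hello-module.sh")"
echo "$output"
```

In a real project, the `chmod +x` step would be applied to the scripts under `<module-dir>/resources/usr/bin` before they are committed.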
273
274
274
-
Those scripts will be made accessible like any other command in the task environment, provided they have been granted the Linux execute permissions.
275
-
276
275
:::{note}
277
-
This feature requires the use of a local or shared file system for the pipeline work directory, or {ref}`wave-page` when using cloud-based executors.
276
+
Module binary scripts require a local or shared file system for the pipeline work directory, or {ref}`wave-page` when using cloud-based executors.
docs/process.md
Lines changed: 37 additions & 34 deletions
@@ -24,11 +24,11 @@ See {ref}`syntax-process` for a full description of the process syntax.
24
24
25
25
## Script
26
26
27
-
The `script` block defines, as a string expression, the script that is executed by the process.
27
+
The `script` block defines the string expression that is executed by the process.
28
28
29
-
A process may contain only one script, and if the `script` guard is not explicitly declared, the script must be the final statement in the process block.
29
+
A process can contain only one script block. If the `script` guard is not explicitly declared, the script must be the final statement in the process block.
30
30
31
-
The script string is executed as a [Bash](<http://en.wikipedia.org/wiki/Bash_(Unix_shell)>) script in the host environment. It can be any command or script that you would normally execute on the command line or in a Bash script. Naturally, the script may only use commands that are available in the host environment.
31
+
The script string is executed as a [Bash](<http://en.wikipedia.org/wiki/Bash_(Unix_shell)>) script in the host environment. It can be any command or script that you would execute on the command line or in a Bash script and can only use commands that are available in the host environment.
32
32
33
33
The script block can be a simple string or a multi-line string. The latter approach makes it easier to write scripts with multiple commands spanning multiple lines. For example:
34
34
@@ -42,19 +42,17 @@ process doMoreThings {
42
42
}
43
43
```
44
44
45
-
As explained in the script tutorial section, strings can be defined using single-quotes or double-quotes, and multi-line strings are defined by three single-quote or three double-quote characters.
45
+
Strings can be defined using single-quotes or double-quotes. Multi-line strings are defined by three single-quote or three double-quote characters.
46
46
47
-
There is a subtle but important difference between them. Like in Bash, strings delimited by a `"` character support variable substitutions, while strings delimited by `'` do not.
47
+
There is a subtle but important difference between single-quote (`'`) and double-quote (`"`) characters. Like in Bash, strings delimited by the `"` character support variable substitutions, while strings delimited by `'` do not.
48
48
49
-
In the above code fragment, the `$db` variable is replaced by the actual value defined elsewhere in the pipeline script.
49
+
For example, in the above code fragment, the `$db` variable is replaced by the actual value defined elsewhere in the pipeline script.
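The quoting rule can be checked in plain Bash, which behaves the same way. This is a minimal sketch: the value assigned to `db` and the message text are made up for illustration and do not come from the pipeline script discussed above.

```bash
# Hypothetical value, standing in for a variable defined elsewhere.
db="/data/db/example"

# Double quotes: $db is substituted with its value.
double_quoted="database is $db"

# Single quotes: $db is kept as literal text.
single_quoted='database is $db'

echo "$double_quoted"
echo "$single_quoted"
```

The first `echo` prints the substituted path, while the second prints the characters `$db` unchanged.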
50
50
51
51
:::{warning}
52
-
Since Nextflow uses the same Bash syntax for variable substitutions in strings, you must manage them carefully depending on whether you want to evaluate a *Nextflow* variable or a *Bash* variable.
52
+
Nextflow uses the same Bash syntax for variable substitutions in strings. You must manage them carefully depending on whether you want to evaluate a *Nextflow* variable or a *Bash* variable.
53
53
:::
54
54
55
-
When you need to access a system environment variable in your script, you have two options.
56
-
57
-
If you don't need to access any Nextflow variables, you can define your script block with single-quotes:
55
+
System environment variables and Nextflow variables can be accessed by your script. If you don't need to access any Nextflow variables, you can define your script block with single-quotes and use the dollar character (`$`) to access system environment variables. For example:
58
56
59
57
```nextflow
60
58
process printPath {
@@ -64,7 +62,7 @@ process printPath {
64
62
}
65
63
```
66
64
67
-
Otherwise, you can define your script with double-quotes and escape the system environment variables by prefixing them with a back-slash `\` character, as shown in the following example:
65
+
Otherwise, you can define your script with double-quotes and escape the system environment variables by prefixing them with a back-slash `\` character. For example:
68
66
69
67
```nextflow
70
68
process doOtherThings {
@@ -76,21 +74,17 @@ process doOtherThings {
76
74
}
77
75
```
78
76
79
-
In this example, `$MAX` is a Nextflow variable that must be defined elsewhere in the pipeline script. Nextflow replaces it with the actual value before executing the script. Meanwhile, `$DB` is a Bash variable that must exist in the execution environment, and Bash will replace it with the actual value during execution.
80
-
81
-
:::{tip}
82
-
Alternatively, you can use the {ref}`process-shell` block definition, which allows a script to contain both Bash and Nextflow variables without having to escape the first.
83
-
:::
77
+
In this example, `$MAX` is a Nextflow variable that is defined elsewhere in the pipeline script. Nextflow replaces it with the actual value before executing the script. In contrast, `$DB` is a Bash variable that must exist in the execution environment. Bash will replace it with the actual value during execution.
84
78
85
79
### Scripts *à la carte*
86
80
87
-
The process script is interpreted by Nextflow as a Bash script by default, but you are not limited to Bash.
81
+
The process script is interpreted as Bash by default.
88
82
89
-
You can use your favourite scripting language (Perl, Python, R, etc), or even mix them in the same pipeline.
83
+
However, you can use your favorite scripting language (Perl, Python, R, etc.) for each process. You can also mix languages in the same pipeline.
90
84
91
-
A pipeline may be composed of processes that execute very different tasks. With Nextflow, you can choose the scripting language that best fits the task performed by a given process. For example, for some processes R might be more useful than Perl, whereas for others you may need to use Python because it provides better access to a library or an API, etc.
85
+
A pipeline may be composed of processes that execute very different tasks. You can choose the scripting language that best fits the task performed by a given process. For example, R might be more useful than Perl for some processes, whereas for others you may need to use Python because it provides better access to a library or an API.
92
86
93
-
To use a language other than Bash, simply start your process script with the corresponding [shebang](<http://en.wikipedia.org/wiki/Shebang_(Unix)>). For example:
87
+
To use a language other than Bash, start your process script with the corresponding [shebang](<http://en.wikipedia.org/wiki/Shebang_(Unix)>). For example:
94
88
95
89
```nextflow
96
90
process perlTask {
@@ -118,12 +112,17 @@ workflow {
118
112
```
119
113
120
114
:::{tip}
121
-
Since the actual location of the interpreter binary file can differ across platforms, it is wise to use the `env` command followed by the interpreter name, e.g. `#!/usr/bin/env perl`, instead of the absolute path, in order to make your script more portable.
115
+
The location of the interpreter binary file can differ across platforms. Use the `env` command followed by the interpreter name to make your script more portable. For example:
116
+
117
+
```nextflow
118
+
#!/usr/bin/env perl
119
+
```
120
+
122
121
:::
123
122
124
123
### Conditional scripts
125
124
126
-
The `script` block is like a function that returns a string. This means that you can write arbitrary code to determine the script, as long as the final statement is a string.
125
+
The `script` block is like a function that returns a string. You can write arbitrary code to determine the script as long as the final statement is a string.
127
126
128
127
If-else statements based on task inputs can be used to produce a different script. For example:
129
128
@@ -155,15 +154,13 @@ process align {
155
154
}
156
155
```
157
156
158
-
In the above example, the process will execute one of several scripts depending on the value of the `mode` parameter. By default it will execute the `tcoffee` command.
157
+
In the above example, the process will execute one of several scripts depending on the value of the `mode` parameter. By default, the process will execute the `tcoffee` command.
159
158
160
159
(process-template)=
161
160
162
161
### Template
163
162
164
-
Process scripts can be externalized to **template** files, which allows them to be reused across different processes and tested independently from the pipeline execution.
165
-
166
-
A template can be used in place of an embedded script using the `template` function in the script section:
163
+
Process scripts can be externalized to **template** files and accessed using the `template` function in the script section. For example:
167
164
168
165
```nextflow
169
166
process templateExample {
@@ -179,9 +176,9 @@ workflow {
179
176
}
180
177
```
181
178
182
-
By default, Nextflow looks for the template script in the `templates` directory located alongside the Nextflow script in which the process is defined. An absolute path can be used to specify a different location. However, this practice is discouraged because it hinders pipeline portability.
179
+
By default, Nextflow looks for template scripts in the `templates` directory, located alongside the Nextflow script that defines the process. A template can be reused across multiple processes. An absolute path can be used to specify a different template location. However, this practice is discouraged because it hinders pipeline portability.
183
180
184
-
An example template script is provided below:
181
+
Templates can be tested independently of pipeline execution. Consider the following template script:
185
182
186
183
```bash
187
184
#!/bin/bash
@@ -190,22 +187,28 @@ echo $STR
190
187
echo "process completed"
191
188
```
192
189
193
-
Variables prefixed with the dollar character (`$`) are interpreted as Nextflow variables when the template script is executed by Nextflow and Bash variables when executed directly. For example, the above script can be executed from the command line by providing each input as an environment variable:
190
+
The above script can be executed from the command line by providing each input as an environment variable:
194
191
195
192
```bash
196
193
STR='foo' bash templates/my_script.sh
197
194
```
198
195
199
-
The following caveats should be considered:
196
+
Variables prefixed with the dollar character (`$`) are interpreted as Nextflow variables when the template script is executed by Nextflow and Bash variables when executed directly.
197
+
198
+
The following caveats should be considered when using templates:
199
+
200
+
- Template scripts are only recommended for Bash scripts.
201
+
202
+
- Languages that do not prefix variables with `$` (e.g. Python and R) can't be executed directly as a template script from the command line.
200
203
201
-
- Template scripts are recommended only for Bash scripts. Languages that do not prefix variables with `$` (e.g. Python and R) can't be executed directly as a template script.
204
+
- Template variables escaped with `\$` will be interpreted as Bash variables when executed by Nextflow, but not when executed from the command line.
202
205
203
-
- Variables escaped with `\$` will be interpreted as Bash variables when executed by Nextflow, but will not be interpreted as variables when executed from the command line. This practice should be avoided to ensure that the template script behaves consistently.
206
+
- Template variables are evaluated even if they are commented out in the template script.
204
207
205
-
- Template variables are evaluated even if they are commented out in the template script. If a template variable is missing, it will cause the pipeline to fail regardless of where it occurs in the template.
208
+
- The pipeline will fail if a template variable is missing, regardless of where it occurs in the template.
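The escaping caveat can be observed by running a template directly with Bash. This is a minimal sketch: the template content is hypothetical and is written to a temporary file, and it only shows the command-line behavior (the Nextflow side is as described in the caveats above).

```bash
# Hypothetical template script, written to a temporary file for illustration.
template="$(mktemp)"
cat > "$template" <<'EOF'
#!/bin/bash
echo "plain: $STR"
echo "escaped: \$STR"
EOF

# Run the template directly, providing the input as an environment variable.
output="$(STR='foo' bash "$template")"
echo "$output"
```

When run this way, Bash expands the plain `$STR`, while the escaped `\$STR` is printed as the literal text `$STR`. This differs from the behavior under Nextflow, which is why escaping is best avoided in templates that are also run from the command line.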
206
209
207
210
:::{tip}
208
-
Template scripts are generally discouraged due to the caveats described above. The best practice for using a custom script is to embed it in the process definition at first and move it to a separate file with its own command line interface once the code matures.
211
+
The best practice for using a custom script is to first embed it in the process definition and transfer it to a separate file with its own command line interface once the code matures.