docs/aws.md
Lines changed: 1 addition & 1 deletion
@@ -281,7 +281,7 @@ There are several reasons why you might need to create your own [AMI (Amazon Mac
281
281
### Create your custom AMI
282
282
283
283
From the EC2 Dashboard, select **Launch Instance**, then select **Browse more AMIs**. In the new page, select
284
-
**AWS Marketplace AMIs**, and then search for **Amazon ECS-Optimized Amazon Linux 2 (AL2) x86_64 AMI**. Select the AMI and continue as usual to configure and launch the instance.
284
+
**AWS Marketplace AMIs**, and then search for `Amazon ECS-Optimized Amazon Linux 2 (AL2) x86_64 AMI`. Select the AMI and continue as usual to configure and launch the instance.
285
285
286
286
:::{note}
287
287
The selected instance has a root volume of 30GB. Make sure to increase its size or add a second EBS volume with enough storage for real genomic workloads.
docs/cache-and-resume.md
Lines changed: 49 additions & 38 deletions
@@ -72,75 +72,86 @@ For this reason, it is important to preserve both the task cache (`.nextflow/cac
72
72
73
73
## Troubleshooting
74
74
75
-
Cache failures happen when either (1) a task that was supposed to be cached was re-executed, or (2) a task that was supposed to be re-executed was cached.
75
+
Cache failures occur when a task that was supposed to be cached was re-executed or a task that was supposed to be re-executed was cached.
76
76
77
-
When this happens, consider the following questions:
77
+
Common causes of cache failures include:
78
78
79
-
- Is resume enabled via `-resume`?
80
-
- Is the {ref}`process-cache` directive set to a non-default value?
81
-
- Is the task still present in the task cache and work directory?
- [Race condition on a global variable](#race-condition-on-a-global-variable)
84
+
- [Non-deterministic process inputs](#non-deterministic-process-inputs)
83
85
84
-
Changing any of the inputs included in the [task hash](#task-hash) will invalidate the cache, for example:
86
+
### Resume not enabled
85
87
86
-
- Resuming from a different session ID
87
-
- Changing the process name
88
-
- Changing the task container image or Conda environment
89
-
- Changing the task script
90
-
- Changing an input file or bundled script used by the task
88
+
The `-resume` option is required to resume a pipeline. Ensure you enable `-resume` in your run command or your Nextflow configuration file.
89
+
90
+
### Cache directive disabled
91
91
92
-
While the following examples would not invalidate the cache:
92
+
The `cache` directive is enabled by default. However, you can disable or modify its behavior for a specific process. For example:
93
93
94
-
- Changing the value of a directive (other than {ref}`process-ext`), even if that directive is used in the task script
94
+
```nextflow
95
+
process FOO {
96
+
cache false
97
+
// ...
98
+
}
99
+
```
95
100
96
-
In many cases, cache failures happen because of a change to the pipeline script or configuration, or because the pipeline itself has some non-deterministic behavior.
101
+
Ensure that the `cache` directive has not been disabled. See {ref}`process-cache` for more information.
97
102
98
-
Here are some common reasons for cache failures:
103
+
### Modified inputs
99
104
100
-
### Modified input files
105
+
Modifying inputs that are used in the task hash invalidates the cache. Common causes of modified inputs include:
101
106
102
-
Make sure that your input files have not been changed. Keep in mind that the default caching mode uses the complete file path, the last modified timestamp, and the file size. If any of these attributes change, the task will be re-executed, even if the file content is unchanged.
107
+
- Changing input files
108
+
- Resuming from a different session ID
109
+
- Changing the process name
110
+
- Changing the calling workflow name
111
+
- Changing the task container image or Conda environment
112
+
- Changing the task script
113
+
- Changing a bundled script used by the task
103
114
104
-
### Process that modifies its inputs
115
+
Nextflow calculates a hash for an input file using its full path, last modified timestamp, and file size. If any of these attributes change, Nextflow re-executes the task.
105
116
106
-
If a process modifies its own input files, it cannot be resumed for the reasons described in the previous point. As a result, processes that modify their own input files are considered an anti-pattern and should be avoided.
117
+
:::{warning}
118
+
If a process modifies its input files, it cannot be resumed. Avoid processes that modify their own input files as this is considered an anti-pattern.
119
+
:::
107
120
108
121
### Inconsistent file attributes
109
122
110
-
Some shared file systems, such as NFS, may report inconsistent file timestamps, which can invalidate the cache. If you encounter this problem, you can avoid it by using the `'lenient'` {ref}`caching mode <process-cache>`, which ignores the last modified timestamp and uses only the file path and size.
123
+
Some shared file systems, such as NFS, may report inconsistent file timestamps, which can invalidate the cache when using the standard caching mode.
124
+
125
+
To resolve this issue, use the `'lenient'` {ref}`caching mode <process-cache>` to ignore the last modified timestamp and use only the file path and size.
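
As a minimal sketch of the lenient mode described above, the `cache` directive can be set on an individual process. The process name `ALIGN`, the input name `reads`, and the script body are placeholders for illustration only:

```nextflow
process ALIGN {
    // Hash input files by path and size only, ignoring the last modified timestamp
    cache 'lenient'

    input:
    path reads

    script:
    """
    echo "processing ${reads}"
    """
}
```

Setting the directive per process keeps the standard caching mode everywhere else, which may be preferable when only some inputs live on the affected file system.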
111
126
112
127
(cache-global-var-race-condition)=
113
128
114
129
### Race condition on a global variable
115
130
116
-
While Nextflow tries to make it easy to write safe concurrent code, it is still possible to create race conditions, which can in turn impact the caching behavior of your pipeline.
117
-
118
-
Consider the following example:
131
+
Race conditions can disrupt the caching behavior of your pipeline. For example:
119
132
120
133
```nextflow
121
-
channel.of(1,2,3) | map { v -> X=v; X+=2 } | view { v -> "ch1 = $v" }
122
-
channel.of(1,2,3) | map { v -> X=v; X*=2 } | view { v -> "ch2 = $v" }
134
+
channel.of(1,2,3).map { v -> X=v; X+=2 }.view { v -> "ch1 = $v" }
135
+
channel.of(1,2,3).map { v -> X=v; X*=2 }.view { v -> "ch2 = $v" }
123
136
```
124
137
125
-
The problem here is that `X` is declared in each `map` closure without the `def` keyword (or other type qualifier). Using the `def` keyword makes the variable local to the enclosing scope; omitting the `def` keyword makes the variable global to the entire script.
138
+
In the above example, `X` is declared in each `map` closure. Without the `def` keyword, the variable `X` is global to the entire script. Because operators are executed concurrently and `X` is global, there is a *race condition* that causes the emitted values to vary depending on the order of the concurrent operations. If these values were passed to a process as inputs, the process would execute different tasks during each run due to the race condition.
126
139
127
-
Because `X` is global, and operators are executed concurrently, there is a *race condition* on `X`, which means that the emitted values will vary depending on the particular order of the concurrent operations. If the values were passed as inputs into a process, the process would execute different tasks on each run due to the race condition.
128
-
129
-
The solution is to not use a global variable where a local variable is enough (or in this simple example, avoid the variable altogether):
140
+
To resolve this issue, avoid declaring global variables in closures:
130
141
131
142
```nextflow
132
-
// local variable
133
-
channel.of(1,2,3) | map { v -> def X=v; X+=2 } | view { v -> "ch1 = $v" }
134
-
135
-
// no variable
136
-
channel.of(1,2,3) | map { v -> v * 2 } | view { v -> "ch2 = $v" }
143
+
channel.of(1,2,3).map { v -> def X=v; X+=2 }.view { v -> "ch1 = $v" }
137
144
```
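
In a simple case like this one, the intermediate variable can likely be dropped altogether. A minimal sketch, reusing the second channel from the example above:

```nextflow
// No intermediate variable: the closure returns the computed value directly,
// so there is no shared state for concurrent operators to race on
channel.of(1,2,3).map { v -> v * 2 }.view { v -> "ch2 = $v" }
```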
138
145
146
+
:::{versionadded} 25.04.0
147
+
The {ref}`strict syntax <strict-syntax-page>` does not allow global variables to be declared in closures.
148
+
:::
149
+
139
150
(cache-nondeterministic-inputs)=
140
151
141
152
### Non-deterministic process inputs
142
153
143
-
Sometimes a process needs to merge inputs from different sources. Consider the following example:
154
+
A process that merges inputs from different sources non-deterministically may invalidate the cache. For example:
144
155
145
156
```nextflow
146
157
workflow {
@@ -161,9 +172,9 @@ process check_bam_bai {
161
172
}
162
173
```
163
174
164
-
It is tempting to assume that the process inputs will be matched by `id` like the {ref}`operator-join` operator. But in reality, they are simply merged like the {ref}`operator-merge` operator. As a result, not only will the process inputs be incorrect, they will also be non-deterministic, thus invalidating the cache.
175
+
In the above example, the inputs are merged without matching on `id`, in a similar manner to the {ref}`operator-merge` operator. As a result, the inputs are incorrect and non-deterministic.
165
176
166
-
The solution is to explicitly join the two channels before the process invocation:
177
+
To resolve this issue, use the `join` operator to join the channels into a single input channel before invoking the process:
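
The following is a minimal sketch of that approach, not necessarily the exact example from the documentation. The channel names `bam` and `bai`, the sample values, and the simplified `check_bam_bai` body (which takes plain values rather than files) are assumptions made for illustration:

```nextflow
workflow {
    // Hypothetical sample data: the two channels emit ids in different orders
    bam = channel.of( ['1', '1.bam'], ['2', '2.bam'] )
    bai = channel.of( ['2', '2.bai'], ['1', '1.bai'] )

    // join matches items on the first tuple element (the id),
    // so each task receives a BAM and BAI with the same id
    check_bam_bai(bam.join(bai))
}

process check_bam_bai {
    input:
    tuple val(id), val(bam), val(bai)

    script:
    """
    echo "${id}: ${bam} ${bai}"
    """
}
```

Because the pairing now depends on `id` rather than on arrival order, the process receives the same inputs on every run, which keeps the task hashes stable.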
docs/developer-env.md
Lines changed: 2 additions & 2 deletions
@@ -96,9 +96,9 @@ See {ref}`vscode-page` for more information about the Nextflow extension feature
96
96
97
97
**nf-core**
98
98
99
-
The [nf-core extension pack](https://marketplace.visualstudio.com/items?itemName=nf-core.nf-core-extensionpack) adds a selection of tools that help develop with nf-core, a community effort to collect a curated set of analysis pipelines built using Nextflow.
99
+
[nf-core](https://nf-co.re/) is a community effort to collect a curated set of analysis pipelines built using Nextflow. The [nf-core extension pack](https://marketplace.visualstudio.com/items?itemName=nf-core.nf-core-extensionpack) adds a selection of tools that support development. For example, it includes [Code Spell Checker](https://marketplace.visualstudio.com/items?itemName=streetsidesoftware.code-spell-checker), [Prettier](https://marketplace.visualstudio.com/items?itemName=esbenp.prettier-vscode), [Todo Tree](https://marketplace.visualstudio.com/items?itemName=Gruntfuggly.todo-tree), and [Markdown Extended](https://marketplace.visualstudio.com/items?itemName=jebbs.markdown-extended).
100
100
101
-
The nf-core extension pack includes several useful extensions. For example, [Code Spell Checker](https://marketplace.visualstudio.com/items?itemName=streetsidesoftware.code-spell-checker), [Prettier](https://marketplace.visualstudio.com/items?itemName=esbenp.prettier-vscode), [Todo Tree](https://marketplace.visualstudio.com/items?itemName=Gruntfuggly.todo-tree), and [Markdown Extended](https://marketplace.visualstudio.com/items?itemName=jebbs.markdown-extended). See [nf-core extension pack](https://marketplace.visualstudio.com/items?itemName=nf-core.nf-core-extensionpack) for more information about the tools included in the nf-core extension pack.
101
+
See the [nf-core extension pack](https://marketplace.visualstudio.com/items?itemName=nf-core.nf-core-extensionpack) for more information about the included tools.
docs/guides/aws-java-sdk-v2.md
Lines changed: 8 additions & 8 deletions
@@ -2,13 +2,13 @@
2
2
3
3
# AWS Java SDK v2
4
4
5
-
AWS Java SDK v1 is reaching end of life at the end of 2025. Starting in version `25.06.0-edge`, Nextflow uses AWS Java SDK v2 in the `nf-amazon` plugin.
5
+
AWS Java SDK v1 will reach end of life at the end of 2025. Starting with version `25.06.0-edge`, Nextflow uses AWS Java SDK v2 in the `nf-amazon` plugin.
6
6
7
-
This migration introduced several breaking changes to the `aws.client` config scope, including new options and removed options. This page describes these changes and how they affect your Nextflow configuraiton.
7
+
This migration introduces several breaking changes to the `aws.client` config scope, including new and removed options. This page describes these changes and how they affect your Nextflow configuration.
8
8
9
9
## New HTTP client
10
10
11
-
The HTTP client used by SDK v2 does not support overriding certain advanced HTTP options. As a result, the following config options are no longer supported:
11
+
The HTTP client in SDK v2 does not support overriding certain advanced HTTP options. As a result, the following config options are no longer supported:
12
12
13
13
- `aws.client.protocol`
14
14
- `aws.client.signerOverride`
@@ -18,15 +18,15 @@ The HTTP client used by SDK v2 does not support overriding certain advanced HTTP
18
18
19
19
## S3 transfer manager
20
20
21
-
The *S3 transfer manager* is a subsystem of SDK v2 which handles S3 transfers, including S3 uploads and downloads.
21
+
The *S3 transfer manager* is a subsystem of SDK v2 that handles S3 uploads and downloads.
22
22
23
-
The concurrency and throughput of the S3 transfer manager can be configured manually using the `aws.client.maxConcurrency` and `aws.client.maxNativeMemory`config options. Alternatively, the `aws.client.targetThroughputInGbps`config option can be used to set the previous two options automatically based on a target throughput.
23
+
You can configure the concurrency and throughput of the S3 transfer manager manually using the `aws.client.maxConcurrency` and `aws.client.maxNativeMemory` configuration options. Alternatively, you can use the `aws.client.targetThroughputInGbps` option to set both values automatically based on a target throughput.
24
24
25
-
## Multi-part uplaods
25
+
## Multi-part uploads
26
26
27
-
Multi-part uploads are handled by the S3 transfer manager. The `aws.client.minimumPartSize` and `aws.client.multipartThreshold` config options can be used to control when and how multi-part uploads are performed.
27
+
Multi-part uploads are handled by the S3 transfer manager. You can use the `aws.client.minimumPartSize` and `aws.client.multipartThreshold` config options to control when and how multi-part uploads are performed.
28
28
29
-
The following multi-part upload options are no longer supported:
29
+
The following multi-part upload config options are no longer supported:
docs/migrations/24-10.md
Lines changed: 2 additions & 2 deletions
@@ -35,7 +35,7 @@ Nextflow now supports managed identities for the Azure Batch executor. See {ref}
35
35
36
36
<h3>Task previous execution trace</h3>
37
37
38
-
The `task` variable in the process definition has two new proprties, `task.previousTrace` and `task.previousException`, which allows a task to access the runtime metadata of the previous attempt. See {ref}`task-previous-execution-trace` for details.
38
+
The `task` variable in the process definition has two new properties, `task.previousTrace` and `task.previousException`, which allow a task to access the runtime metadata of the previous attempt. See {ref}`task-previous-execution-trace` for details.
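
As a rough sketch of how these properties might be used, the following process doubles the memory recorded for the previous attempt when a task is retried. The process name `FOO`, the exit-code range, and the script body are illustrative assumptions, and the sketch assumes `task.previousTrace` exposes a `memory` field like a regular trace record:

```nextflow
process FOO {
    // First attempt: 1 GB. On a retry: twice the memory of the previous attempt.
    memory { task.attempt > 1 ? task.previousTrace.memory * 2 : (1.GB) }

    // Retry only on exit codes that commonly indicate an out-of-memory kill
    errorStrategy { task.exitStatus in 137..140 ? 'retry' : 'terminate' }
    maxRetries 3

    script:
    """
    echo "attempt ${task.attempt} with ${task.memory}"
    """
}
```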
39
39
40
40
## Breaking changes
41
41
@@ -53,7 +53,7 @@ The `task` variable in the process definition has two new proprties, `task.previ
53
53
54
54
- The use of `addParams` and `params` clauses in include declarations is deprecated. See {ref}`module-params` for details.
55
55
56
-
## Miscellanous
56
+
## Miscellaneous
57
57
58
58
- New config option: `aws.client.requesterPays`
59
59
- New config option: `google.batch.autoRetryExitCodes`
docs/migrations/25-04.md
Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@ The third preview of workflow outputs introduces the following breaking changes
30
30
31
31
- The syntax for dynamic publish paths has changed. Instead of defining a closure that returns a closure with the `path` directive, the outer closure should use the `>>` operator to publish individual files. See {ref}`workflow-publishing-files` for details.
32
32
33
-
- The `mapper` index directive has been removed. Use a `map` operator in the workflwo body instead.
33
+
- The `mapper` index directive has been removed. Use a `map` operator in the workflow body instead.
34
34
35
35
See {ref}`migrating-workflow-outputs` to get started.