docs/cache-and-resume.md: 40 additions & 31 deletions
@@ -72,75 +72,84 @@ For this reason, it is important to preserve both the task cache (`.nextflow/cac
## Troubleshooting

- Cache failures happen when either (1) a task that was supposed to be cached was re-executed, or (2) a task that was supposed to be re-executed was cached.
+ Cache failures occur when a task that was supposed to be cached was re-executed, or a task that was supposed to be re-executed was cached.

- When this happens, consider the following questions:
+ Common causes of cache failures include:

- - Is resume enabled via `-resume`?
- - Is the {ref}`process-cache` directive set to a non-default value?
- - Is the task still present in the task cache and work directory?
+ - [Race condition on a global variable](#race-condition-on-a-global-variable)
+ - [Non-deterministic process inputs](#non-deterministic-process-inputs)

- Changing any of the inputs included in the [task hash](#task-hash) will invalidate the cache, for example:
+ ### Resume not enabled

- - Resuming from a different session ID
- - Changing the process name
- - Changing the task container image or Conda environment
- - Changing the task script
- - Changing an input file or bundled script used by the task
+ The `-resume` option is required to resume a pipeline. Ensure you enable `-resume` in your run command or your Nextflow configuration file.
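
For example, a minimal sketch of a resumed run, where `main.nf` stands in for your pipeline script and `<session-id>` for the ID of a previous run:

```bash
# re-run the pipeline, reusing cached results where possible
nextflow run main.nf -resume

# resume from a specific earlier run by passing its session ID
nextflow run main.nf -resume <session-id>
```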

- While the following examples would not invalidate the cache:
+ ### Non-default cache directives

- - Changing the value of a directive (other than {ref}`process-ext`), even if that directive is used in the task script
+ The `cache` directive is enabled by default. However, you can disable or modify its behavior for a specific process. For example:

- In many cases, cache failures happen because of a change to the pipeline script or configuration, or because the pipeline itself has some non-deterministic behavior.
+ ```nextflow
+ process FOO {
+     cache false
+     // ...
+ }
+ ```

- Here are some common reasons for cache failures:
+ Ensure that the `cache` directive has not been set to a non-default value. See {ref}`process-cache` for more information about the `cache` directive.

### Modified input files

- Make sure that your input files have not been changed. Keep in mind that the default caching mode uses the complete file path, the last modified timestamp, and the file size. If any of these attributes change, the task will be re-executed, even if the file content is unchanged.
+ Modifying inputs that are used in the task hash invalidates the cache. Common causes of modified inputs include:

- ### Process that modifies its inputs
+ - Changing input files
+ - Resuming from a different session ID
+ - Changing the process name
+ - Changing the task container image or Conda environment
+ - Changing the task script
+ - Changing a bundled script used by the task

- If a process modifies its own input files, it cannot be resumed for the reasons described in the previous point. As a result, processes that modify their own input files are considered an anti-pattern and should be avoided.
+ Nextflow calculates a hash for each input file using its full path, last modified timestamp, and file size. If any of these attributes change, Nextflow re-executes the task. If a process modifies its own input files, it cannot be resumed. Avoid processes that modify their own input files, as this is considered an anti-pattern.
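
As an illustration, a hypothetical process that falls into this anti-pattern by editing its staged input in place (the process name and script here are made up for this sketch):

```nextflow
process strip_header {
    input:
    path table

    output:
    stdout

    script:
    """
    # edits the staged input in place; since inputs are typically staged as
    # symlinks, this rewrites the original file and changes its timestamp
    # and size, so the task hash no longer matches on the next run
    sed -i '1d' $table
    wc -l $table
    """
}
```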

### Inconsistent file attributes

- Some shared file systems, such as NFS, may report inconsistent file timestamps, which can invalidate the cache. If you encounter this problem, you can avoid it by using the `'lenient'` {ref}`caching mode <process-cache>`, which ignores the last modified timestamp and uses only the file path and size.
+ Some shared file systems, such as NFS, may report inconsistent file timestamps.
+
+ To resolve this issue, use the `'lenient'` {ref}`caching mode <process-cache>` to ignore the last modified timestamp and use only the file path and size.
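
For example, a minimal sketch of setting the lenient mode on a process (the process name `FOO` follows the earlier example):

```nextflow
process FOO {
    // hash input files by path and size only, ignoring the last modified timestamp
    cache 'lenient'

    // ...
}
```

The same setting can also be applied to all processes with `process.cache = 'lenient'` in your configuration file.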

(cache-global-var-race-condition)=
### Race condition on a global variable

- While Nextflow tries to make it easy to write safe concurrent code, it is still possible to create race conditions, which can in turn impact the caching behavior of your pipeline.
-
- Consider the following example:
+ Race conditions can disrupt the caching behavior of your pipeline. For example:

```nextflow
channel.of(1,2,3) | map { v -> X=v; X+=2 } | view { v -> "ch1 = $v" }
channel.of(1,2,3) | map { v -> X=v; X*=2 } | view { v -> "ch2 = $v" }
```

- The problem here is that `X` is declared in each `map` closure without the `def` keyword (or other type qualifier). Using the `def` keyword makes the variable local to the enclosing scope; omitting the `def` keyword makes the variable global to the entire script.
+ In the above example, `X` is declared in each `map` closure. Without the `def` keyword or another type qualifier, the variable `X` is global to the entire script. Operators are executed concurrently and, because `X` is global, there is a *race condition* that causes the emitted values to vary depending on the order of the concurrent operations. If these values were passed to a process as inputs, the process would execute different tasks during each run due to the race condition.

- Because `X` is global, and operators are executed concurrently, there is a *race condition* on `X`, which means that the emitted values will vary depending on the particular order of the concurrent operations. If the values were passed as inputs into a process, the process would execute different tasks on each run due to the race condition.
-
- The solution is to not use a global variable where a local variable is enough (or in this simple example, avoid the variable altogether):
+ To resolve this failure type, use a local variable:

```nextflow
// local variable
channel.of(1,2,3) | map { v -> def X=v; X+=2 } | view { v -> "ch1 = $v" }
+ ```

- // no variable
+ Alternatively, remove the variable:
+
+ ```nextflow
channel.of(1,2,3) | map { v -> v * 2 } | view { v -> "ch2 = $v" }
```

(cache-nondeterministic-inputs)=
### Non-deterministic process inputs

- Sometimes a process needs to merge inputs from different sources. Consider the following example:
+ A process that merges inputs from different sources non-deterministically may invalidate the cache. For example:

```nextflow
workflow {
@@ -161,9 +170,9 @@ process check_bam_bai {
}
```

- It is tempting to assume that the process inputs will be matched by `id` like the {ref}`operator-join` operator. But in reality, they are simply merged like the {ref}`operator-merge` operator. As a result, not only will the process inputs be incorrect, they will also be non-deterministic, thus invalidating the cache.
+ In the above example, the inputs are merged without being matched by `id`, in the same way as the {ref}`operator-merge` operator. As a result, the process inputs are incorrect and non-deterministic, which invalidates the cache.

- The solution is to explicitly join the two channels before the process invocation:
+ To resolve this failure type, join the channels before invoking the process:
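
The exact fix depends on how the channels are constructed. As a minimal sketch with hypothetical channels keyed by a sample id, the {ref}`operator-join` operator pairs the matching elements before they reach the process:

```nextflow
ch_bam = channel.of( ['sample1', 'sample1.bam'], ['sample2', 'sample2.bam'] )
ch_bai = channel.of( ['sample2', 'sample2.bam.bai'], ['sample1', 'sample1.bam.bai'] )

// join matches tuples by their first element (the id), so each BAM is always
// paired with its own index, regardless of the order in which the channels emit
ch_bam.join(ch_bai).view { v -> "joined: $v" }
```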
0 commit comments