Workflow Completion
===================

- A workflow can :term:`shut down <shutdown>` once all
- :term:`active tasks <active task>` complete without spawning further
- downstream activity - i.e., when :term:`n=0 window <n-window>` empties out.
+ Once Cylc has run all of the tasks in the :term:`graph` (i.e. once it has
+ reached the end of the workflow and there are no tasks left), the workflow
+ will shut down automatically.
+
+ A workflow with no tasks left is said to have "completed".
+
+ When you restart a workflow, it will restart in the same state it shut down in.
+ So if you restart a completed workflow (one with no remaining tasks), it will
+ come back with no tasks. Having no more tasks to run, the workflow will
+ automatically shut down after the configured
+ :cylc:conf:`restart timeout <[scheduler][events]restart timeout>`.
+
+ If you want to re-run some tasks in a completed workflow, restart the workflow
+ then
+ :ref:`re-trigger the selected tasks <interventions.re-run-multiple-tasks>`
+ or :ref:`trigger a new flow <interventions.reflow>` to run through the graph
+ (before the restart timeout passes).
+
+ A common pattern is to restart a completed workflow and extend it for a few
+ cycles. The easiest way to achieve this is to use the
+ :cylc:conf:`stop after cycle point <[scheduling]stop after cycle point>`
+ rather than the
+ :cylc:conf:`final cycle point <[scheduling]final cycle point>`; this prevents
+ the workflow from completing, making it easier to restart it from where it
+ left off. For a worked example, see :ref:`examples.extending-workflow`.
+

.. _scheduler stall:

Scheduler Stall
===============

- A workflow has stalled if:
+ If Cylc is unable to make progress through the :term:`graph` (i.e., if the path
+ through the graph is "blocked"), then the workflow is considered
+ :term:`stalled <stall>`.

- * No tasks are waiting on unsatisfied external events, like clock triggers and xtriggers.
- * AND All activity has ceased.
- * AND The workflow has not run to completion.
+ Stalls are usually caused by unexpected task failures.

- A workflow which has stalled requires manual intervention to continue.
+ A stalled workflow has not run to completion but cannot continue without manual
+ intervention. Typically this involves
+ :ref:`fixing and rerunning a failed task <interventions.edit-a-tasks-configuration>`.
+
+
+ Stall Conditions
+ ----------------
+
+ A workflow has stalled if:
+
+ * The workflow has not run to completion (i.e., there are still tasks left
+   for Cylc to run).
+ * AND no tasks are waiting on unsatisfied
+   :ref:`external events <Section External Triggers>` (e.g., clock triggers
+   and xtriggers).
+ * AND all activity has ceased (i.e., no preparing, submitted or running tasks).

Stalls are caused by :term:`final status incomplete tasks <output completion>`
and :term:`partially satisfied tasks <prerequisite>`.

These most often result from task failures that the workflow does not
- handle automatically by retries or optional branching.
+ handle automatically by :term:`retries <retry>` or :term:`graph branching`.
+
+
+ Diagnosing Stalls
+ -----------------
+
+ A screenshot of the Cylc GUI displaying a stalled workflow:
+
+ .. image:: ../../img/gui-stall.png
+    :align: center
+    :width: 90%
+
+ |
+
+ In the above screenshot:
+
+ * The stall was caused by the failure of the task ``2/a``.
+ * The stall event is recorded in the :term:`workflow log` file (shown on the
+   right) along with the list of :term:`incomplete tasks <output completion>`
+   that caused it (2/a did not complete the required outputs: succeeded).
+ * In the GUI, the :ref:`warning triangle <changes.warning_triangles>`
+   will light up to notify you of the error; hover over it to see the log
+   messages.
+
+
+ Stall Timeouts
+ --------------

A stalled scheduler stays alive for a configurable timeout period
to allow you to intervene, e.g. by manually triggering an incomplete
@@ -34,12 +98,28 @@ If a stalled workflow does eventually shut down, on the stall timeout
or by stop command, it will immediately stall again on restart to await
manual intervention.

- .. warning::
+ Stall timeout behaviour is controlled by the following configurations:
+
+ .. admonition:: Configuration
+    :class: note
+
+    :cylc:conf:`[scheduler][events]stall timeout`
+       The length of time before a stalled workflow will shut down.
+    :cylc:conf:`[scheduler][events]abort on stall timeout`
+       Whether the scheduler should shut down immediately with error status if
+       the stall timeout is reached.
+
+
+ Stall Events
+ ------------

-    Look in the :term:`scheduler log` to see which tasks caused a stall.
+ Cylc emits the :ref:`stall <user_guide.workflow_events.stall>` event when a
+ scheduler stalls.

- .. seealso::
+ .. admonition:: Configuration
+    :class: note

-    * :cylc:conf:`[scheduler][events]stall timeout`
-    * :cylc:conf:`[scheduler][events]abort on stall timeout`
-    * :cylc:conf:`[scheduler][events]stall handlers`
+    :cylc:conf:`[scheduler][events]mail events = stall`
+       Configure emails for stall events.
+    :cylc:conf:`[scheduler][events]stall handlers`
+       Configure custom event handlers to run on stall events.
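
For illustration only, the stall settings named in this change could be combined in a workflow's ``flow.cylc`` roughly as follows. This is a minimal sketch and not part of the diff itself; the timeout value and the choice to enable stall emails are assumptions made for the example, not recommendations from the documentation.

```cylc
[scheduler]
    [[events]]
        # Assumed value: keep a stalled scheduler alive for one hour
        # so that there is time to intervene manually.
        stall timeout = PT1H
        # Shut down with error status once the stall timeout is reached.
        abort on stall timeout = True
        # Send an email when the stall event is emitted.
        mail events = stall
```

With settings like these, a workflow that stalls would wait for the timeout period before shutting down, and would stall again on restart until the blocking tasks are dealt with.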