@@ -139,8 +139,92 @@ Time related functions:
Failing a Workflow

- To mark a workflow as failed all that needs to happen is for the workflow function to return an error via the err
- return value.
+ To mark a workflow as failed, return an error from your workflow function via the err return value.
+ Note that a failed workflow does not record its non-error return value: you cannot usefully return both a
+ value and an error, because only the error will be recorded.
+
+ Ending a Workflow externally
+
+ From inside a workflow, the only way to end it is to finish your function by returning a result or an error.
+
+ Two tools exist to stop a workflow from outside the workflow itself, using the CLI or RPC client:
+ cancellation and termination. Termination is forceful; cancellation allows a workflow to exit gracefully.
+
+ Workflows can also time out, based on their ExecutionStartToClose duration. A timeout behaves the same as
+ termination (it is a hard deadline on the workflow), but a different close status and final event will be reported.
+
+ Terminating a Workflow
+
+ Terminating is roughly equivalent to using `kill -9` on a process: the workflow will be ended immediately,
+ and no further decisions will be made. It cannot be prevented or delayed by the workflow, or by any configuration.
+ Any in-progress decisions or activities will fail whenever they next communicate with Cadence's servers, i.e. when
+ they complete or when they next heartbeat.
+
+ Because termination does not allow any further code to run, your workflow has no
+ chance to clean up after itself (e.g. running a cleanup Activity to adjust a database record).
+ If you need to run additional logic when your workflow ends, use cancellation instead.
+
+ Canceling a Workflow
+
+ Canceling marks a workflow as canceled (this is a one-time, one-way operation), and immediately wakes the workflow
+ up to process the cancellation (schedules a new decision task). When the workflow resumes after being canceled,
+ the context that was passed into the workflow (and thus all derived contexts) will be canceled, which changes the
+ behavior of many workflow.* functions.
+
+ Canceled workflow.Context behavior
+
+ A workflow's context can be canceled either by canceling the workflow, or by calling the cancel-func returned from
+ a workflow.WithCancel(ctx) call. Both behave identically.
+
+ At any time, you can convert a canceled (or could-be-canceled) context into a non-canceled context by using
+ workflow.NewDisconnectedContext. The resulting context will ignore cancellation from the context it is derived from.
+ Disconnected contexts like this can be created before or after a context has been canceled, and it does not matter
+ how the cancellation occurred.
+ Because a disconnected context will not be canceled, you can use context cancellation as a way to request that
+ some behavior be shut down, while allowing you to run cleanup logic in activities or elsewhere.
+
+ As a general guideline, anything that performs I/O with a canceled context (e.g. executing an activity, starting a
+ child workflow, sleeping) will fail rather than cause external changes. Functions that change their behavior with a
+ canceled context describe this in their documentation; if a function's documentation does not mention
+ canceled-context behavior, its behavior does not change.
+ For exact behavior, make sure to read the documentation on the functions that you are calling.
+
+ As an incomplete summary, these actions will all fail immediately, and the associated error returns (possibly within
+ a Future) will be a workflow.CanceledError:
+
+ - workflow.Await
+ - workflow.Sleep
+ - workflow.NewTimer
+
+ Child workflows behave as follows:
+
+ - ExecuteChildWorkflow will synchronously fail with a CanceledError if the context is canceled before it is called
+ (in v0.18.4 and newer; see https://github.com/uber-go/cadence-client/pull/1138 for details)
+ - a child workflow that is already running will be canceled
+ - the child's future.Get will not return until the child returns, and the future will contain the final result
+ (which may be anything that was returned, not necessarily a CanceledError)
+
+ Activities have configurable cancellation behavior. For workflow.ExecuteActivity and workflow.ExecuteLocalActivity,
+ see the activity package's documentation for details. In summary:
+
+ - ExecuteActivity will synchronously fail with a CanceledError if the context is canceled before it is called
+ - the activity's future.Get will by default return a CanceledError immediately when canceled,
+ unless ActivityOptions.WaitForCancellation is true
+ - the activity's context will be canceled at the next heartbeat event, or not at all if that does not occur
+
+ Actions like these will be completely unaffected:
+
+ - future.Get
+ (futures derived from the calls above may return a CanceledError, but this is not guaranteed for all futures)
+ - selector.Select
+ (Select is completely unaffected, similar to a native select statement. If you wish to unblock when your
+ context is canceled, consider using an AddReceive with the context's Done() channel, as with a native select)
+ - channel.Send, channel.Receive, and channel.ReceiveAsync
+ (similar to native chan read/write operations; use a selector to wait for send/receive or some other action)
+ - workflow.Go
+ (the context argument in the callback is derived and may be canceled, but this does not stop the goroutine,
+ nor stop new ones from being started)
+ - workflow.GetVersion, workflow.GetLogger, workflow.GetMetricsScope, workflow.Now, many others

Execute Activity

@@ -286,14 +370,14 @@ pattern, extra care needs to be taken to ensure the child workflow is started be
Error Handling

Activities and child workflows can fail. You could handle errors differently based on different error cases. If the
- activity returns an error as errors.New() or fmt.Errorf(), those errors will be converted to error.GenericError. If the
- activity returns an error as error.NewCustomError("err-reason", details), that error will be converted to
- *error.CustomError. There are other types of errors like error.TimeoutError, error.CanceledError and error.PanicError.
+ activity returns an error as errors.New() or fmt.Errorf(), those errors will be converted to workflow.GenericError. If the
+ activity returns an error as workflow.NewCustomError("err-reason", details), that error will be converted to
+ *workflow.CustomError. There are other types of errors like workflow.TimeoutError, workflow.CanceledError and workflow.PanicError.
So the error handling code would look like:

err := workflow.ExecuteActivity(ctx, YourActivityFunc).Get(ctx, nil)
switch err := err.(type) {
- case *error.CustomError:
+ case *workflow.CustomError:
switch err.Reason() {
case "err-reason-a":
// handle error-reason-a
@@ -305,7 +389,7 @@ So the error handling code would look like:
default:
// handle all other error reasons
}
- case *error.GenericError:
+ case *workflow.GenericError:
switch err.Error() {
case "err-msg-1":
// handle error with message "err-msg-1"
@@ -314,7 +398,7 @@ So the error handling code would look like:
default:
// handle all other generic errors
}
- case *error.TimeoutError:
+ case *workflow.TimeoutError:
switch err.TimeoutType() {
case shared.TimeoutTypeScheduleToStart:
// handle ScheduleToStart timeout
@@ -324,9 +408,9 @@ So the error handling code would look like:
// handle heartbeat timeout
default:
}
- case *error.PanicError:
- // handle panic error
- case *error.CanceledError:
+ case *workflow.PanicError:
+ // handle panic error
+ case *workflow.CanceledError:
// handle canceled error
default:
// all other cases (ideally, this should not happen)
@@ -530,7 +614,7 @@ The code below implements the unit tests for the SimpleWorkflow sample.
s.True(s.env.IsWorkflowCompleted())

s.NotNil(s.env.GetWorkflowError())
- _, ok := s.env.GetWorkflowError().(*error.GenericError)
+ _, ok := s.env.GetWorkflowError().(*workflow.GenericError)
s.True(ok)
s.Equal("SimpleActivityFailure", s.env.GetWorkflowError().Error())
}
@@ -591,7 +675,7 @@ Lets first take a look at a test that simulates a test failing via the "activity
s.True(s.env.IsWorkflowCompleted())

s.NotNil(s.env.GetWorkflowError())
- _, ok := s.env.GetWorkflowError().(*error.GenericError)
+ _, ok := s.env.GetWorkflowError().(*workflow.GenericError)
s.True(ok)
s.Equal("SimpleActivityFailure", s.env.GetWorkflowError().Error())
}