Commit e5189bf

Add retries to WaitForInstanceCompletion (#412)
For certain backends (like DTS), there can be a hard limit of 4 minutes for idle requests. Because a WaitForInstanceCompletion call sits idle while waiting, it can receive a 504. This change lets the SDK retry instead of returning that error to the caller.

Signed-off-by: halspang <[email protected]>
1 parent 77af874 commit e5189bf

2 files changed: +26 -10 lines changed

CHANGELOG.md

Lines changed: 7 additions & 1 deletion
@@ -1,9 +1,15 @@
 # Changelog
 
+## (Unreleased)
+
+- Add automatic retry on gateway timeout in `GrpcDurableTaskClient.WaitForInstanceCompletionAsync` in [#412](https://github.com/microsoft/durabletask-dotnet/pull/412))
+
 ## v1.10.0
-- Update DurableTask.Core to v3.1.0 and Bump version to v1.10.0 by @nytian in([#411](https://github.com/microsoft/durabletask-dotnet/pull/411))
+
+- Update DurableTask.Core to v3.1.0 and Bump version to v1.10.0 by @nytian in ([#411](https://github.com/microsoft/durabletask-dotnet/pull/411))
 
 ## v1.9.1
+
 - Add basic orchestration and activity execution logs by @cgillum in ([#405](https://github.com/microsoft/durabletask-dotnet/pull/405))
 - Add default version in `TaskOrchestrationContext` by @halspang in ([#408](https://github.com/microsoft/durabletask-dotnet/pull/408))
 

src/Client/Grpc/GrpcDurableTaskClient.cs

Lines changed: 19 additions & 9 deletions
@@ -348,17 +348,27 @@ public override async Task<OrchestrationMetadata> WaitForInstanceCompletionAsync
             GetInputsAndOutputs = getInputsAndOutputs,
         };
 
-        try
+        while (!cancellation.IsCancellationRequested)
         {
-            P.GetInstanceResponse response = await this.sidecarClient.WaitForInstanceCompletionAsync(
-                request, cancellationToken: cancellation);
-            return this.CreateMetadata(response.OrchestrationState, getInputsAndOutputs);
-        }
-        catch (RpcException e) when (e.StatusCode == StatusCode.Cancelled)
-        {
-            throw new OperationCanceledException(
-                $"The {nameof(this.WaitForInstanceCompletionAsync)} operation was canceled.", e, cancellation);
+            try
+            {
+                P.GetInstanceResponse response = await this.sidecarClient.WaitForInstanceCompletionAsync(
+                    request, cancellationToken: cancellation);
+                return this.CreateMetadata(response.OrchestrationState, getInputsAndOutputs);
+            }
+            catch (RpcException e) when (e.StatusCode == StatusCode.Cancelled)
+            {
+                throw new OperationCanceledException(
+                    $"The {nameof(this.WaitForInstanceCompletionAsync)} operation was canceled.", e, cancellation);
+            }
+            catch (RpcException e) when (e.StatusCode == StatusCode.DeadlineExceeded)
+            {
+                // Gateway timeout/deadline exceeded can happen before the request is completed. Do nothing and retry.
+            }
         }
+
+        // If the operation was cancelled in between requests, we should still throw instead of returning a null value.
+        throw new OperationCanceledException($"The {nameof(this.WaitForInstanceCompletionAsync)} operation was canceled.");
     }
 
     /// <inheritdoc/>
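
The retry loop in the diff above can be tried in isolation. The following is a minimal sketch, not the SDK's code: `WaitRetrySketch`, `WaitWithRetryAsync`, and `DemoAsync` are hypothetical names, and `TimeoutException` is an assumed stand-in for an `RpcException` with `StatusCode.DeadlineExceeded`, so the example runs without the gRPC packages.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class WaitRetrySketch
{
    // Hypothetical helper mirroring the loop shape in the diff: re-issue a
    // long-running wait whenever a single attempt times out.
    // TimeoutException stands in for RpcException with DeadlineExceeded.
    public static async Task<T> WaitWithRetryAsync<T>(
        Func<CancellationToken, Task<T>> waitOnce,
        CancellationToken cancellation)
    {
        while (!cancellation.IsCancellationRequested)
        {
            try
            {
                return await waitOnce(cancellation);
            }
            catch (TimeoutException)
            {
                // The idle request was cut off before completion; try again.
            }
        }

        // Cancellation was observed between attempts, so throw rather than
        // fall off the end of the method without a result.
        throw new OperationCanceledException("The wait operation was canceled.", cancellation);
    }

    // Simulates a backend that times out twice and then reports completion.
    public static async Task<(string Result, int Attempts)> DemoAsync()
    {
        int attempts = 0;
        string result = await WaitWithRetryAsync<string>(
            _ =>
            {
                attempts++;
                if (attempts < 3)
                {
                    throw new TimeoutException("simulated gateway timeout");
                }

                return Task.FromResult("Completed");
            },
            CancellationToken.None);
        return (result, attempts);
    }
}
```

As in the diff, the loop has no attempt limit and no delay between attempts, so the cancellation token is the only bound on the total wait. A caller who needs an upper limit would pass a token that cancels after a timeout, for example one from `new CancellationTokenSource(TimeSpan.FromMinutes(30))`.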
