-
Notifications
You must be signed in to change notification settings - Fork 5
Add RPC Retries via interceptor #24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cadence/_internal/rpc/retry.py
Outdated
|
|
||
| def is_retryable(err: CadenceError, call_details: ClientCallDetails) -> bool: | ||
| # Handle requests to the passive side, matching the Go and Java Clients | ||
| if call_details.method == b'/uber.cadence.api.v1.WorkflowAPI/GetWorkflowExecutionHistory' and isinstance(err, EntityNotExistsError): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tbh not sure I understand what would trigger this besides replication lag races?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's exactly what it's for. If you Start a Workflow and want to poll until it completes you have to do this.
cadence/_internal/rpc/retry.py
Outdated
| except CadenceError as e: | ||
| err = e | ||
| attempts = 1 | ||
| while is_retryable(err, client_call_details): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: simplify using do while loop by using break/continue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, good call
cadence/_internal/rpc/retry.py
Outdated
|
|
||
| def is_retryable(err: CadenceError, call_details: ClientCallDetails) -> bool: | ||
| # Handle requests to the passive side, matching the Go and Java Clients | ||
| if call_details.method == b'/uber.cadence.api.v1.WorkflowAPI/GetWorkflowExecutionHistory' and isinstance(err, EntityNotExistsError): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to use DESCRIPTOR but that's too complicated. Maybe at least add a variable to replace the string here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Add RPC retries via an interceptor, with exponential backoff matching both the Go and Java clients. The approach here differs from both however in two main ways: 1. Adding these via interceptor makes it implicit, while both the Go and Java client require it to be explicit. 2. The specific requests to retry are based on GRPC error codes, rather than explicitly listing non-retryable errors and retrying everything by default. This seems like a more sustainable approach, since nearly every error type is non-retryable. A newly introduced error type would require a client update to mark it non-retryable before it could safely be used. Any time the python client doesn't recognize an error it gets mapped to just CadenceError, so new errors can safely be added.
Add RPC retries via an interceptor, with exponential backoff matching both the Go and Java clients. The approach here differs from both however in two main ways:
What changed?
Why?
How did you test it?
Potential risks
Release notes
Documentation Changes