-
Notifications
You must be signed in to change notification settings - Fork 138
pager: add client-side timeout logic #1458
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR implements client-side timeout logic for paged queries to address issue #1418. The implementation adds timeout handling in the PagerWorker abstraction to ensure that queries respect client-side timeouts during page fetching operations.
Key Changes:
- Introduces
PageQueryTimeouterabstraction for clean encapsulation of timeout logic in paged queries - Adds timeout checking for both initial and subsequent page fetches with proper reset between successful pages
- Implements comprehensive integration tests using proxy-based delay injection to validate timeout behavior
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| scylla/src/client/pager.rs | Implements PageQueryTimeouter abstraction and integrates timeout logic into PagerWorker for page fetch operations |
| scylla/src/client/session.rs | Adds timeout logging and renames variable for consistency |
| scylla/src/client/execution_profile.rs | Updates documentation to clarify timeout behavior |
| scylla/tests/integration/statements/request_timeout.rs | Expands timeout tests with proxy-based delays and additional test cases for batch operations |
| scylla/tests/integration/session/pager.rs | Adds comprehensive pager timeout test covering first page, subsequent page, and retry timeout scenarios |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
47a8199 to
b576cdc
Compare
|
Rebased on main. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. One thing I'm don't like too much is nested Result. Maybe it would be clearer if you introduced a new (private) error type which would be either RequestAttemptError or timeout error, and convertible to RequestError? wdyt?
Nooo, this would involve even more boilerplate, and conversions between error types' variants is yet another spot for bugs. As long as we're planning to unify pager and session execution logic, I prefer to not overcomplicate the present state. |
For now, the implementation is quite raw and uses no encapsulation. I decided to keep this commit mainly to show the first approach that I had. The next commit will encapsulate the timeout logic into its own abstraction and make the code cleaner. If reviewers think that it would be cleaner to squash this commit with the next one, I can do that. The timeout is per-page. This means that the clock starts ticking when we begin the first attempt to fetch a page, and subsequent retries of fetching the same page will still be subject to the same timeout. This is to avoid extending the timeout indefinitely with retries, and is in line with how Session's general request timeout works. Once a page is fetched and another one is requested, the timeout is reset for the new page.
PageQueryTimeouter is introduced to encapsulate timeout logic for paging queries. It was clear to me that such encapsulation in a new type improves code readability and maintainability, as the timeout logic is now clearly separated from the rest of the paging logic.
As QueryPager has now reached feature parity with other execution methods wrt timeouts, we need to test that timeouts are properly enforced when fetching pages. The test uses scylla-proxy to simulate delayed responses from the server, causing timeouts to occur in various scenarios: - the first page fetch times out (manifests in `execute_iter()`); - the second page fetch times out (manifests in `rows_stream().next()`); - retries cause cumulative delay exceeding the timeout, leading to timeout.
This makes the next commit cleaner.
The SingleConnectionPagerWorker now supports request timeouts when fetching pages. This is for consistency with the main PagerWorker implementation. Note that for now, no metadata queries specify timeouts, so the default per-Session execution profile's timeout will be used. Nothing prevents us from specifying custom timeouts for metadata queries in the future, thanks to this commit.
b576cdc to
c4ee9bf
Compare
|
@Lorak-mmk I implemented timeouts to the SingleConnectionPagerWorker as well. Tests there as missing, though. I hope you agree that the logic there is clear enough not to require tests. |
| // Case 1: the first page fetch times out. | ||
| { | ||
| let timeout = Duration::from_secs(1); | ||
| prepared.set_request_timeout(Some(timeout)); | ||
|
|
||
| running_proxy.running_nodes.iter_mut().for_each(|node| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this timeout so big?
| // Case 2: the second page fetch times out. | ||
| { | ||
| let timeout = Duration::from_secs(1); | ||
| prepared.set_request_timeout(Some(timeout)); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
| // Here, each retry will be delayed by 200ms. | ||
| // With a 500ms timeout, this means that after 3 retries (600ms total delay), | ||
| // the timeout will be exceeded. | ||
| let per_retry_delay = Duration::from_millis(200); | ||
| let timeout = Duration::from_millis(500); | ||
|
|
||
| // Set timeout through the execution profile. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
|
I was actually meaning to request changes, not approve - I think its important to not make tests slower than they need to be. |
Based on: #1457
Start review from
pager: introduce timeouts for paging requests.First two commits implement the timeouting logic in the
PagerWorkerabstraction:PageTimeouter, for clean encapsulation.The third commit introduces a (quite basic) integration test for the added functionality. I failed to use tokio timer manual control for this goal, so instead I leverage the proxy to inject delays.
Fixes: #1418
Pre-review checklist
[ ] I have provided docstrings for the public items that I want to introduce.[ ] I have adjusted the documentation in./docs/source/.Fixes:annotations to PR description.