fix: interrupt retry sleep on ctx cancel #144
mohitsethia wants to merge 2 commits into gojek:master
Force-pushed from e52adbc to 55096d1.
Hi @rShetty @sohamkamani @devdinu @gwthm-in, I'm not sure whom to tag. Could you please help review this? Thanks!
// SleepInterruptible sleeps until either the timer triggers or context is cancelled
func SleepInterruptible(ctx context.Context, d time.Duration) error {
	t := time.NewTimer(d)
Given that we are updating the minimum Go version to 1.24 in #147, it would be simpler to use time.After (https://pkg.go.dev/time#After) here.
Hi, thanks for the suggestion.
Even with Go ≥1.24, time.After still relies on GC to reclaim the underlying timer if the context is cancelled before the timeout fires. In high-throughput paths, that makes the timer’s lifetime non-deterministic and can leave timers running longer than necessary.
Using time.NewTimer lets us stop the timer explicitly when ctx.Done() wins, which gives us deterministic cleanup and avoids unnecessary background timers under load. For this reason, I think NewTimer is still the better fit here despite the higher minimum Go version.
Would love to hear your thoughts.
Using time.NewTimer is beneficial if we plan to reuse the timer. For one-off sleep scenarios (as in this case, post Go 1.23), there’s no need to call Stop.
The runtime’s Stop implementation notes that the timer can only be marked as stopped, not removed from the heap (ref). In contrast, when the channel is no longer blocked, the runtime actually removes the channel from the timer heap (ref). Exiting early, as done here, gives more control over the timer’s lifecycle than explicitly calling Stop.
So unless we need to reuse the timer, time.After is a simpler and equally effective choice. And even if we could reuse time.NewTimer, we still need to consider whether the added complexity is justified for optimizing an infrequent failure case.
Summary
Currently, when an HTTP request is retried, it sleeps for backoffDuration before making another attempt. However, if the request's context (ctx) gets cancelled during this sleep period, the function does not exit immediately and instead continues to sleep until the duration completes.
This PR fixes the issue by checking for context cancellation during the sleep and interrupting it early, returning an appropriate error instead of waiting unnecessarily.
Changes Introduced
Impact
Testing