Skip to content

Comments

fix: interrupt retry sleep on ctx cancel#144

Open
mohitsethia wants to merge 2 commits intogojek:masterfrom
mohitsethia:fix-interrupt-retry-sleep-on-ctx-cancel
Open

fix: interrupt retry sleep on ctx cancel#144
mohitsethia wants to merge 2 commits intogojek:masterfrom
mohitsethia:fix-interrupt-retry-sleep-on-ctx-cancel

Conversation

@mohitsethia
Copy link

@mohitsethia mohitsethia commented Mar 30, 2025

Summary

Currently, when an HTTP request is retried, it sleeps for backoffDuration before making another attempt. However, if the request's context (ctx) gets cancelled during this sleep period, the function does not exit immediately and instead continues to sleep until the duration completes.

This PR fixes the issue by checking for context cancellation during the sleep and interrupting it early, returning an appropriate error instead of waiting unnecessarily.

Changes Introduced

  • Added a check to interrupt sleep if the context is cancelled.
  • Ensured the function exits immediately when cancellation is detected, improving responsiveness.
  • Added/updated test cases to verify correct behavior.

Impact

  • Prevents unnecessary delays when retries are no longer needed.
  • Improves efficiency in request handling.
  • Aligns behavior with expected cancellation semantics.

Testing

  • Verified the fix with unit tests covering both normal and cancelled scenarios.
  • Ensured no regressions in retry logic.

@mohitsethia mohitsethia force-pushed the fix-interrupt-retry-sleep-on-ctx-cancel branch from e52adbc to 55096d1 Compare March 30, 2025 18:10
@mohitsethia
Copy link
Author

Hi @rShetty @sohamkamani @devdinu @gwthm-in , not sure whom I can tag, can you guys please help review this? Thanks!


// SleepInterruptible sleeps until either the timer triggers or context is cancelled
func SleepInterruptible(ctx context.Context, d time.Duration) error {
t := time.NewTimer(d)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given we are updating min golang version to 1.24 in #147. It would be simpler to use https://pkg.go.dev/time#After here

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, thanks for the suggestion.

Even with Go ≥1.24, time.After still relies on GC to reclaim the underlying timer if the context is cancelled before the timeout fires. In high-throughput paths, that makes the timer’s lifetime non-deterministic and can leave timers running longer than necessary.

Using time.NewTimer lets us stop the timer explicitly when ctx.Done() wins, which gives us deterministic cleanup and avoids unnecessary background timers under load. For this reason, I think NewTimer is still the better fit here despite the higher minimum Go version.

Would love to hear your thoughts.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using time.NewTimer is beneficial if we plan to reuse the timer. For one-off sleep scenarios (as in this case, post Go 1.23), there’s no need to call Stop

The runtime’s Stop implementation notes it can only be marked as stopped, and not removed from the heap (ref). In contrast, when the channel is no longer blocked, the runtime actually removes the channel from timer heap (ref). Exiting early, as done here, has more control on the timer’s lifecycle then explicitly calling Stop.

So unless we need to reuse the timer, time.After is a simpler and equally effective choice. And even if we could reuse time.NewTimer, we still need to consider whether the added complexity is justified for optimizing a Infrequent failure case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants