Collaborator

rbtr commented Mar 27, 2025

#3529 refactored the HNS restart so that the stop/start could be retried. This PR adds a deadline that causes a stop/start attempt to fail and trigger that retry, and changes the retry strategy from constant delay to exponential backoff.

rbtr force-pushed the fix/backoff-retry-hns branch 2 times, most recently from 60c3734 to 6c690ed, March 27, 2025 19:02
Collaborator Author

rbtr commented Mar 27, 2025

/azp run Azure Container Networking PR

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

rbtr marked this pull request as ready for review March 27, 2025 19:02
Copilot AI review requested due to automatic review settings March 27, 2025 19:02
rbtr requested a review from a team as a code owner March 27, 2025 19:02
rbtr requested a review from sanamsarath March 27, 2025 19:02
Contributor

Copilot AI left a comment

Pull Request Overview

This PR addresses backoff and retry improvements to enhance the stability of HNS restart operations on Windows.

  • Introduces exponential backoff delay in retry operations for stopping and starting the HNS service.
  • Refactors timeout creation in the tryStartServiceFn and tryStopServiceFn functions by using a deadline function.

Comments suppressed due to low confidence (4)

platform/os_windows.go:335

  • [nitpick] The variables 'n' and 'limit' are ambiguous; consider renaming them to 'attemptCount' and 'maxAttempts' to improve clarity.
var n, limit time.Duration = 0, 3

platform/os_windows.go:355

  • [nitpick] The variable 'deadline' shadows the 'deadline' function; consider renaming it (e.g., 'ctxWithTimeout') to avoid confusion.
deadline, cancel := deadline(ctx)

platform/os_windows.go:382

  • [nitpick] Consider renaming 'n' and 'limit' to 'attemptCount' and 'maxAttempts' respectively for consistency and clarity in this context.
var n, limit time.Duration = 0, 3

platform/os_windows.go:402

  • [nitpick] The reuse of the name 'deadline' causes shadowing of the deadline function; renaming it to something like 'ctxWithTimeout' could reduce ambiguity.
deadline, cancel := deadline(ctx)

rbtr force-pushed the fix/backoff-retry-hns branch from 6c690ed to 4ed0f83, March 27, 2025 19:12
rbtr requested a review from Copilot March 27, 2025 19:13
Collaborator Author

rbtr commented Mar 27, 2025

/azp run Azure Container Networking PR

Contributor

Copilot AI left a comment

Pull Request Overview

This PR refactors the HNS restart process by introducing deadlines for service stop/start operations and using an exponential backoff retry strategy. Key changes include:

  • Adding a delay type for exponential backoff in the retry calls.
  • Implementing deadlines with a 90-second timeout for both stopping and starting the service.
  • Modifying the error handling to use the deadline's signal instead of the original context cancellation.

Comments suppressed due to low confidence (1)

platform/os_windows.go:361

  • The error wrap message 'context cancelled' may be misleading for a timeout scenario. Consider changing it to 'deadline exceeded' to better reflect a timeout condition.
case <-deadline.Done():

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

rbtr force-pushed the fix/backoff-retry-hns branch from 4ed0f83 to 1a397d8, March 27, 2025 20:07
Contributor

QxBytes commented Mar 27, 2025

/azp run Azure Container Networking PR

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

rbtr enabled auto-merge March 27, 2025 21:09
rbtr added this pull request to the merge queue Mar 27, 2025
Merged via the queue into master with commit 20d09ab Mar 27, 2025
66 of 89 checks passed
rbtr deleted the fix/backoff-retry-hns branch March 27, 2025 23:36
rbtr added a commit that referenced this pull request Apr 4, 2025
github-merge-queue bot pushed a commit that referenced this pull request May 2, 2025
sivakami-projects pushed a commit that referenced this pull request Oct 23, 2025
fix: backoff retry and timeouts in HNS restart

Signed-off-by: Evan Baker <[email protected]>
vipul-21 pushed a commit that referenced this pull request Oct 23, 2025
fix: backoff retry and timeouts in HNS restart

Signed-off-by: Evan Baker <[email protected]>
5 participants