-
Notifications
You must be signed in to change notification settings - Fork 2.5k
fix(pool): wip, pool reauth should not interfere with handoff #3547
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
ef20907
to
a65a724
Compare
a65a724
to
5fe0bfa
Compare
c9e9e6c
to
8a629fb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR addresses a concurrency issue where pool re-authentication could interfere with connection handoff operations during maintenance notifications. The fix introduces connection state management using Usable
and Used
flags to coordinate between background operations (re-authentication) and handoff processes.
Key changes:
- Added connection-level credentials listener caching to prevent duplicate listeners per connection
- Introduced
Used
flag alongside existingUsable
flag to prevent concurrent command execution on connections - Modified re-authentication flow to wait for connections to be available before proceeding
Reviewed Changes
Copilot reviewed 11 out of 12 changed files in this pull request and generated 5 comments.
Show a summary per file
File | Description |
---|---|
redis.go | Added credentials listener management with caching per connection and modified re-authentication callbacks to accept connection parameter |
internal/pool/conn.go | Added Used flag and reorganized connection state fields, modified handoff marking logic to acquire connection before state changes |
internal/pool/pool.go | Updated connection usage tracking with Used flag in pool operations |
internal/pool/pool_single.go | Added Used flag management for single connection pools |
auth/conn_reauth_credentials_listener.go | New file implementing connection-aware credentials listener with timeout-based waiting for connection availability |
error.go | Added ErrConnUnusableTimeout to bad connection error handling |
redis_test.go | Added mutex protection to mock streaming provider |
sentinel_test.go | Renamed test function for clarity |
internal/pool/pool_test.go | Adjusted test expectations for dial retry behavior |
maintnotifications/handoff_worker.go | Added formatting and clarifying comments |
internal/pool/buffer_size_test.go | Updated unsafe pointer field layout to match new connection structure |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 12 out of 13 changed files in this pull request and generated 4 comments.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
// the delay will be 1, 2, 4... up to 512 microseconds | ||
// Moving this to the top of the loop to avoid "continue" without delay | ||
if attempt > 0 && attempt < maxRetries-1 { | ||
delay := baseDelay * time.Duration(1<<uint(attempt%10)) // Cap exponential growth |
Copilot
AI
Oct 15, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The magic number 10
for capping exponential growth is unexplained. Add a named constant like maxExponentShift = 10
with a comment explaining why this value prevents excessive delays while maintaining backoff effectiveness.
Copilot uses AI. Check for mistakes.
err = pool.ErrConnUnusableTimeout | ||
default: | ||
// small sleep to avoid busy looping | ||
time.Sleep(100 * time.Microsecond) |
Copilot
AI
Oct 15, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sleep duration 100 * time.Microsecond
is duplicated and hard-coded. Extract to a named constant like checkUsablePollInterval = 100 * time.Microsecond
for better maintainability and to make the polling strategy more explicit.
Copilot uses AI. Check for mistakes.
conn: conn, | ||
reAuth: reAuth, | ||
onErr: onErr, | ||
checkUsableTimeout: 1 * time.Second, |
Copilot
AI
Oct 15, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default timeout of 1 * time.Second
is hard-coded. Extract to a named constant like defaultCheckUsableTimeout = 1 * time.Second
to make it easier to maintain and document the default behavior.
Copilot uses AI. Check for mistakes.
Co-authored-by: Copilot <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 12 out of 13 changed files in this pull request and generated 1 comment.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
Co-authored-by: Copilot <[email protected]>
5f8a0b9
to
0b2baa1
Compare
0b2baa1
to
acb55d8
Compare
6cbbdd5
to
d74671b
Compare
be6c8b6
to
4eac8f8
Compare
4eac8f8
to
afba8c2
Compare
11afe91
to
f14095b
Compare
r.shouldReAuthLock.Lock() | ||
delete(r.shouldReAuth, conn.GetID()) | ||
r.shouldReAuthLock.Unlock() | ||
r.ClearReAuthMark(conn.GetID()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not needed
r.scheduledLock.RUnlock() | ||
// has scheduled reauth, reject the connection | ||
if ok && hasScheduled { | ||
// simply reject the connection, it will be re-authenticated in OnPut |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
incorrect comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 22 out of 23 changed files in this pull request and generated 2 comments.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 22 out of 23 changed files in this pull request and generated 4 comments.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
} else { | ||
// Release Usable and retry | ||
conn.Usable.Store(true) | ||
time.Sleep(time.Millisecond) |
Copilot
AI
Oct 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed 1ms sleep in tight retry loop can cause unnecessary delays and poor performance. Consider using exponential backoff or a shorter initial delay (e.g., 10-100 microseconds) with gradual increase, similar to the pattern used in MarkQueuedForHandoff
at line 490.
Copilot uses AI. Check for mistakes.
time.Sleep(time.Millisecond) | ||
} | ||
} else { | ||
time.Sleep(time.Millisecond) |
Copilot
AI
Oct 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed 1ms sleep in tight retry loop can cause unnecessary delays and poor performance. This should use exponential backoff similar to the retry pattern in MarkQueuedForHandoff
to reduce contention and improve responsiveness.
Copilot uses AI. Check for mistakes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 22 out of 23 changed files in this pull request and generated 3 comments.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
if p.stickyErr != nil { | ||
return nil, p.stickyErr | ||
} | ||
if p.cn == nil { | ||
return nil, ErrClosed | ||
} |
Copilot
AI
Oct 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The nil check for p.cn
is performed after checking p.stickyErr
, which means if both p.cn
is nil and p.stickyErr
is set, the function will return the sticky error instead of ErrClosed
. This could mask the fact that the connection is nil. Consider checking p.cn
for nil before checking p.stickyErr
.
if p.stickyErr != nil { | |
return nil, p.stickyErr | |
} | |
if p.cn == nil { | |
return nil, ErrClosed | |
} | |
if p.cn == nil { | |
return nil, ErrClosed | |
} | |
if p.stickyErr != nil { | |
return nil, p.stickyErr | |
} |
Copilot uses AI. Check for mistakes.
@@ -1,9 +1,8 @@ | |||
package pool_test | |||
|
|||
import ( |
Copilot
AI
Oct 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The import of net
package was removed, but net.Conn
is still referenced in the comment at line 142 as 'net.Conn'. While this is just a comment, it could cause confusion. Consider either keeping the import or updating the comment to remove the package reference.
Copilot uses AI. Check for mistakes.
} else { | ||
// Release Usable and retry | ||
conn.Usable.Store(true) | ||
time.Sleep(time.Millisecond) |
Copilot
AI
Oct 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The busy-wait loop with a fixed 1ms sleep could cause unnecessary CPU usage and delays. Consider using exponential backoff similar to the pattern used in MarkQueuedForHandoff()
in conn.go, starting with shorter intervals and increasing gradually.
Copilot uses AI. Check for mistakes.
No description provided.