Description
The authmailbox client retries initial connection establishment, but does not retry the subscription/authentication handshake. Add a backoff retry loop around the subscription handshake so transient auth/stream failures don’t permanently fail StartAccountSubscription.
Background / context
- The connection retry logic exists in authmailbox/receive_subscription.go (connectServerStream uses MinBackoff/MaxBackoff and MaxConnectAttempts around MailboxInfo).
- The subscription/auth handshake happens in authmailbox/receive_subscription.go (connectAndAuthenticate sends InitReceive, waits for the challenge, sends AuthSig, then waits for AuthSuccess).
- Proof couriers already implement backoff-retry for delivery/receive in proof/courier.go (BackoffHandler.Exec), and the generic retry helper lives in fn/retry.go.
Problem
If the subscription/auth handshake fails (e.g. stream error, server closes due to auth timeout, transient RPC error), connectAndAuthenticate returns an error and the subscription attempt stops without retrying. This makes startup/subscribe brittle even when the server is reachable and would succeed on a subsequent attempt.
Observed flow (client):
- connectServerStream succeeds.
- Client sends InitReceive.
- Handshake fails before AuthSuccess (timeout, stream error, etc.).
- StartAccountSubscription returns an error; no backoff retry is attempted.
Proposed change
Add an exponential backoff retry loop around the auth subscription step.
High-level behavior:
- On handshake failure, close/cancel the stream, wait with backoff, then retry the whole subscription handshake.
- Respect context cancellation and client shutdown.
- Bound attempts using existing config (or add a dedicated auth retry config if necessary).
Implementation sketch
Suggested approach (minimal config changes):
- Extend receiveSubscription.connectAndAuthenticate to loop for up to MaxConnectAttempts attempts (or a new MaxAuthAttempts if separation is preferred).
- For each attempt:
  - Call connectServerStream (keep its own connection retry/backoff).
  - Perform the auth handshake (InitReceive -> wait for AuthSuccess).
  - If the handshake fails:
    - Call closeStream (or cancel the stream ctx) so serverStream is nil and the read goroutine can exit.
    - Back off (start at MinBackoff, double up to MaxBackoff).
    - Retry unless the context is done or the client is shutting down.
- Consider using fn.RetryFuncN for the retry loop, or keep local backoff logic like connectServerStream.
- Ensure authOkChan and errChan don't leak state across attempts (e.g., drain/reset per attempt or re-create per attempt).
- Optional: add a client-side auth timeout (<= server AuthTimeout) to avoid indefinite waits if the stream stays open but no challenge arrives.
Files to touch
- authmailbox/receive_subscription.go (main change: add handshake retry/backoff)
- authmailbox/client.go (if config changes or helper methods are added)
- authmailbox/mock.go / authmailbox/client_test.go (tests)
- fn/retry.go (only if choosing to reuse RetryFuncN and need config/plumbing)
Test plan
Add/extend tests in authmailbox/client_test.go or a new test file:
- Simulate a server that fails the first auth handshake (e.g., reject signature or force auth timeout) and then succeeds; verify client eventually becomes subscribed.
- Verify that transient handshake failures trigger retries with backoff (use short backoff in tests).
- Ensure IsSubscribed stays false until AuthSuccess is received, and becomes true after a successful retry.
- Confirm Stop() cancels any pending retry loop without leaks.
Acceptance criteria
- Subscription/auth handshake failures are retried with exponential backoff.
- Retries stop on context cancel or client shutdown.
- Stream is cleaned up between attempts (no goroutine leaks; IsSubscribed reflects true only on successful auth).
- Existing connection retry and server-restart reconnect logic continue to work.