[client] Refactor WG endpoint setup with role-based proxy activation#5277
[client] Refactor WG endpoint setup with role-based proxy activation#5277
Conversation
|
Important Review skippedThis PR was authored by the user configured for CodeRabbit reviews. By default, CodeRabbit skips reviewing PRs authored by this user. It's recommended to use a dedicated user account to post CodeRabbit review feedback. To trigger a single review, invoke the You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughEndpoint configuration API and control flow were changed to support relay-backup awareness and conditional WireGuard proxy activation; workaround and several direct proxy starts were removed or deferred; new endpoint control methods (SwitchWGEndpoint, RemoveEndpointAddress) were added; websocket dependency bumped. Changes
Sequence Diagram(s)sequenceDiagram
participant ICE as ICE
participant Conn as Conn (conn.go)
participant Endpoint as EndpointUpdater (endpoint.go)
participant WGProxy as wgProxy
participant WG as WireGuard
ICE->>Conn: onICEConnectionIsReady()
Conn->>Endpoint: ConfigureWGEndpoint(addr, presharedKey, setEndpointNow)
alt setEndpointNow == true
Endpoint->>WG: update endpoint immediately
Conn->>WGProxy: allow/start Work()
WGProxy->>WG: proxy/redirect to new endpoint
else setEndpointNow == false
Endpoint->>WG: schedule delayed update (nil/placeholder)
Conn->>WGProxy: defer Work() until endpoint applied
Endpoint-->>Conn: later apply real endpoint update
Conn->>WGProxy: start Work() after endpoint applied
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
d83d029 to
a406171
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
client/internal/peer/conn.go (1)
436-445:⚠️ Potential issue | 🟠 MajorDirect
WgInterface.RemoveEndpointAddresscall bypassesEndpointUpdater— pending delayed update may race.Line 443 calls
RemoveEndpointAddressdirectly onWgInterface, bypassing theEndpointUpdater. If a responder delayed update was previously scheduled (from the relay path), it won't be cancelled here and could fire after this removal, re-establishing a stale endpoint.The
ifbranch (line 436) correctly goes throughendpointUpdater.ConfigureWGEndpointwhich cancels delayed updates, but theelsebranch doesn't. Consider usingendpointUpdater.RemoveEndpointAddress()here instead (after fixing it to cancel delayed updates, as noted in the endpoint.go review).The same concern applies to
handleRelayDisconnectedLockedat line 542.
🤖 Fix all issues with AI agents
In `@client/internal/peer/endpoint.go`:
- Around line 68-70: RemoveEndpointAddress currently calls
e.wgConfig.WgInterface.RemoveEndpointAddress without taking e.mu or cancelling
any pending delayed updates, which allows a concurrent scheduleDelayedUpdate
goroutine to call updateWireGuardPeer and re-add the endpoint; fix by acquiring
e.mu at the start (same lock discipline as RemoveWgPeer/SwitchWGEndpoint), call
e.waitForCloseTheDelayedUpdate() before invoking
e.wgConfig.WgInterface.RemoveEndpointAddress(e.wgConfig.RemoteKey), then perform
the removal and release the lock so any pending delayed update is
cancelled/serialized correctly.
🧹 Nitpick comments (4)
client/internal/peer/endpoint.go (3)
72-77:configureAsInitiatoris a trivial pass-through.This method adds no logic beyond what
updateWireGuardPeeralready does. Consider inlining it into theConfigureWGEndpointcaller to reduce indirection, or at least documenting why the separate method exists (e.g., symmetry withconfigureAsResponderfor future extension).
138-142:wgConfigWorkaroundis defined here but only called fromconn.go— consider co-locating or documenting the coupling.This package-level function with a bare
time.Sleephas no documentation of who should call it or when. The AI summary indicates the old call sites inconn.gowere removed, yetconn.goline 402 still calls it. If this is intended to be the single canonical location for the workaround, a brief doc comment explaining the expected caller would help.
37-48: Naming:setEndpointNowis slightly misleading for the initiator path.When
e.initiatoris true,setEndpointNowis ignored entirely (the initiator always updates immediately). This could confuse callers into thinking the flag controls behavior in all cases. A brief doc comment onConfigureWGEndpointclarifying thatsetEndpointNowonly affects the responder path would help.client/internal/peer/conn.go (1)
395-402: Semantics ofhasRelayBackupassetEndpointNow— works correctly but is non-obvious.
hasRelayBackup(whether a relay proxy exists) is passed assetEndpointNowtoConfigureWGEndpoint. This means: if the responder already has a relay backup, skip the delayed-update dance and set the ICE endpoint immediately. The logic is sound (the fallback is already in place), but the semantic mapping is implicit. A one-line comment here would help future readers understand why "has relay backup" ⟹ "set endpoint now".
For relay connections, the controller (initiator) now activates the wgProxy before configuring the WG endpoint, while the non-controller (responder) configures the endpoint first with a delayed update, then activates the proxy after. This prevents the responder from sending traffic through the proxy before WireGuard is ready to receive it, avoiding handshake congestion when both sides try to initiate simultaneously. For ICE connections, pass hasRelayBackup as the setEndpointNow flag so the responder sets the endpoint immediately when a relay fallback exists (avoiding the delayed update path since relay is already available as backup). On ICE disconnect with relay fallback, remove the duplicate wgProxyRelay.Work() calls — the relay proxy is already active from initial setup, so re-activating it is unnecessary. In EndpointUpdater, split ConfigureWGEndpoint into explicit configureAsInitiator and configureAsResponder paths, and add the setEndpointNow parameter to let the caller control whether the responder applies the endpoint immediately or defers it. Add unused SwitchWGEndpoint and RemoveEndpointAddress methods. Remove the wgConfigWorkaround sleep from the relay setup path.
a406171 to
88086fd
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@client/internal/peer/conn.go`:
- Around line 395-401: The code is passing hasRelayBackup as the setEndpointNow
flag to conn.endpointUpdater.ConfigureWGEndpoint which is backwards; when a
relay backup exists we should NOT apply the ICE endpoint immediately. Change the
third argument to the inverse of hasRelayBackup (e.g., setEndpointNow :=
!hasRelayBackup and pass that) when calling
conn.endpointUpdater.ConfigureWGEndpoint in the ICE path (the block using
conn.presharedKey(iceConnInfo.RosenpassPubKey) and
conn.handleConfigurationFailure), so the presence of wgProxyRelay prevents
immediate endpoint application.
|



Update websocket to v1.8.14
Move wgProxy.Work() and wgConfigWorkaround() into EndpointUpdater to centralize the timing of proxy activation relative to WireGuard peer updates. The initiator calls Work() before updating the peer (so the proxy is ready for the immediate handshake), while the responder calls Work() after (to avoid premature activation).
Add SwitchWGEndpoint for relay fallback on ICE disconnect, which skips the initiator/responder distinction since WireGuard roaming wil overwrite the endpoint regardless.
Describe your changes
Issue ticket number and link
Stack
Checklist
Documentation
Select exactly one:
Docs PR URL (required if "docs added" is checked)
Paste the PR link from https://github.com/netbirdio/docs here:
https://github.com/netbirdio/docs/pull/__
Summary by CodeRabbit
New Features
Bug Fixes
Chores