@khaliqgant
Collaborator

The main dashboard WebSocket (/ws) and bridge WebSocket (/ws/bridge)
were missing ping/pong keepalive mechanisms that prevent TCP/proxy
timeouts from killing idle connections.

The logs and presence WebSocket endpoints already had 30-second ping
intervals, but the main dashboard connections did not. This caused
workspace instances to experience WebSocket disconnections when:

  • TCP idle timeouts kicked in (typically 60-120 seconds)
  • Load balancers/proxies dropped idle connections (30-60 seconds)
  • Cloud providers (Fly, Railway) enforced connection timeouts

Added:

  • 30-second ping interval for main dashboard WebSocket
  • 30-second ping interval for bridge WebSocket
  • Proper cleanup of intervals on server close
  • Graceful close (code 1000) for unresponsive clients
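
A minimal sketch of the ping sweep described in the list above, not the merged code. The `PingableSocket` interface, the `sweep` function, and the "unresponsive" reason string are hypothetical names chosen for illustration. The interface is modeled on the `ping()` and `close(code, reason)` methods of the `ws` package's WebSocket, on the assumption that the server uses that package.

```typescript
// Hypothetical minimal socket shape; modeled on the ping()/close() methods
// of the `ws` package's WebSocket (an assumption about the server's library).
interface PingableSocket {
  ping(): void;
  close(code?: number, reason?: string): void;
}

// 30-second interval, as stated in the PR description.
const PING_INTERVAL_MS = 30_000;

// One sweep over all connected clients:
//  - a client still marked "not alive" never answered the previous ping,
//    so it is closed with code 1000;
//  - every other client is marked "not alive" and pinged; a pong handler
//    (not shown here) is expected to flip the flag back to true.
function sweep(
  clients: Iterable<PingableSocket>,
  alive: WeakMap<PingableSocket, boolean>,
): void {
  for (const socket of clients) {
    if (alive.get(socket) === false) {
      socket.close(1000, "unresponsive"); // reason string is illustrative
      continue;
    }
    alive.set(socket, false);
    socket.ping();
  }
}

// Run the sweep on a timer; the returned function clears the interval,
// which is the cleanup step needed on server close.
function startKeepalive(
  clients: Iterable<PingableSocket>,
  alive: WeakMap<PingableSocket, boolean>,
  intervalMs: number = PING_INTERVAL_MS,
): () => void {
  const timer = setInterval(() => sweep(clients, alive), intervalMs);
  return () => clearInterval(timer);
}
```

With a 30-second interval, a connection carries traffic at least every 30 seconds, which fits inside the 60-120 second TCP idle window above but only matches the low end of the 30-60 second proxy range.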

claude added 4 commits January 6, 2026 18:16
For completeness, added 30-second ping/pong keepalive to the local
daemon WebSocket servers:

- daemon/api.ts: Daemon API WebSocket for dashboard communication
- daemon/orchestrator.ts: Orchestrator WebSocket for multi-workspace

These are local (localhost) connections, so they are less prone to
proxy timeouts, but adding keepalive keeps behavior consistent across
all WebSocket endpoints.

Changes per file:
- Add clientAlive WeakMap and pingInterval properties
- Setup 30-second ping interval in start()
- Clean up interval in stop()
- Handle pong responses in connection handler
- Mark clients as alive on connect
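
The per-file changes listed above could look roughly like the following sketch. It is an illustration under assumptions, not the code in daemon/api.ts or daemon/orchestrator.ts: the `KeepaliveServer` class, the `Socket` and `SocketServer` interfaces, and the `checkClients` method are hypothetical names, and the interfaces only mimic the parts of the `ws` package's API (`clients`, `on`, `ping`, `close`) that the sketch needs. Only `clientAlive`, `pingInterval`, `start()`, and `stop()` are taken from the commit message.

```typescript
// Hypothetical structural subsets of a WebSocket and a WebSocket server,
// modeled on the `ws` package (assumed, not confirmed by the PR text).
interface Socket {
  ping(): void;
  close(code?: number, reason?: string): void;
  on(event: "pong", listener: () => void): void;
}
interface SocketServer {
  clients: Set<Socket>;
  on(event: "connection", listener: (socket: Socket) => void): void;
}

class KeepaliveServer {
  // Names taken from the commit message: clientAlive and pingInterval.
  private clientAlive = new WeakMap<Socket, boolean>();
  private pingInterval: ReturnType<typeof setInterval> | undefined;
  private wss: SocketServer;
  private intervalMs: number;

  constructor(wss: SocketServer, intervalMs: number = 30_000) {
    this.wss = wss;
    this.intervalMs = intervalMs;
  }

  start(): void {
    this.wss.on("connection", (socket) => {
      // Mark clients as alive on connect.
      this.clientAlive.set(socket, true);
      // Handle pong responses: the client answered, so it is alive.
      socket.on("pong", () => {
        this.clientAlive.set(socket, true);
      });
    });
    // Set up the 30-second ping interval.
    this.pingInterval = setInterval(() => this.checkClients(), this.intervalMs);
  }

  // Close clients that missed the previous ping, then ping the rest.
  checkClients(): void {
    for (const socket of this.wss.clients) {
      if (this.clientAlive.get(socket) === false) {
        socket.close(1000, "unresponsive"); // reason string is illustrative
        continue;
      }
      this.clientAlive.set(socket, false);
      socket.ping();
    }
  }

  stop(): void {
    // Clean up the interval so the timer does not keep the process alive.
    if (this.pingInterval !== undefined) {
      clearInterval(this.pingInterval);
      this.pingInterval = undefined;
    }
  }
}
```

Keying the liveness flags in a WeakMap means a closed socket's entry can be garbage-collected without an explicit delete, which is one plausible reason for that choice.
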

@khaliqgant merged commit 0ee58fb into main Jan 6, 2026
9 checks passed
@khaliqgant deleted the claude/fix-workspace-disconnections-c4F68 branch January 6, 2026 18:57