fix(security): first-message WebSocket auth to prevent token leakage#2790
fix(security): first-message WebSocket auth to prevent token leakage#2790simonrosenberg wants to merge 3 commits intomainfrom
Conversation
Clients can now send {"type": "auth", "session_api_key": "..."} as the
first WebSocket frame instead of passing the token as a query parameter.
This keeps sk-oh-* tokens out of reverse-proxy / load-balancer access
logs (Traefik, Datadog, etc.).
Query param and header auth are preserved as deprecated fallbacks for
backwards compatibility.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Python API breakage checks — ✅ PASSEDResult: ✅ PASSED |
REST API breakage checks (OpenAPI) — ✅ PASSEDResult: ✅ PASSED |
all-hands-bot
left a comment
There was a problem hiding this comment.
🟡 Acceptable - Solves a real security problem (tokens in URLs leak to logs) with proper backward compatibility. The three-path auth flow is complex but necessary given WebSocket protocol constraints. A few improvements would help with operational visibility and clarity.
| ws.send_json = AsyncMock() | ||
| ws.close = AsyncMock() | ||
| ws.headers = headers or {} | ||
| return ws |
There was a problem hiding this comment.
🟡 Suggestion: Test gap - No test for the precedence case where BOTH query param AND first-message auth are provided. Based on the code in sockets.py, query param takes precedence (it's checked first at line 113). Add a test verifying this behavior:
@pytest.mark.asyncio
async def test_legacy_query_param_takes_precedence_over_first_message():
"""When both query param and first-message auth are provided, query param wins."""
ws = _make_mock_websocket()
ws.receive_text.return_value = json.dumps(
{"type": "auth", "session_api_key": "sk-oh-different"}
)
with patch("openhands.agent_server.sockets.get_default_config") as mock_config:
mock_config.return_value.session_api_keys = ["sk-oh-valid"]
result = await _accept_authenticated_websocket(ws, session_api_key="sk-oh-valid")
assert result is True
ws.accept.assert_called_once()
# Should NOT read first message because query param already authenticated
ws.receive_text.assert_not_called()Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
all-hands-bot
left a comment
There was a problem hiding this comment.
QA Report: WebSocket First-Message Authentication
Verdict: ⚠️ PASS WITH ISSUES
The WebSocket first-message authentication feature works correctly and provides the intended security improvement. However, there is one minor formatting issue that needs to be fixed before merge.
Environment Setup
✅ Dependencies installed: uv sync --frozen completed successfully
✅ Python environment: Ready (Python 3.13.13)
CI & Test Status
CI Checks
- ✅ Run tests/agent-server-tests: Passed (48s)
- ✅ Run tests/sdk-tests: Passed (9s)
- ✅ Run tests/tools-tests: Passed (9s)
- ✅ Python API breakage checks: Passed
- ✅ REST API breakage checks: Passed
- ❌ Pre-commit checks: Failed due to formatting issues (see Issues Found below)
Local Test Results
$ uv run pytest tests/agent_server/test_websocket_first_message_auth.py -v
========== 14 passed in 0.27s ==========
$ uv run pytest tests/agent_server/ -k "websocket" -v
========== 33 passed, 679 deselected in 8.49s ==========Test Coverage Analysis:
- ✅ All 14 new first-message auth tests pass
- ✅ All 19 existing WebSocket tests pass (no regressions)
- ✅ Comprehensive edge case coverage:
- No auth configured
- Legacy query param auth (valid/invalid)
- Legacy header auth (valid/invalid)
- First-message auth (valid/invalid/malformed/timeout/disconnect)
- End-to-end integration tests
Functional Verification
Code Review
I performed a thorough code review of the implementation:
Security Analysis:
- ✅ No authentication bypass vulnerabilities
- ✅ Proper timeout handling (10 seconds) prevents DoS
- ✅ Generic error messages ("Authentication failed") prevent information leakage
- ✅ Race conditions prevented by synchronous auth before event processing
- ✅ First-message frame properly consumed by auth flow
Logic Verification:
# Auth flow order (as implemented):
1. If no session_api_keys configured → accept unconditionally ✓
2. If legacy key provided (query/header):
- Valid → log deprecation warning, accept ✓
- Invalid → close with 4001 (before accept) ✓
3. If no legacy key → first-message auth:
- Accept connection first (required by WebSocket protocol) ✓
- Read first frame with 10s timeout ✓
- Validate: must be dict with type="auth" and valid key ✓
- Invalid → close with 4001 ✓Backward Compatibility:
- ✅ Legacy query param auth still works (with deprecation warning logged)
- ✅ Legacy header auth still works (with deprecation warning logged)
- ✅ Existing WebSocket tests all pass (no breaking changes)
Test Coverage Completeness:
Test scenarios covered (14 tests):
1. ✓ No auth configured - immediate acceptance
2. ✓ Legacy query param - valid key
3. ✓ Legacy query param - invalid key
4. ✓ Legacy header - valid key
5. ✓ Legacy header - invalid key
6. ✓ First-message auth - valid key
7. ✓ First-message auth - invalid key
8. ✓ First-message auth - wrong type field
9. ✓ First-message auth - missing key field
10. ✓ First-message auth - malformed JSON
11. ✓ First-message auth - client disconnect
12. ✓ First-message auth - timeout (10s)
13. ✓ End-to-end events_socket - valid auth
14. ✓ End-to-end events_socket - rejected auth
Issues Found
🟡 Minor: Line Length Violation
File: tests/agent_server/test_websocket_first_message_auth.py:205
Issue: Comment line exceeds 88 character limit (89 characters)
Impact: Causes pre-commit check to fail; does not affect functionality
Current line (89 chars):
# First call: auth message (read by _accept_authenticated_websocket via receive_text)Suggested fix:
# First call: auth message (read by _accept_authenticated_websocket)
# via receive_text, then receive_json calls disconnectOr simply shorten to:
# First call: auth message read by _accept_authenticated_websocket(The inline comment on the next line already explains "Then receive_json calls: disconnect")
Summary
✅ Security Enhancement: Successfully prevents API keys from appearing in URLs/logs
✅ Backward Compatible: Legacy auth methods work with deprecation warnings
✅ Well Tested: 14 new tests + 19 existing tests all pass
✅ Clean Implementation: Proper error handling, timeouts, and validation
🟡 Minor Fix Needed: One line length violation in test file
Recommendation: Fix the line length issue and merge. The feature is functionally complete and ready for production use.
- Add comment explaining 10-second timeout rationale - Add comment explaining why accept() must precede first-message read - Log specific failure reasons (timeout, malformed JSON, disconnect, wrong type, invalid key) instead of a single generic message - Log info on successful first-message auth for adoption tracking - Add test for query-param-takes-precedence-over-first-message case Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
_accept_authenticated_websocketnow supports first-message auth: when no key is found in query params or headers, the server accepts the WebSocket, reads the first frame, and validates{"type": "auth", "session_api_key": "..."}before processing any other messages. 10-second timeout for the auth frame._safe_close_websocketnow accepts optionalcode/reasonparameters.events_socket.Addresses the server-side component of OpenHands/OpenHands#13777 — moves the first-message auth from the V0
listen_socket.pyto the SDK as requested in tofarr's review.Test plan
{"type": "auth", ...}as first frame — will work once this SDK is released and the submodule is bumped🤖 Generated with Claude Code
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
eclipse-temurin:17-jdknikolaik/python-nodejs:python3.13-nodejs22-slimgolang:1.21-bookwormPull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64 docker pull ghcr.io/openhands/agent-server:e7987de-pythonRun
All tags pushed for this build
About Multi-Architecture Support
e7987de-python) is a multi-arch manifest supporting both amd64 and arm64e7987de-python-amd64) are also available if needed