-
Notifications
You must be signed in to change notification settings - Fork 473
Transcribe-proxy replay testing #2438
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
✅ Deploy Preview for hyprnote ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
✅ Deploy Preview for hyprnote-storybook ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
📝 WalkthroughWalkthroughAdds configurable upstream WebSocket URLs to the STT proxy and routes to them when set. Introduces test infrastructure: WebSocket recording/fixtures, a mock upstream server, helpers, and extensive replay/integration tests for provider behaviors and error scenarios. Changes
Sequence DiagramsequenceDiagram
participant Client
participant Proxy
participant MockUpstream as MockUpstream
participant Recorder as Recording
Client->>Proxy: WebSocket connect / start session
activate Proxy
alt upstream URL configured for provider
Proxy->>MockUpstream: Connect to configured upstream URL
else
Proxy->>Proxy: Perform Auth / Session Init
Proxy->>MockUpstream: Connect to provider default URL
end
activate MockUpstream
loop replay fixture messages
MockUpstream->>Proxy: ws message (text/binary/ping/pong/close)
Proxy->>Client: forward message
alt recording enabled
Proxy->>Recorder: record message (JSONL)
end
end
MockUpstream->>Proxy: close
Proxy->>Client: close
deactivate MockUpstream
deactivate Proxy
opt recording enabled
Recorder->>Recorder: save JSONL to fixtures directory
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (5)
crates/transcribe-proxy/tests/common/recording.rs (2)
187-188: Optional: Consider documenting the timestamp range limitation.The cast from
u128tou64milliseconds could theoretically overflow after ~584 million years of recording. While this is not a practical concern for test recordings, consider adding a brief comment documenting this limitation for clarity.💡 Optional clarification
pub fn elapsed_ms(&self) -> u64 { + // Note: Limited to ~584 million years; overflow is not a practical concern for test recordings self.start_time.elapsed().as_millis() as u64 }
207-239: Consider documenting mutex panic behavior.Lines 222 and 227 use
unwrap()onMutex::lock(), which will panic if the mutex is poisoned (from a panic while locked). While this is generally acceptable in test code where panics are visible and actionable, consider whether documenting this behavior or usingexpect()with a message would improve debuggability.💡 Alternative using expect()
pub fn record_server_text(&self, content: &str) { - let mut recorder = self.recorder.lock().unwrap(); + let mut recorder = self.recorder.lock().expect("recorder mutex poisoned"); recorder.record_text(Direction::ServerToClient, content); } pub fn save_to_file(&self, dir: impl AsRef<Path>, suffix: &str) -> std::io::Result<()> { - let recorder = self.recorder.lock().unwrap(); + let recorder = self.recorder.lock().expect("recorder mutex poisoned"); let recording = recorder.recording();crates/transcribe-proxy/tests/providers_e2e.rs (2)
56-62: Remove descriptive comments.Per coding guidelines, comments should be avoided or explain "Why" rather than "What". The comment on line 57 and line 83 describe what the code does, which is already clear from the code itself.
Suggested change
Ok(response) => { - // Record the response if recording is enabled if let Some(ref session) = recording_session { if let Ok(json) = serde_json::to_string(&response) { session.record_server_text(&json); } }
83-92: Remove descriptive comment.Same as above - remove the "Save recording if enabled" comment per coding guidelines.
Suggested change
- // Save recording if enabled if let Some(session) = recording_session { if let Some(ref output_dir) = recording_opts.output_dir {crates/transcribe-proxy/tests/replay.rs (1)
84-87: Close frame assertion may silently pass on missing close.The
if let Some((code, _reason))pattern means the test passes if no close frame is received. If the proxy fails to forward the close frame, this wouldn't catch it.Consider asserting close frame presence
- if let Some((code, _reason)) = close_info { - assert_eq!(code, 1000, "Expected normal close code 1000"); - } + let (code, _reason) = close_info.expect("Expected close frame"); + assert_eq!(code, 1000, "Expected normal close code 1000");
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
crates/transcribe-proxy/src/config.rs(3 hunks)crates/transcribe-proxy/src/routes/streaming.rs(1 hunks)crates/transcribe-proxy/tests/common/fixtures.rs(1 hunks)crates/transcribe-proxy/tests/common/mock_upstream.rs(1 hunks)crates/transcribe-proxy/tests/common/mod.rs(2 hunks)crates/transcribe-proxy/tests/common/recording.rs(1 hunks)crates/transcribe-proxy/tests/fixtures/deepgram_auth_error.jsonl(1 hunks)crates/transcribe-proxy/tests/fixtures/deepgram_normal.jsonl(1 hunks)crates/transcribe-proxy/tests/fixtures/deepgram_rate_limit.jsonl(1 hunks)crates/transcribe-proxy/tests/fixtures/soniox_error.jsonl(1 hunks)crates/transcribe-proxy/tests/fixtures/soniox_normal.jsonl(1 hunks)crates/transcribe-proxy/tests/providers_e2e.rs(4 hunks)crates/transcribe-proxy/tests/replay.rs(1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*
📄 CodeRabbit inference engine (AGENTS.md)
Format using
dprint fmtfrom the root. Do not usecargo fmt.
Files:
crates/transcribe-proxy/src/routes/streaming.rscrates/transcribe-proxy/tests/fixtures/soniox_normal.jsonlcrates/transcribe-proxy/tests/common/fixtures.rscrates/transcribe-proxy/tests/fixtures/deepgram_rate_limit.jsonlcrates/transcribe-proxy/tests/replay.rscrates/transcribe-proxy/src/config.rscrates/transcribe-proxy/tests/fixtures/deepgram_auth_error.jsonlcrates/transcribe-proxy/tests/common/mock_upstream.rscrates/transcribe-proxy/tests/fixtures/deepgram_normal.jsonlcrates/transcribe-proxy/tests/common/recording.rscrates/transcribe-proxy/tests/fixtures/soniox_error.jsonlcrates/transcribe-proxy/tests/common/mod.rscrates/transcribe-proxy/tests/providers_e2e.rs
**/*.{ts,tsx,rs,js,jsx}
📄 CodeRabbit inference engine (AGENTS.md)
By default, avoid writing comments at all. If you write one, it should be about 'Why', not 'What'.
Files:
crates/transcribe-proxy/src/routes/streaming.rscrates/transcribe-proxy/tests/common/fixtures.rscrates/transcribe-proxy/tests/replay.rscrates/transcribe-proxy/src/config.rscrates/transcribe-proxy/tests/common/mock_upstream.rscrates/transcribe-proxy/tests/common/recording.rscrates/transcribe-proxy/tests/common/mod.rscrates/transcribe-proxy/tests/providers_e2e.rs
🧬 Code graph analysis (6)
crates/transcribe-proxy/src/routes/streaming.rs (1)
crates/transcribe-proxy/src/routes/mod.rs (1)
provider(23-25)
crates/transcribe-proxy/tests/common/fixtures.rs (1)
crates/transcribe-proxy/tests/common/recording.rs (2)
recording(202-204)from_jsonl_file(118-122)
crates/transcribe-proxy/tests/replay.rs (3)
crates/transcribe-proxy/tests/common/fixtures.rs (1)
load_fixture(11-14)crates/transcribe-proxy/tests/common/mock_upstream.rs (3)
start_mock_server_with_config(173-198)fast(29-31)default(20-25)crates/transcribe-proxy/tests/common/mod.rs (1)
start_server_with_upstream_url(61-70)
crates/transcribe-proxy/tests/common/recording.rs (2)
crates/transcribe-proxy/src/config.rs (1)
new(21-29)crates/transcribe-proxy/src/routes/mod.rs (1)
provider(23-25)
crates/transcribe-proxy/tests/common/mod.rs (6)
crates/transcribe-proxy/tests/common/recording.rs (2)
recording(202-204)new(214-219)crates/transcribe-proxy/tests/common/fixtures.rs (1)
load_fixture(11-14)crates/transcribe-proxy/tests/common/mock_upstream.rs (1)
start_mock_server_with_config(173-198)crates/transcribe-proxy/src/relay/builder.rs (1)
upstream_url(112-117)apps/ai/src/env.rs (1)
api_keys(47-49)crates/transcribe-proxy/src/config.rs (1)
new(21-29)
crates/transcribe-proxy/tests/providers_e2e.rs (1)
crates/transcribe-proxy/tests/common/recording.rs (3)
recording(202-204)from_env(250-272)new(214-219)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Redirect rules - hyprnote
- GitHub Check: Header rules - hyprnote
- GitHub Check: Pages changed - hyprnote
- GitHub Check: desktop_ci (linux, depot-ubuntu-24.04-8)
- GitHub Check: desktop_ci (macos, depot-macos-14)
- GitHub Check: desktop_ci (linux, depot-ubuntu-22.04-8)
- GitHub Check: fmt
🔇 Additional comments (22)
crates/transcribe-proxy/tests/common/recording.rs (4)
11-26: LGTM: Well-designed message type system.The
DirectionandMessageKindenums provide a clean abstraction for WebSocket message directionality and types. The serde annotations ensure proper JSON serialization for the JSONL fixture format.
28-110: LGTM: Comprehensive message constructors and helpers.The
WsMessagestruct and its constructors handle all WebSocket message types appropriately. Base64 encoding for binary data ensures safe storage in JSONL format. The helper methods (is_server_message,is_client_message,decode_binary) provide convenient access patterns.
112-169: LGTM: Robust recording container with flexible I/O.The
WsRecordingimplementation handles JSONL parsing with proper filtering of comments and empty lines. Thetransformmethod provides a functional API for message manipulation, and the iterator helpers support test scenarios efficiently.
241-273: LGTM: Clean environment-driven configuration.The
RecordingOptions::from_env()method provides a clear mechanism to enable recording via theRECORD_FIXTURESenvironment variable, with appropriate path resolution to the test fixtures directory. The fallback behavior when recording is disabled is sensible.crates/transcribe-proxy/tests/fixtures/deepgram_auth_error.jsonl (1)
1-2: LGTM: Appropriate auth error fixture.The fixture correctly models a Deepgram authentication failure with an
INVALID_AUTHerror message followed by a WebSocket close with code 1008 (Policy Violation), which is the appropriate close code for authentication failures.crates/transcribe-proxy/tests/fixtures/deepgram_rate_limit.jsonl (1)
1-2: LGTM: Valid rate limit error fixture.The fixture appropriately represents a Deepgram rate limit scenario with a
TOO_MANY_REQUESTSerror followed by connection closure. The structure is consistent with other error fixtures.crates/transcribe-proxy/tests/fixtures/soniox_error.jsonl (1)
1-2: LGTM: Appropriate Soniox error fixture.The fixture correctly models a Soniox server error with error code 503 and a WebSocket close with code 1011 (Internal Server Error). The error message structure reflects Soniox's API format, which differs from Deepgram's.
crates/transcribe-proxy/tests/fixtures/soniox_normal.jsonl (1)
1-7: LGTM: Comprehensive success scenario fixture.The fixture provides realistic coverage of a normal Soniox transcription session, including progressive interim results, finalized transcripts, metadata, and proper connection closure with code 1000. This will enable thorough testing of successful transcription flows.
crates/transcribe-proxy/tests/common/fixtures.rs (1)
1-14: LGTM: Clean fixture loading utilities.The helpers provide a straightforward API for accessing test fixtures. The use of
unwrap()on Line 13 is acceptable in test code, as fixture loading failures should cause immediate test failures with clear panic messages.crates/transcribe-proxy/src/config.rs (1)
17-17: LGTM: Clean configuration extension.The addition of
upstream_urlsfollows the existing configuration patterns:
- Properly initialized in the constructor
- Builder method
with_upstream_urlis consistent with otherwith_*methods- Accessor
upstream_url_formirrors the existingapi_key_forpatternThis provides a clean mechanism for test infrastructure to override upstream endpoints.
Also applies to: 27-27, 46-49, 55-57
crates/transcribe-proxy/src/routes/streaming.rs (1)
41-60: Verify that upstream URL overrides are restricted to test environments.The custom upstream URL path bypasses the authentication flow, which is appropriate for testing with mock servers. However, ensure that this configuration is only used in test/development environments and cannot be set in production deployments.
Run the following script to check how upstream URLs are configured:
#!/bin/bash # Description: Check where and how upstream URLs are set to ensure they're only used in tests # Search for with_upstream_url usage echo "=== Usage of with_upstream_url ===" rg -n --type rust 'with_upstream_url' -C 3 # Search for upstream_urls field access echo -e "\n=== Direct upstream_urls access ===" rg -n --type rust 'upstream_urls' -C 2 # Check for any production config that might set upstream URLs echo -e "\n=== Production config files ===" fd -e toml -e yaml -e json -x cat {} \; -x echo "File: {}" \;crates/transcribe-proxy/tests/fixtures/deepgram_normal.jsonl (1)
1-7: Well-structured test fixture.The JSONL fixture correctly models a Deepgram streaming lifecycle with progressive transcript updates, metadata, and a normal close frame. Timestamps are monotonically increasing and the close code 1000 aligns with the standard WebSocket normal closure.
crates/transcribe-proxy/tests/replay.rs (4)
15-27: LGTM!Clean helper function for establishing WebSocket connections in tests.
29-60: LGTM!The message collection logic correctly handles text messages, close frames, and errors with a timeout wrapper.
202-229: LGTM!Good test for verifying message count parity between fixture and proxy output.
231-256: Consider adding assertion after disconnect.The test verifies the proxy accepts a client disconnect without crashing, but doesn't assert any post-disconnect behavior. This is acceptable for a basic smoke test.
crates/transcribe-proxy/tests/common/mod.rs (2)
3-12: LGTM!Clean module structure with appropriate re-exports for test utilities.
61-70: LGTM!Clean helper that properly configures the proxy with a mock API key and upstream URL override for testing.
crates/transcribe-proxy/tests/common/mock_upstream.rs (4)
28-31: Consider clarifyingfast()semantics.
fast()currently returns the same asdefault(). If the intent is "no timing delays", the name is clear. If there's a planned distinction from default, consider documenting it.
112-114: Busy-poll pattern for draining receiver.The 1ms timeout loop drains incoming client messages. This works but could be simplified with
try_next()if available, or the loop could be removed if client messages don't need to be consumed during replay.
121-159: LGTM!Clean message conversion with proper error handling using
thiserror.
195-195: Sleep-based synchronization may be flaky.The 10ms sleep is a race condition mitigation, but under load the server might not be ready. Consider using a ready signal or retry logic in tests if flakiness occurs.
- Extract duplicated analytics setup into finalize_proxy_builder! macro - Extract encode_optional_binary helper for ping/pong methods - Remove redundant MockUpstreamConfig::fast() that just returned default() - Rename is_server_message/is_client_message to is_from_upstream/is_to_upstream for clarity - Add TEST_RESPONSE_TIMEOUT constant to replace magic number - Document single-connection limitation in mock server - Add tracing::warn for serialization errors in recording Co-Authored-By: yujonglee <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
crates/transcribe-proxy/tests/replay.rs (1)
86-89: Consider asserting that close frame was received for normal flows.Using
if let Some(...)means the close code assertion is silently skipped when no close frame arrives. For normal transcription tests, you may want to ensure a proper close frame is received.🔎 Optional: Assert close frame presence for normal flows
- if let Some((code, _reason)) = close_info { - assert_eq!(code, 1000, "Expected normal close code 1000"); - } + let (code, _reason) = close_info.expect("Expected close frame for normal transcription"); + assert_eq!(code, 1000, "Expected normal close code 1000");
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
crates/transcribe-proxy/src/routes/streaming.rs(4 hunks)crates/transcribe-proxy/tests/common/mock_upstream.rs(1 hunks)crates/transcribe-proxy/tests/common/recording.rs(1 hunks)crates/transcribe-proxy/tests/providers_e2e.rs(4 hunks)crates/transcribe-proxy/tests/replay.rs(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- crates/transcribe-proxy/tests/common/mock_upstream.rs
- crates/transcribe-proxy/tests/providers_e2e.rs
🧰 Additional context used
📓 Path-based instructions (2)
**/*
📄 CodeRabbit inference engine (AGENTS.md)
Format using
dprint fmtfrom the root. Do not usecargo fmt.
Files:
crates/transcribe-proxy/src/routes/streaming.rscrates/transcribe-proxy/tests/replay.rscrates/transcribe-proxy/tests/common/recording.rs
**/*.{ts,tsx,rs,js,jsx}
📄 CodeRabbit inference engine (AGENTS.md)
By default, avoid writing comments at all. If you write one, it should be about 'Why', not 'What'.
Files:
crates/transcribe-proxy/src/routes/streaming.rscrates/transcribe-proxy/tests/replay.rscrates/transcribe-proxy/tests/common/recording.rs
🧬 Code graph analysis (2)
crates/transcribe-proxy/src/routes/streaming.rs (3)
crates/transcribe-proxy/src/relay/builder.rs (3)
on_close(98-108)build(164-183)build(215-238)crates/transcribe-proxy/tests/common/mod.rs (1)
report_stt(29-37)crates/transcribe-proxy/src/upstream_url.rs (1)
build(28-50)
crates/transcribe-proxy/tests/common/recording.rs (2)
crates/transcribe-proxy/src/config.rs (1)
new(21-29)crates/transcribe-proxy/src/routes/mod.rs (1)
provider(23-25)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: Redirect rules - hyprnote
- GitHub Check: Redirect rules - hyprnote-storybook
- GitHub Check: Header rules - hyprnote-storybook
- GitHub Check: Header rules - hyprnote
- GitHub Check: Pages changed - hyprnote-storybook
- GitHub Check: Pages changed - hyprnote
- GitHub Check: fmt
- GitHub Check: desktop_ci (linux, depot-ubuntu-24.04-8)
- GitHub Check: desktop_ci (macos, depot-macos-14)
- GitHub Check: desktop_ci (linux, depot-ubuntu-22.04-8)
🔇 Additional comments (15)
crates/transcribe-proxy/src/routes/streaming.rs (3)
121-145: LGTM! Effective refactoring to eliminate duplication.The macro successfully centralizes analytics integration logic that was previously duplicated in both proxy builder functions. The implementation correctly handles the async closure, clones necessary values, and maintains consistent provider name formatting.
159-159: LGTM! Consistent macro usage.Both proxy builder functions now use the centralized macro, eliminating the previous inline analytics branches. This improves maintainability and consistency.
Also applies to: 175-175
41-60: Upstream URL feature is test-only and production-safe; no auth bypass risk.The custom upstream URL path in lines 41-60 only activates in test scenarios. In production code (apps/ai/src/main.rs),
SttProxyConfigis instantiated with only API keys, andwith_upstream_url()is never called. Theupstream_urlsHashMap remains empty, soupstream_url_for()always returnsNone, ensuring authentication logic executes normally in production. The feature is exclusively used for replay testing and mock server scenarios.crates/transcribe-proxy/tests/replay.rs (6)
17-29: LGTM!Clean helper function for establishing WebSocket connections with appropriate test error handling.
31-62: LGTM!Good design: discarding the timeout result allows collecting partial messages when the timeout expires, which is useful for tests that don't expect a clean close.
91-119: LGTM!Good test coverage for authentication error scenarios with appropriate flexibility for different close code representations.
121-202: LGTM!Good test coverage for rate limiting, Soniox normal transcription, and Soniox error scenarios. The tests appropriately check for provider-specific error messages and close codes.
204-231: LGTM!Good approach using the fixture's message count as the expected value to verify the proxy forwards all messages without loss.
233-258: LGTM!The test verifies that the proxy handles client disconnect without crashing. Using timing configuration ensures messages are still in-flight when disconnect occurs.
crates/transcribe-proxy/tests/common/recording.rs (6)
1-34: LGTM!Clean definitions with appropriate serde configuration. The tagged enum approach for
MessageKindmakes the JSONL format self-describing.
36-110: LGTM!Well-structured message representation with convenient constructors for each message type. The
decode_binaryreturning aResultproperly propagates base64 decode errors.
112-169: LGTM!Good JSONL handling with support for comments and empty lines. The
transformmethod provides useful flexibility for fixture manipulation.
171-205: LGTM!Clean recorder implementation. The
u128tou64cast for elapsed milliseconds is safe for any reasonable test duration.
207-239: LGTM!Thread-safe recording session with proper Arc/Mutex usage. The
.unwrap()on mutex locks is acceptable for test utilities since poisoned mutexes indicate prior panics.
241-273: LGTM!Good approach using
env!("CARGO_MANIFEST_DIR")for compile-time resolution of the fixtures directory. Environment variable handling is robust with multiple accepted values.
Summary
Adds replay testing infrastructure for the transcribe-proxy crate, enabling deterministic testing of WebSocket proxy behavior using recorded fixtures. This allows testing edge cases (auth errors, rate limits) without hitting live APIs.
Key additions:
WsRecording/WsMessagetypes for recording and replaying WebSocket message sequencesMockUpstreamServerthat replays recorded messages with optional timing simulationRECORD_FIXTURES=1)with_upstream_url)Code quality improvements:
finalize_proxy_builder!macroencode_optional_binaryhelper for ping/pong binary encodingis_server_message/is_client_messagetois_from_upstream/is_to_upstreamfor clarityTEST_RESPONSE_TIMEOUTconstant to replace magic numberstracing::warnfor serialization failures during recordingReview & Testing Checklist for Human
finalize_proxy_builder!macro correctly handles both analytics and non-analytics cases (streaming.rs lines 121-144)tests/fixtures/contain valid JSONL and match expected provider response formatscargo test -p transcribe-proxy --test replayto verify all replay tests passRecommended test plan:
cargo test -p transcribe-proxy --test replayRECORD_FIXTURES=1 FIXTURE_OUTPUT_DIR=./tests/fixtures cargo test -p transcribe-proxy --test providers_e2eNotes
The mock server intentionally only accepts one connection per instance for test isolation. Each test creates its own mock server.
Link to Devin run: https://app.devin.ai/sessions/7395b0d3df56462fbb7a5baeb1d64d67
Requested by: yujonglee ([email protected]) / @yujonglee