@yujonglee yujonglee commented Dec 4, 2025

fix(stt): improve readability and reliability in apps/api/src/stt

Summary

This PR improves the STT (speech-to-text) proxy module with better error handling, input validation, and code readability.

Reliability improvements:

  • Add try/catch for unhandled async error in WebSocket message handler to prevent process crashes
  • Add URL validation for upstream override (validates ws:/wss: protocol, prevents SSRF)
  • Add API key validation to Deepgram provider (now throws descriptive error if missing)
  • Add runtime validation for provider query parameter (throws on unknown providers)
  • Reset hasTransformedFirst flag in closeConnections to prevent stale state if connection is reused

Readability improvements:

  • Rename timeout constants with _MS suffix for clarity
  • Add comments explaining constants and magic numbers
  • Rename flush methods: flushPendingMessages → flushUpstreamQueue, flushDownstreamMessages → flushDownstreamQueue
  • Add comment explaining why binary payloads are cloned (buffer ownership)
  • Document race condition prevention in initializeUpstream

Updates since last revision

  • Fixed TypeScript error in provider type assignment where providerParam could be null when assigned to provider variable (required for CI to pass)

Review & Testing Checklist for Human

  • Provider validation behavior change: Previously, unknown provider query params silently fell back to deepgram. Now throws an error. Verify no existing clients pass invalid provider values.
  • Deepgram API key validation: Now throws if DEEPGRAM_API_KEY is missing. Previously would pass undefined to Authorization header. Verify this is acceptable.
  • URL protocol validation: Upstream override URLs must now use ws: or wss: protocol. Verify no legitimate use cases are broken.
  • Test the STT proxy end-to-end with each provider (deepgram, assemblyai, soniox) to verify the changes don't break normal operation.

Recommended test plan:

  1. Start a recording session using each STT provider and verify transcription works
  2. Test with an invalid provider query param and verify it returns an error (not silent fallback)
  3. Verify deployments have DEEPGRAM_API_KEY configured before merging

Notes

The WsProxyConnection class has no integration test coverage, so these changes rely on code review and manual testing. The existing unit tests for utility functions pass.

Skipped larger refactoring (grouping state variables, extracting common flush logic) as too risky without test coverage.

Link to Devin run: https://app.devin.ai/sessions/ed7a91a96e0a463e8617293fbadeb715
Requested by: yujonglee ([email protected]) (@yujonglee)

devin-ai-integration bot and others added 9 commits December 4, 2025 13:47
Wrap normalizeWsData call in try/catch to prevent unhandled promise
rejections from crashing the process. Errors are now captured to Sentry
and the connection is gracefully closed.

Co-Authored-By: yujonglee <[email protected]>
Validate that user-provided upstream URLs are valid and use ws: or wss:
protocol to prevent SSRF attacks and invalid URL errors. Provides clear
error messages for debugging.

Co-Authored-By: yujonglee <[email protected]>
Standardize error handling across all STT providers by adding explicit
API key validation to Deepgram. Now all three providers (Deepgram,
AssemblyAI, Soniox) throw descriptive errors when API keys are missing.

Co-Authored-By: yujonglee <[email protected]>
Add explicit validation for the provider query parameter instead of
silently falling back to deepgram for unknown providers. Now throws
a descriptive error listing valid providers when an invalid provider
is specified.

Co-Authored-By: yujonglee <[email protected]>
Add comment explaining why clientSocket must be set before
ensureUpstreamSocket() to prevent race conditions when the upstream
connection becomes ready during initialization.

Co-Authored-By: yujonglee <[email protected]>
Reset the hasTransformedFirst flag when closing connections to ensure
the transform is applied correctly if the connection instance is reused.
This prevents stale state from affecting subsequent connections.

Co-Authored-By: yujonglee <[email protected]>
Add explanatory comments for magic numbers and rename timeout constants
to include _MS suffix for clarity:
- UPSTREAM_ERROR_TIMEOUT -> UPSTREAM_ERROR_GRACE_MS
- UPSTREAM_CONNECT_TIMEOUT -> UPSTREAM_CONNECT_TIMEOUT_MS

Co-Authored-By: yujonglee <[email protected]>
Rename methods to better describe what they're flushing:
- flushPendingMessages -> flushUpstreamQueue
- flushDownstreamMessages -> flushDownstreamQueue

This makes it clearer that these methods flush queues in different
directions (to upstream server vs to downstream client).

Co-Authored-By: yujonglee <[email protected]>
Add comment explaining why binary payloads are cloned - WebSocket message
buffers may be reused or invalidated after the event handler returns, so
copying prevents use-after-free issues when payloads are queued.

Co-Authored-By: yujonglee <[email protected]>
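
The first commit above (wrapping normalizeWsData in a try/catch) could look roughly like the sketch below. This is an illustration only: it assumes normalizeWsData is async, and the `HandlerDeps` interface, the stub `normalizeWsData` body, and `onUpstreamMessage` are hypothetical stand-ins for code in apps/api/src/stt that this thread does not show. Only the close reason "message_normalize_failed" is taken from the review further down.

```typescript
// Hypothetical stand-ins: the real normalizeWsData, Sentry capture and
// closeConnections are not shown in this thread.
type Payload = string | Uint8Array;

async function normalizeWsData(data: unknown): Promise<Payload> {
  if (typeof data === "string") return data;
  if (data instanceof ArrayBuffer) return new Uint8Array(data.slice(0));
  throw new Error("unsupported WebSocket payload type");
}

interface HandlerDeps {
  captureException: (error: unknown) => void; // e.g. Sentry.captureException
  closeConnections: (reason: string) => void;
  forward: (payload: Payload) => void;
}

// An async event handler returns a promise that nobody awaits, so a throw
// inside it becomes an unhandled rejection unless it is caught here.
async function onUpstreamMessage(data: unknown, deps: HandlerDeps): Promise<void> {
  let payload: Payload;
  try {
    payload = await normalizeWsData(data);
  } catch (error) {
    deps.captureException(error);
    deps.closeConnections("message_normalize_failed");
    return;
  }
  deps.forward(payload);
}
```

Keeping only the normalize call inside the try block means an error thrown by the forwarding step is not misreported as a normalization failure.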

netlify bot commented Dec 4, 2025

Deploy Preview for hyprnote ready!

Name Link
🔨 Latest commit c71b72e
🔍 Latest deploy log https://app.netlify.com/projects/hyprnote/deploys/69319639c90ef400082d7fdc
😎 Deploy Preview https://deploy-preview-2118--hyprnote.netlify.app

netlify bot commented Dec 4, 2025

Deploy Preview for hyprnote-storybook ready!

Name Link
🔨 Latest commit c71b72e
🔍 Latest deploy log https://app.netlify.com/projects/hyprnote-storybook/deploys/6931963948090700086b5797
😎 Deploy Preview https://deploy-preview-2118--hyprnote-storybook.netlify.app

coderabbitai bot commented Dec 4, 2025

📝 Walkthrough

The PR refactors and enhances the STT proxy system with improved error handling, validation, and resource management. Changes include renaming queue methods, adding timing constants, validating provider configuration and upstream URLs, enforcing environment variable requirements, and implementing binary payload cloning to prevent use-after-free conditions.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Connection layer refactoring: apps/api/src/stt/connection.ts | Introduced upstream timing constants (UPSTREAM_ERROR_GRACE_MS, UPSTREAM_CONNECT_TIMEOUT_MS), renamed queue methods (flushPendingMessages → flushUpstreamQueue, flushDownstreamMessages → flushDownstreamQueue), enhanced error handling with Sentry capture and explicit close reasons, improved initialization sequence to prevent race conditions, and reworked message transformation and forwarding logic |
| Provider configuration validation: apps/api/src/stt/deepgram.ts, apps/api/src/stt/index.ts | Added runtime DEEPGRAM_API_KEY environment variable validation with error on missing key; introduced VALID_PROVIDERS helper and provider parameter validation; added upstream URL parsing with protocol enforcement (ws:/wss:) and improved provider resolution logic |
| Utility enhancements: apps/api/src/stt/utils.ts | Introduced cloneBinaryPayload helper function to clone binary data (ArrayBuffer/ArrayBufferView) into new Uint8Array, preventing use-after-free when buffers are queued |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • apps/api/src/stt/connection.ts: Pay close attention to the initialization sequence (clientSocket setting before ensureUpstreamSocket) and race condition prevention; verify queue flushing conditions and timing are correct; review message transformation guard and first-message handling logic
  • apps/api/src/stt/deepgram.ts and apps/api/src/stt/index.ts: Confirm validation error messages are appropriate and validation logic is comprehensive (especially URL protocol checks)
  • apps/api/src/stt/utils.ts: Verify cloneBinaryPayload correctly handles all binary input types and that it integrates properly with queue operations

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main objective of the changeset: improving readability and reliability in the STT module through better error handling, validation, and code refactoring. |
| Description check | ✅ Passed | The PR description clearly relates to the changeset, detailing reliability and readability improvements across STT modules with specific examples of changes made. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/api/src/stt/utils.ts (1)

12-22: ArrayBuffer path still returns a non‑cloned view; use cloneBinaryPayload to fully avoid buffer reuse issues.

cloneBinaryPayload correctly deep‑copies ArrayBuffer/views, but normalizeWsData only uses it for Uint8Array and ArrayBuffer.isView(...). For plain ArrayBuffer you still do new Uint8Array(data), which creates a new view over the same underlying buffer rather than copying it. Since WsProxyConnection sets binaryType = "arraybuffer" for the upstream socket, event.data will typically be an ArrayBuffer, so queued payloads can still alias a buffer that the WebSocket implementation may later reuse.

To make the “ownership” guarantee in the comment accurate and avoid subtle corruption, route the ArrayBuffer case through cloneBinaryPayload as well:

  if (data instanceof Uint8Array) {
    return cloneBinaryPayload(data);
  }

  if (data instanceof ArrayBuffer) {
-   return new Uint8Array(data);
+   return cloneBinaryPayload(data);
  }

  if (ArrayBuffer.isView(data)) {
    return cloneBinaryPayload(data);
  }

Optionally, you could simplify further by having cloneBinaryPayload handle all non‑string/non‑Blob binary payloads and call it unconditionally for those.

Also applies to: 32-43
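
The difference between a view and a copy, as described in this comment, can be reproduced in isolation. The body of `cloneBinaryPayload` below is an assumption written to match the behaviour the walkthrough describes (copy ArrayBuffer/ArrayBufferView into a new Uint8Array); the real helper in apps/api/src/stt/utils.ts is not shown in this thread.

```typescript
// Assumed implementation: copies the bytes into freshly allocated memory.
function cloneBinaryPayload(data: ArrayBuffer | ArrayBufferView): Uint8Array {
  const source =
    data instanceof ArrayBuffer
      ? new Uint8Array(data)
      : new Uint8Array(data.buffer, data.byteOffset, data.byteLength);
  const copy = new Uint8Array(source.byteLength);
  copy.set(source);
  return copy;
}

// Stands in for event.data when binaryType is "arraybuffer".
const received = new ArrayBuffer(3);
new Uint8Array(received).set([1, 2, 3]);

const view = new Uint8Array(received); // shares memory with `received`
const copy = cloneBinaryPayload(received); // owns its own memory

// Simulate the buffer being overwritten after the handler returns.
new Uint8Array(received)[0] = 99;

console.log(view[0]); // 99: the view sees the overwrite
console.log(copy[0]); // 1: the copy is unaffected
```

Respecting byteOffset and byteLength in the view branch matters when the input is a subarray, since its underlying buffer can be larger than the bytes it exposes.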

🧹 Nitpick comments (3)
apps/api/src/stt/index.ts (2)

17-25: Provider whitelist and type guard are aligned with SttProvider.

Defining VALID_PROVIDERS as readonly SttProvider[] plus isValidProvider gives you a single source of truth for allowed providers while keeping nice type‑level narrowing. This matches the SttProvider union and keeps the switch below exhaustive.
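
A minimal sketch of the whitelist plus type guard pattern described here, including the null check that the later typecheck fix refers to. The names VALID_PROVIDERS, isValidProvider and SttProvider come from this thread, and the three provider names from the PR description; `resolveProvider`, the error wording, and the choice to default a missing parameter to deepgram are assumptions for illustration.

```typescript
type SttProvider = "deepgram" | "assemblyai" | "soniox";

const VALID_PROVIDERS: readonly SttProvider[] = ["deepgram", "assemblyai", "soniox"];

// Type guard: narrows an arbitrary string to the SttProvider union.
function isValidProvider(value: string): value is SttProvider {
  return (VALID_PROVIDERS as readonly string[]).includes(value);
}

// Hypothetical helper. Assumption: a missing parameter defaults to deepgram;
// the thread only states that unknown values now throw.
function resolveProvider(providerParam: string | null): SttProvider {
  // Handle null first, so the guard below only ever sees a string.
  if (!providerParam) return "deepgram";
  if (!isValidProvider(providerParam)) {
    throw new Error(
      `Invalid provider "${providerParam}". Valid providers: ${VALID_PROVIDERS.join(", ")}`,
    );
  }
  return providerParam; // narrowed to SttProvider by the guard
}
```

Checking for null before calling the guard is what avoids an error of the form "Type 'string | null' is not assignable to type 'SttProvider'".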


35-47: Upstream override URL parsing and protocol enforcement improve safety.

Wrapping new URL(upstreamOverride) in a try/catch and explicitly checking for ws: / wss: prevents invalid or non‑WebSocket URLs from being proxied, which is a good hardening step for this header‑driven override.
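
The parse-then-check-protocol step might be sketched as follows. `parseUpstreamOverride` and its error messages are hypothetical; only the use of `new URL(upstreamOverride)` inside a try/catch and the ws:/wss: check are taken from this thread.

```typescript
// Hypothetical helper name and messages; the approach follows the review text.
function parseUpstreamOverride(upstreamOverride: string): URL {
  let url: URL;
  try {
    url = new URL(upstreamOverride);
  } catch {
    throw new Error(`Invalid upstream URL: ${upstreamOverride}`);
  }
  // URL.protocol includes the trailing colon, e.g. "wss:".
  if (url.protocol !== "ws:" && url.protocol !== "wss:") {
    throw new Error(`Upstream URL must use ws: or wss:, got ${url.protocol}`);
  }
  return url;
}
```

A protocol check on its own does not restrict which host the proxy connects to, so a host allowlist would be a separate measure if that is a concern.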

apps/api/src/stt/connection.ts (1)

11-18: Upstream lifecycle, error handling, and queue management refactor look solid.

  • New timing constants (UPSTREAM_ERROR_GRACE_MS, UPSTREAM_CONNECT_TIMEOUT_MS) and the grace‑period timeout around upstream errors are well‑scoped and cleared reliably.
  • Resetting hasTransformedFirst in closeConnections avoids stale state when reusing a WsProxyConnection.
  • Splitting queue flushing into flushUpstreamQueue and flushDownstreamQueue, and calling both on upstream "open" (and when upstream is already ready in initializeUpstream), fixes the race where the upstream could become ready before clientSocket was set and ensures queued messages are drained as soon as both ends are usable.
  • Wrapping normalizeWsData(event.data) in a try/catch with Sentry capture and closing on "message_normalize_failed" improves observability and avoids silent corruption when upstream sends malformed or unsupported payload types.
  • preconnectUpstream’s timeout handling is straightforward and cleans up the timer in all paths; errors correctly trigger closeConnections("upstream_connect_failed") before rethrowing.
  • First‑message transformation logic in sendToUpstream is now idempotent and only applied once, and control vs data messages are queued separately with a clear backpressure limit for the upstream side.

If you want to harden further in the future, you might consider adding a similar size budget for pendingDownstreamMessages to guard against a misbehaving upstream when the client is slow or disconnected, but that’s an optional enhancement—not a blocker for this PR.

Also applies to: 66-75, 188-189, 191-244, 269-279, 281-293, 326-365, 368-431
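
The optional size budget for pendingDownstreamMessages suggested above could be sketched as a byte-bounded queue. Everything here is hypothetical: the class name, the byte-based limit, and the return-false-on-overflow policy are illustrative choices, not part of the PR.

```typescript
// Hypothetical sketch; not part of the PR.
class BoundedByteQueue {
  private items: Uint8Array[] = [];
  private bytes = 0;
  private readonly maxBytes: number;

  constructor(maxBytes: number) {
    this.maxBytes = maxBytes;
  }

  // Returns false when accepting the payload would exceed the budget,
  // leaving the caller to drop it or close the connection.
  enqueue(payload: Uint8Array): boolean {
    if (this.bytes + payload.byteLength > this.maxBytes) return false;
    this.items.push(payload);
    this.bytes += payload.byteLength;
    return true;
  }

  // Hands back everything queued so far and resets the budget.
  drain(): Uint8Array[] {
    const drained = this.items;
    this.items = [];
    this.bytes = 0;
    return drained;
  }
}
```

Tracking bytes instead of message count keeps the bound meaningful when payload sizes vary widely.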

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 571e632 and b617445.

📒 Files selected for processing (4)
  • apps/api/src/stt/connection.ts (7 hunks)
  • apps/api/src/stt/deepgram.ts (1 hunks)
  • apps/api/src/stt/index.ts (3 hunks)
  • apps/api/src/stt/utils.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.ts

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.ts: Agent implementations should use TypeScript and follow the established architectural patterns defined in the agent framework
Agent communication should use defined message protocols and interfaces

Files:

  • apps/api/src/stt/deepgram.ts
  • apps/api/src/stt/utils.ts
  • apps/api/src/stt/index.ts
  • apps/api/src/stt/connection.ts
**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{ts,tsx}: Avoid creating a bunch of types/interfaces if they are not shared. Especially for function props, just inline them instead.
Never do manual state management for form/mutation. Use useForm (from tanstack-form) and useQuery/useMutation (from tanstack-query) instead for 99% of cases. Avoid patterns like setError.
If there are many classNames with conditional logic, use cn (import from @hypr/utils). It is similar to clsx. Always pass an array and split by logical grouping.
Use motion/react instead of framer-motion.

Files:

  • apps/api/src/stt/deepgram.ts
  • apps/api/src/stt/utils.ts
  • apps/api/src/stt/index.ts
  • apps/api/src/stt/connection.ts
🧬 Code graph analysis (1)
apps/api/src/stt/connection.ts (1)
apps/api/src/stt/utils.ts (1)
  • normalizeWsData (5-30)
🪛 GitHub Actions: .github/workflows/api_ci.yaml
apps/api/src/stt/index.ts

[error] 64-64: TypeScript error during typecheck. TS2322: Type 'string | null' is not assignable to type 'SttProvider'. Command failed: 'pnpm -F @hypr/api typecheck'.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Redirect rules - hyprnote
  • GitHub Check: Header rules - hyprnote
  • GitHub Check: Pages changed - hyprnote
  • GitHub Check: fmt
  • GitHub Check: Devin
🔇 Additional comments (1)
apps/api/src/stt/deepgram.ts (1)

21-24: API key guard and cached header usage look good.

Validating env.DEEPGRAM_API_KEY up front and reusing the cached apiKey in the Authorization header improves reliability and avoids silent misconfiguration. Just make sure callers surface this error in a controlled way (e.g., returning a 5xx) rather than crashing the whole process.

Also applies to: 29-30
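
A minimal sketch of the guard described in this comment: validate the key once, then reuse the validated value in the header. The function name, the shape of `env`, and the error message are assumptions; the `Token` authorization scheme is the one Deepgram documents for API keys.

```typescript
// Hypothetical shape: `env` stands in for the app's environment object.
function buildDeepgramHeaders(env: { DEEPGRAM_API_KEY?: string }): Record<string, string> {
  const apiKey = env.DEEPGRAM_API_KEY;
  if (!apiKey) {
    // Without this guard the header would be built from undefined.
    throw new Error("DEEPGRAM_API_KEY is not configured");
  }
  return { Authorization: `Token ${apiKey}` };
}
```

A caller can catch this error at the request boundary and respond with a 5xx, which matches the advice above about not letting it crash the process.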

Fix type narrowing issue where providerParam could be null when assigned
to provider variable. Now properly checks providerParam is truthy before
using the type guard result.

Co-Authored-By: yujonglee <[email protected]>
@yujonglee yujonglee closed this Dec 4, 2025
@yujonglee yujonglee deleted the devin/1764856000-stt-readability-reliability branch December 4, 2025 23:40