OpenCTO Autonomous PR Review (2026-03-16T23:21:00.200Z)
Decision: changes_requested
The PR adds significant new functionality for Google live backend paths including a new FastAPI service, routing changes, and extensive documentation. However, the validation section in the PR description is incomplete, and the author did not perform live end-to-end testing using actual Google Cloud credentials, which raises concerns about the reliability and correctness of the implementation in a production environment.
Concerns:

    class LiveSession(Protocol):
        async def send_text(self, text: str) -> None: ...
        async def send_audio(self, data: bytes, mime_type: str) -> None: ...
        async def send_video(self, data: bytes, mime_type: str) -> None: ...
        async def send_tool_responses(self, responses: list[dict[str, Any]]) -> None: ...
        async def receive(self) -> AsyncIterator[Any]: ...

    class LiveSessionContext(Protocol):
        async def __aenter__(self) -> LiveSession: ...
        async def __aexit__(self, exc_type, exc, tb) -> None: ...

    class LiveSessionFactory(Protocol):
        def connect(self, model: str, setup_config: dict[str, Any]) -> LiveSessionContext: ...

    if callable(close_fn):
        result = close_fn()
        if asyncio.iscoroutine(result):
            await result

Greptile Summary
This PR adds a full Google Vertex AI Live (Gemini) realtime voice path alongside the existing OpenAI realtime path. It introduces a new FastAPI Python service (
Key findings:
Confidence Score: 3/5
Important Files Changed
Sequence Diagram
sequenceDiagram
    participant Browser as Dashboard
    participant Worker as API Worker
    participant Python as Google Live Backend
    participant Google as Google Vertex AI
    Browser->>Worker: POST /api/v1/google-live/session
    Worker->>Worker: Authenticate + rate limit (x2 bug)
    Worker->>Worker: Select allowed model
    Worker->>Worker: Sign short-lived token
    Worker-->>Browser: wsUrl + signed token
    Browser->>Python: WebSocket /ws/live (token in query)
    Python->>Python: Verify token + expiry
    Browser->>Python: setup frame (model, instructions, tools)
    Python->>Google: live.connect(model, config)
    Python-->>Browser: setupComplete
    loop Realtime streaming
        Browser->>Python: audio chunks
        Python->>Google: send_realtime_input
        Google-->>Python: audio + transcriptions
        Python-->>Browser: serverContent frame
        Google-->>Python: toolCall
        Python-->>Browser: toolCall frame
        Browser->>Python: toolResponse
        Python->>Google: send_tool_response
    end
Last reviewed commit: e6ec5f3
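The "Sign short-lived token" and "Verify token + expiry" steps in the diagram could be sketched roughly as follows. This is not the PR's implementation: the token layout (a base64url JSON payload, a dot, then an HMAC-SHA256 signature), the shared secret, and the names `sign_session_token` and `verify_session_token` are all assumptions made for illustration.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Encode bytes as unpadded base64url text."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode unpadded base64url text, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_session_token(payload: dict, secret: bytes, ttl_seconds: int = 60) -> str:
    """Hypothetical signer: adds an `exp` claim and appends an HMAC signature."""
    body = dict(payload, exp=int(time.time()) + ttl_seconds)
    encoded = _b64url(json.dumps(body, sort_keys=True).encode("utf-8"))
    signature = hmac.new(secret, encoded.encode("ascii"), hashlib.sha256).digest()
    return f"{encoded}.{_b64url(signature)}"


def verify_session_token(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    """Hypothetical verifier: checks the signature first, then the expiry."""
    encoded, _, signature = token.partition(".")
    expected = hmac.new(secret, encoded.encode("ascii"), hashlib.sha256).digest()
    try:
        provided = _b64url_decode(signature)
    except ValueError:
        raise ValueError("malformed token") from None
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(expected, provided):
        raise ValueError("invalid signature")
    body = json.loads(_b64url_decode(encoded))
    current = time.time() if now is None else now
    if body.get("exp", 0) <= current:
        raise ValueError("token expired")
    return body
```

Checking the signature before parsing the payload means the verifier never interprets claims from a token it cannot authenticate.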
    if (path === '/api/v1/google-live/session' && method === 'POST') {
      const body = await request.clone().json().catch(() => ({})) as { workspaceId?: string }
      await enforceRateLimit(ctx, 'google_live_session', {
        limit: parseRateLimit(ctx.env.RATE_LIMIT_GOOGLE_LIVE_SESSIONS_PER_MINUTE, DEFAULT_REALTIME_RATE_LIMIT_PER_MINUTE),
        windowSeconds: 60,
        workspaceId: body.workspaceId,
      })
      return await googleLive.createGoogleLiveSession(request, ctx)
    }
Double rate-limit enforcement per request
The route handler enforces google_live_session rate limiting here (lines 312–317), and then calls googleLive.createGoogleLiveSession → mintGoogleLiveSession, which enforces the same rate limit key a second time (see googleLive.ts lines 47–51).
Every request therefore consumes two credits against the same counter, effectively halving whatever limit is configured in RATE_LIMIT_GOOGLE_LIVE_SESSIONS_PER_MINUTE. For comparison, the realtime-token route (lines 300–308) only does rate-limiting in the router, and mintRealtimeToken does not call enforceRateLimit internally, so that route enforces the limit exactly once per request. That is the correct pattern.
Remove the enforceRateLimit call from the router here, keeping only the one inside mintGoogleLiveSession, so that the limit logic remains self-contained in the feature module.
Suggested change:
-   if (path === '/api/v1/google-live/session' && method === 'POST') {
-     const body = await request.clone().json().catch(() => ({})) as { workspaceId?: string }
-     await enforceRateLimit(ctx, 'google_live_session', {
-       limit: parseRateLimit(ctx.env.RATE_LIMIT_GOOGLE_LIVE_SESSIONS_PER_MINUTE, DEFAULT_REALTIME_RATE_LIMIT_PER_MINUTE),
-       windowSeconds: 60,
-       workspaceId: body.workspaceId,
-     })
-     return await googleLive.createGoogleLiveSession(request, ctx)
-   }
+   if (path === '/api/v1/google-live/session' && method === 'POST') {
+     return await googleLive.createGoogleLiveSession(request, ctx)
+   }
    fastapi
    uvicorn[standard]
    python-dotenv
    pytest
    httpx
    google-genai
Unpinned dependency versions
All six packages are listed without version constraints. This makes builds non-reproducible — a later pip install can silently pull in a breaking release of google-genai, fastapi, or uvicorn.
Pin versions to the exact revisions that were tested, for example:
Suggested change:
-   fastapi
-   uvicorn[standard]
-   python-dotenv
-   pytest
-   httpx
-   google-genai
+   fastapi==0.115.12
+   uvicorn[standard]==0.34.0
+   python-dotenv==1.1.0
+   pytest==8.3.5
+   httpx==0.28.1
+   google-genai==1.10.0
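A check along these lines could run in CI to catch unpinned entries before they reach a build. It is a minimal sketch: `find_unpinned` is a hypothetical helper, and it only understands simple `name==version` lines (optionally with extras), not `-r` includes, URLs, or environment markers.

```python
import re

# Matches "name==version" or "name[extra1,extra2]==version" and nothing looser.
_PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[A-Za-z0-9._,\s-]+\])?\s*==\s*[^\s=]+$")


def find_unpinned(requirements_text: str) -> list[str]:
    """Return the requirement lines that lack an exact `==` pin."""
    unpinned = []
    for raw in requirements_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        if not _PINNED.match(line):
            unpinned.append(line)
    return unpinned


if __name__ == "__main__":
    current = "fastapi\nuvicorn[standard]\npython-dotenv\npytest\nhttpx\ngoogle-genai\n"
    print(find_unpinned(current))  # all six lines are reported as unpinned
```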
      return jsonResponse({
        provider: 'google_vertex',
        mode: 'vertex_live',
        model: selectedModel,
        wsUrl,
        websocketUrl: wsUrl,
        sessionToken,
        workspaceId,
        sessionId,
        traceId: ctx.traceContext.traceId,
        expiresAt: new Date(payload.exp * 1000).toISOString(),
      })
    }
Redundant websocketUrl field in response
wsUrl and websocketUrl are always set to the exact same value (both assigned wsUrl on lines 82–83). The GoogleLiveSessionBootstrap interface in shared.ts only references wsUrl, and googleAdapter.ts only reads bootstrap.wsUrl. The websocketUrl alias is dead code that unnecessarily inflates the response payload.
Suggested change:
-     return jsonResponse({
-       provider: 'google_vertex',
-       mode: 'vertex_live',
-       model: selectedModel,
-       wsUrl,
-       websocketUrl: wsUrl,
-       sessionToken,
-       workspaceId,
-       sessionId,
-       traceId: ctx.traceContext.traceId,
-       expiresAt: new Date(payload.exp * 1000).toISOString(),
-     })
-   }
+     return jsonResponse({
+       provider: 'google_vertex',
+       mode: 'vertex_live',
+       model: selectedModel,
+       wsUrl,
+       sessionToken,
+       workspaceId,
+       sessionId,
+       traceId: ctx.traceContext.traceId,
+       expiresAt: new Date(payload.exp * 1000).toISOString(),
+     })
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e6ec5f3700
    await enforceRateLimit(ctx, 'google_live_session', {
      limit: parseRateLimit(ctx.env.RATE_LIMIT_GOOGLE_LIVE_SESSIONS_PER_MINUTE, DEFAULT_REALTIME_RATE_LIMIT_PER_MINUTE),
      windowSeconds: 60,
      workspaceId: body.workspaceId,
    })
Remove duplicate Google Live session rate-limit enforcement
This route already increments the google_live_session bucket before dispatch, but mintGoogleLiveSession in opencto-api-worker/src/googleLive.ts enforces the same bucket again, and enforceRateLimit increments on every check. That means each /api/v1/google-live/session request consumes two quota units, so users hit 429 much earlier than configured (with current defaults, about 6 successful requests/minute instead of 12), which can break normal reconnect/bootstrap flows.
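The arithmetic in this comment can be checked with a toy fixed-window counter. The sketch below is a stand-in rather than the worker's code: `FixedWindowLimiter` and `count_successes` are hypothetical names, and the only behavior taken from the comment is that the limiter increments on every check and that the default limit is 12.

```python
class RateLimitExceeded(Exception):
    """Raised when a bucket has gone over its limit for the window."""


class FixedWindowLimiter:
    """Toy limiter for a single window; every enforce() call consumes one unit."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.counts: dict[str, int] = {}

    def enforce(self, key: str) -> None:
        count = self.counts.get(key, 0) + 1
        self.counts[key] = count
        if count > self.limit:
            raise RateLimitExceeded(key)


def count_successes(limit: int, checks_per_request: int, attempts: int) -> int:
    """Count requests that pass when each request triggers N checks on one bucket."""
    limiter = FixedWindowLimiter(limit)
    successes = 0
    for _ in range(attempts):
        try:
            for _ in range(checks_per_request):
                limiter.enforce("google_live_session")
        except RateLimitExceeded:
            continue
        successes += 1
    return successes


if __name__ == "__main__":
    print(count_successes(limit=12, checks_per_request=1, attempts=20))  # 12
    print(count_successes(limit=12, checks_per_request=2, attempts=20))  # 6
```

With one check per request the configured 12 requests succeed; with the duplicated check only 6 do, which matches the figure quoted above.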
    voice: this.config.voice,
    agentProfile: this.config.agentProfile ?? 'dispatch',
Send Gemini voice setting in the shape the backend reads
The setup payload now sends the selected voice as top-level setup.voice, but the new backend only derives voice from setup.generationConfig.speechConfig via extract_voice_name(generation_config) in opencto-google-live-backend/app.py. In the current flow this means speech_config is never set, so Gemini Live sessions ignore the user’s chosen voice and fall back to the provider default.
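One way to make the backend tolerant of both shapes is to read the nested speech config first and fall back to the top-level field. The following is a sketch under assumptions, not the PR's `extract_voice_name`: `resolve_voice_name` is a hypothetical helper, and the nested key path `speechConfig.voiceConfig.prebuiltVoiceConfig.voiceName` is assumed from the Gemini API's speech configuration shape. The review comment itself only confirms that the backend reads `setup.generationConfig.speechConfig`.

```python
from typing import Any, Optional

# Assumed nesting under setup["generationConfig"]; not confirmed by the PR.
_VOICE_PATH = ("speechConfig", "voiceConfig", "prebuiltVoiceConfig", "voiceName")


def resolve_voice_name(setup: dict[str, Any]) -> Optional[str]:
    """Return the requested voice, preferring the nested config over setup['voice']."""
    node: Any = setup.get("generationConfig")
    for key in _VOICE_PATH:
        if not isinstance(node, dict):
            node = None
            break
        node = node.get(key)
    if isinstance(node, str) and node.strip():
        return node.strip()

    # Fallback: the top-level field the dashboard currently sends.
    top_level = setup.get("voice")
    if isinstance(top_level, str) and top_level.strip():
        return top_level.strip()
    return None
```

The alternative fix is on the client: send the voice inside `generationConfig.speechConfig` so the existing backend code picks it up unchanged.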
OpenCTO Autonomous PR Review (2026-03-17T00:42:53.726Z)
Decision: changes_requested
The PR adds a new Google live backend path with associated API worker services, documentation, and dashboard routing. While the implementation appears comprehensive and well-documented, the lack of end-to-end live Vertex AI session testing due to missing production Google Cloud credentials is a critical concern that needs to be addressed before approval.
Concerns:
OpenCTO Autonomous PR Review (2026-03-17T00:53:03.788Z)
Decision: changes_requested
The PR introduces a new backend path for Google live sessions and related documentation and tests, but it lacks critical validation steps involving live end-to-end tests with Google Cloud credentials. The implementation is extensive, but the absence of real environment testing raises concerns about production readiness and integration robustness.
Concerns:
OpenCTO Autonomous PR Review (2026-03-17T22:31:55.096Z)
Decision: changes_requested
The PR introduces a significant new backend path for Google live sessions with extensive documentation and code changes. However, there are concerns regarding validation and testing completeness that need to be addressed before approval.
Concerns:
Summary
Validation
Notes