Add more metrics #143
Conversation
Walkthrough

Adds OpenTelemetry-based observability: tracer/meter and many metrics, a reusable Timer utility (context manager/decorator/callable) with tests; instruments core LLM/STT/TTS/TurnDetection flows to record latency and errors; updates plugins to the new `_simple_response`/`_process_audio`/`detect_turn` hooks; and provides an observability stack (Prometheus/Grafana/Jaeger) and dashboards.

Changes
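To make the rest of this walkthrough easier to follow, here is a minimal sketch of how the Timer utility is used in the instrumented paths, based on the snippets quoted later in this review (the import of `stt_latency_ms` and the attribute names are assumptions):

```python
# Hedged sketch, not the actual implementation. Timer's import path matches
# agents-core/vision_agents/core/observability/metrics.py referenced below;
# the stt_latency_ms histogram name comes from the sequence diagrams.
from vision_agents.core.observability.metrics import Timer, stt_latency_ms


async def process_audio_instrumented(stt, pcm):
    # Context-manager form: elapsed time is recorded to the histogram when
    # the block exits, together with the attributes attached inside it.
    with Timer(stt_latency_ms) as timer:
        timer.attributes["provider"] = "deepgram"  # illustrative value
        timer.attributes["samples"] = len(pcm.samples)
        await stt._process_audio(pcm)  # provider-specific hook
```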
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant CoreLLM as LLM (core)
    participant Timer
    participant Meter as Meter(OTel)
    participant Impl as LLM Impl (_simple_response)
    Client->>CoreLLM: simple_response(text)
    activate CoreLLM
    CoreLLM->>Timer: start(llm_latency_ms, attrs)
    activate Timer
    CoreLLM->>Impl: _simple_response(text)
    activate Impl
    Impl-->>CoreLLM: LLMResponseEvent / raises
    deactivate Impl
    alt success
        CoreLLM->>Timer: stop()
        Timer->>Meter: record(elapsed_ms, attributes)
    else error
        CoreLLM->>Meter: llm_errors.add(1, {provider, error_type})
        CoreLLM->>Timer: stop(exception=True)
        Timer->>Meter: record(elapsed_ms, attributes_with_exception)
    end
    CoreLLM-->>Client: LLMResponseEvent or raise
    deactivate CoreLLM
```
```mermaid
sequenceDiagram
    participant Caller
    participant CoreSTT as STT (core)
    participant Timer
    participant Meter as Meter(OTel)
    participant Impl as STT Impl (_process_audio)
    participant Counter as stt_errors
    Caller->>CoreSTT: process_audio(pcm)
    activate CoreSTT
    CoreSTT->>Timer: start(stt_latency_ms, attrs)
    activate Timer
    CoreSTT->>Impl: _process_audio(pcm)
    activate Impl
    Impl-->>CoreSTT: success / raises
    deactivate Impl
    alt success
        CoreSTT->>Timer: stop()
        Timer->>Meter: record(elapsed_ms, attributes)
    else error
        CoreSTT->>Counter: add(1, {provider, error_type})
        CoreSTT->>Timer: stop(exception=True)
        Timer->>Meter: record(elapsed_ms, attributes_with_exception)
    end
    CoreSTT-->>Caller: returns / raises
    deactivate CoreSTT
```
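Both diagrams describe the same wrapper pattern: a public method on the core base class owns the metrics, and providers only implement the private hook. A hedged sketch of that shape (not the actual base-class code; metric names follow the diagrams above):

```python
# Illustrative sketch only. Timer, llm_latency_ms, and llm_errors are the
# names referenced throughout this review; signatures here are assumptions.
from vision_agents.core.observability.metrics import Timer, llm_latency_ms, llm_errors


class LLM:
    provider_name = "base"

    async def simple_response(self, text, processors=None, participant=None):
        # Public entry point: wraps the provider hook with latency/error metrics.
        with Timer(llm_latency_ms) as timer:
            timer.attributes["provider"] = self.provider_name
            try:
                return await self._simple_response(text, processors, participant)
            except Exception as e:
                timer.attributes["error"] = type(e).__name__
                llm_errors.add(
                    1, {"provider": self.provider_name, "error_type": type(e).__name__}
                )
                raise

    async def _simple_response(self, text, processors=None, participant=None):
        raise NotImplementedError  # each plugin overrides this hook
```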
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

Areas to focus review on:
Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
📜 Recent review details

Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
📒 Files selected for processing (1)
✅ Files skipped from review due to trivial changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
plugins/krisp/vision_agents/plugins/krisp/turn_detection.py (1)
111-117: Fix type mismatch when dispatching PCM frames

`process_pcm_turn_taking` expects a `PcmData` object (it reads `sample_rate` and `samples`), but `detect_turn` passes a raw `ndarray`. The first call to `pcm.sample_rate` will throw an `AttributeError`, so turn detection never runs. Keep the `PcmData` wrapper when resampling.

```diff
-        resampled_pcm = audio_data.resample(16_000, 1).samples
+        resampled_pcm = audio_data.resample(16_000, 1)
         try:
             loop = asyncio.get_event_loop()
             await loop.run_in_executor(
-                None, self.process_pcm_turn_taking, resampled_pcm, user_id, metadata
+                None, self.process_pcm_turn_taking, resampled_pcm, user_id, metadata
             )
```

plugins/fish/vision_agents/plugins/fish/stt.py (1)
63-70: Align docstring with the actual signature

The docstring still references `user_metadata` and returning a list of tuples, but the method only takes `participant` and returns `None`. Please update the Args/Returns sections to reflect reality so the Google-style doc stays accurate.
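For reference, an aligned Google-style docstring might read like this sketch (illustrative wording; the parameter set follows the comment above):

```python
async def _process_audio(self, pcm_data, participant=None) -> None:
    """Transcribe a chunk of PCM audio via Fish Audio.

    Args:
        pcm_data: PcmData object containing the audio samples to transcribe.
        participant: Optional participant metadata attached to emitted
            transcript events.

    Returns:
        None. Transcripts are delivered through emitted events rather than
        a return value.
    """
```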
🧹 Nitpick comments (3)
plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py (1)
32-35: Clean up duplicate default resolution assignments.

Line 32 redefines `DEFAULT_WIDTH`/`DEFAULT_HEIGHT` twice, leaving both 640x480 and 1920x1080 literals in place. Please keep only the intended defaults so the module state is unambiguous.

plugins/wizper/vision_agents/plugins/wizper/stt.py (1)
96-96: TODO indicates known service limitation.

Line 96 documents a bug in the fal-ai/wizper service where only the "transcribe" task works. Consider tracking this with an issue if not already done, as the `task` and `target_language` constructor parameters suggest broader intended functionality.

Would you like me to create an issue to track this limitation and monitor for when the fal-ai/wizper service fixes the bug?
observability/grafana/init-home-dashboard.sh (1)
26-30: Hardcoded credentials acceptable for local development.

The use of `admin:admin` credentials is appropriate for the local Docker Compose development setup described in the PR. For production deployments, these credentials should be externalized via environment variables.

Based on learnings

To make this more flexible, consider this enhancement:

```diff
+GRAFANA_USER="${GRAFANA_USER:-admin}"
+GRAFANA_PASSWORD="${GRAFANA_PASSWORD:-admin}"
+
 echo ""
 echo "Setting admin user home dashboard..."
 curl -X PUT \
-  -u "admin:admin" \
+  -u "${GRAFANA_USER}:${GRAFANA_PASSWORD}" \
   -H "Content-Type: application/json" \
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (27)
- agents-core/tests/test_timer.py (1 hunks)
- agents-core/vision_agents/core/llm/llm.py (9 hunks)
- agents-core/vision_agents/core/observability/metrics.py (2 hunks)
- agents-core/vision_agents/core/stt/stt.py (5 hunks)
- agents-core/vision_agents/core/tts/tts.py (1 hunks)
- agents-core/vision_agents/core/turn_detection/turn_detection.py (4 hunks)
- examples/01_simple_agent_example/simple_agent_example.py (5 hunks)
- observability/.gitignore (1 hunks)
- observability/README.md (1 hunks)
- observability/grafana/dashboards/vision-agents.json (1 hunks)
- observability/grafana/init-home-dashboard.sh (1 hunks)
- observability/grafana/provisioning/dashboards/default.yml (1 hunks)
- observability/grafana/provisioning/datasources/prometheus.yml (1 hunks)
- observability/prometheus/prometheus.yml (1 hunks)
- plugins/anthropic/vision_agents/plugins/anthropic/anthropic_llm.py (15 hunks)
- plugins/aws/vision_agents/plugins/aws/aws_llm.py (18 hunks)
- plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py (7 hunks)
- plugins/fish/vision_agents/plugins/fish/stt.py (2 hunks)
- plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py (12 hunks)
- plugins/krisp/vision_agents/plugins/krisp/turn_detection.py (1 hunks)
- plugins/openai/vision_agents/plugins/openai/openai_llm.py (2 hunks)
- plugins/openrouter/vision_agents/plugins/openrouter/openrouter_llm.py (4 hunks)
- plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (10 hunks)
- plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py (2 hunks)
- plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (16 hunks)
- plugins/wizper/vision_agents/plugins/wizper/stt.py (1 hunks)
- plugins/xai/vision_agents/plugins/xai/llm.py (6 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/python.mdc)
**/*.py: Do not modify sys.path in Python code
Docstrings must follow the Google style guide
Files:
- plugins/fish/vision_agents/plugins/fish/stt.py
- plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py
- plugins/wizper/vision_agents/plugins/wizper/stt.py
- plugins/openrouter/vision_agents/plugins/openrouter/openrouter_llm.py
- agents-core/vision_agents/core/tts/tts.py
- plugins/xai/vision_agents/plugins/xai/llm.py
- plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py
- plugins/krisp/vision_agents/plugins/krisp/turn_detection.py
- agents-core/tests/test_timer.py
- agents-core/vision_agents/core/turn_detection/turn_detection.py
- examples/01_simple_agent_example/simple_agent_example.py
- agents-core/vision_agents/core/stt/stt.py
- plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py
- plugins/anthropic/vision_agents/plugins/anthropic/anthropic_llm.py
- plugins/openai/vision_agents/plugins/openai/openai_llm.py
- plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py
- agents-core/vision_agents/core/observability/metrics.py
- agents-core/vision_agents/core/llm/llm.py
- plugins/aws/vision_agents/plugins/aws/aws_llm.py
- plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py
🧠 Learnings (1)
📚 Learning: 2025-10-13T22:00:34.300Z
Learnt from: dangusev
Repo: GetStream/Vision-Agents PR: 98
File: plugins/deepgram/vision_agents/plugins/deepgram/stt.py:135-150
Timestamp: 2025-10-13T22:00:34.300Z
Learning: In the Deepgram STT plugin (plugins/deepgram/vision_agents/plugins/deepgram/stt.py), the `started()` method is designed to wait for the connection attempt to complete, not to guarantee a successful connection. It's acceptable for the connection attempt to fail, and downstream code handles the case where `self.dg_connection` is `None`. The `_connected_once` event is set in the `finally` block intentionally to signal attempt completion.
Applied to files:
plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py
🧬 Code graph analysis (17)
plugins/fish/vision_agents/plugins/fish/stt.py (3)
- agents-core/vision_agents/core/stt/stt.py (2): _process_audio (151-166), _emit_transcript_event (41-63)
- plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py (1): _process_audio (78-114)
- plugins/wizper/vision_agents/plugins/wizper/stt.py (1): _process_audio (60-128)

plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (3)
- agents-core/vision_agents/core/observability/metrics.py (1): Timer (62-199)
- agents-core/vision_agents/core/turn_detection/turn_detection.py (1): detect_turn (72-84)
- plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (3): detect_turn (182-193), predict_speech (518-546), _predict_turn_completed (405-456)

plugins/wizper/vision_agents/plugins/wizper/stt.py (3)
- agents-core/vision_agents/core/stt/stt.py (1): _process_audio (151-166)
- plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py (1): _process_audio (78-114)
- plugins/fish/vision_agents/plugins/fish/stt.py (1): _process_audio (52-138)

plugins/xai/vision_agents/plugins/xai/llm.py (4)
- agents-core/vision_agents/core/llm/events.py (2): LLMResponseChunkEvent (90-105), LLMResponseCompletedEvent (109-115)
- agents-core/vision_agents/core/llm/llm.py (2): _simple_response (88-98), LLMResponseEvent (35-39)
- plugins/openai/vision_agents/plugins/openai/openai_llm.py (1): _simple_response (81-109)
- plugins/xai/vision_agents/plugins/xai/events.py (1): XAIChunkEvent (7-10)

plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py (1)
- agents-core/vision_agents/core/observability/metrics.py (1): Timer (62-199)

plugins/krisp/vision_agents/plugins/krisp/turn_detection.py (3)
- agents-core/vision_agents/core/turn_detection/turn_detection.py (1): detect_turn (72-84)
- plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (1): detect_turn (158-170)
- plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (1): detect_turn (182-193)

agents-core/tests/test_timer.py (2)
- agents-core/vision_agents/core/observability/metrics.py (4): Timer (62-199), elapsed_ms (143-144), stop (125-141), _restart (146-150)
- agents-core/vision_agents/core/stt/stt.py (1): process_audio (116-148)

agents-core/vision_agents/core/turn_detection/turn_detection.py (4)
- agents-core/vision_agents/core/observability/metrics.py (1): Timer (62-199)
- plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (1): detect_turn (158-170)
- plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (1): detect_turn (182-193)
- plugins/krisp/vision_agents/plugins/krisp/turn_detection.py (1): detect_turn (78-119)

examples/01_simple_agent_example/simple_agent_example.py (1)
- plugins/krisp/vision_agents/plugins/krisp/turn_detection.py (1): TurnDetection (34-202)

agents-core/vision_agents/core/stt/stt.py (4)
- agents-core/vision_agents/core/observability/metrics.py (1): Timer (62-199)
- agents-core/vision_agents/core/tts/tts.py (1): send (238-343)
- agents-core/vision_agents/core/turn_detection/turn_detection.py (1): process_audio (44-69)
- plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py (1): _process_audio (78-114)

plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py (6)
- agents-core/vision_agents/core/llm/events.py (2): LLMResponseCompletedEvent (109-115), LLMResponseChunkEvent (90-105)
- agents-core/vision_agents/core/llm/llm.py (5): _simple_response (88-98), LLMResponseEvent (35-39), _dedup_and_execute (379-413), _convert_tools_to_provider_format (139-153), _extract_tool_calls_from_response (155-169)
- agents-core/vision_agents/core/agents/conversation.py (1): Message (14-24)
- plugins/gemini/vision_agents/plugins/gemini/events.py (1): GeminiResponseEvent (35-38)
- plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2): _convert_tools_to_provider_format (509-532), _extract_tool_calls_from_response (534-569)
- agents-core/vision_agents/core/llm/llm_types.py (2): ToolSchema (64-67), NormalizedToolCallItem (107-111)

plugins/anthropic/vision_agents/plugins/anthropic/anthropic_llm.py (3)
- agents-core/vision_agents/core/llm/events.py (2): LLMResponseChunkEvent (90-105), LLMResponseCompletedEvent (109-115)
- agents-core/vision_agents/core/llm/llm.py (7): _simple_response (88-98), _dedup_and_execute (379-413), _sanitize_tool_output (415-426), _convert_tools_to_provider_format (139-153), _extract_tool_calls_from_response (155-169), LLMResponseEvent (35-39), _create_tool_result_message (187-202)
- agents-core/vision_agents/core/llm/llm_types.py (2): NormalizedToolCallItem (107-111), ToolSchema (64-67)

plugins/openai/vision_agents/plugins/openai/openai_llm.py (5)
- agents-core/vision_agents/core/llm/llm.py (1): _simple_response (88-98)
- plugins/anthropic/vision_agents/plugins/anthropic/anthropic_llm.py (1): _simple_response (75-95)
- plugins/aws/vision_agents/plugins/aws/aws_llm.py (1): _simple_response (81-101)
- plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py (1): _simple_response (69-86)
- plugins/xai/vision_agents/plugins/xai/llm.py (1): _simple_response (71-96)

plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py (1)
- agents-core/vision_agents/core/stt/stt.py (1): _process_audio (151-166)

agents-core/vision_agents/core/llm/llm.py (5)
- agents-core/vision_agents/core/observability/metrics.py (1): Timer (62-199)
- plugins/openai/vision_agents/plugins/openai/openai_llm.py (4): _simple_response (81-109), _convert_tools_to_provider_format (372-404), _extract_tool_calls_from_response (406-433), _create_tool_result_message (435-465)
- agents-core/vision_agents/core/llm/llm_types.py (2): ToolSchema (64-67), NormalizedToolCallItem (107-111)
- agents-core/vision_agents/core/llm/function_registry.py (1): get_tool_schemas (108-114)
- agents-core/vision_agents/core/llm/events.py (2): ToolStartEvent (119-125), ToolEndEvent (129-138)

plugins/aws/vision_agents/plugins/aws/aws_llm.py (5)
- plugins/aws/tests/test_aws.py (1): llm (37-44)
- agents-core/vision_agents/core/llm/events.py (2): LLMResponseChunkEvent (90-105), LLMResponseCompletedEvent (109-115)
- agents-core/vision_agents/core/llm/llm.py (4): _simple_response (88-98), LLMResponseEvent (35-39), _dedup_and_execute (379-413), _sanitize_tool_output (415-426)
- agents-core/vision_agents/core/llm/llm_types.py (1): NormalizedToolCallItem (107-111)
- plugins/aws/vision_agents/plugins/aws/events.py (1): AWSStreamEvent (7-10)

plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (2)
- agents-core/vision_agents/core/observability/metrics.py (1): Timer (62-199)
- plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (4): _prepare_silero_vad (150-156), SileroVAD (419-486), detect_turn (158-170), predict_speech (454-455)
🪛 Gitleaks (8.28.0)
observability/grafana/init-home-dashboard.sh
[high] 26-27: Discovered a potential basic authorization token provided in a curl command, which could compromise the curl accessed resource.
(curl-auth-user)
🪛 markdownlint-cli2 (0.18.1)
observability/README.md
19-19: Bare URL used
(MD034, no-bare-urls)
20-20: Bare URL used
(MD034, no-bare-urls)
21-21: Bare URL used
(MD034, no-bare-urls)
38-38: Bare URL used
(MD034, no-bare-urls)
51-51: Bare URL used
(MD034, no-bare-urls)
59-59: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
69-69: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
137-137: Bare URL used
(MD034, no-bare-urls)
138-138: Bare URL used
(MD034, no-bare-urls)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: unit / Ruff & mypy
- GitHub Check: unit / Test "not integration"
🔇 Additional comments (15)
plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py (1)
78-115: LGTM! Method rename aligns with observability pattern.

The rename from `process_audio` to `_process_audio` is consistent with the broader refactoring to support metrics collection at the base class level. The docstring follows Google style, and the logic correctly handles connection readiness, resampling, and audio transmission.

plugins/anthropic/vision_agents/plugins/anthropic/anthropic_llm.py (4)
68-68: Provider name attribute added correctly.

The addition of `provider_name = "anthropic"` enables consistent provider identification in metrics and events across the observability stack.
75-95: LGTM! Method visibility change aligns with base class pattern.

The rename to `_simple_response` is consistent with the standardized approach where the base class provides a public wrapper with metrics collection, delegating to this private implementation method.
132-217: Tool calling logic is well-structured.

The multi-hop tool execution with deduplication (via `_dedup_and_execute`) and bounded iteration (MAX_ROUNDS=3) provides robust handling of Claude's tool use pattern. The formatting improvements enhance readability without altering behavior.
218-349: Streaming tool execution correctly implemented.

The streaming path maintains consistency with the non-streaming approach, properly accumulating tool calls across chunks and executing them in multi-hop fashion. Event emissions correctly identify the provider.
plugins/xai/vision_agents/plugins/xai/llm.py (2)
62-62: LGTM! Provider standardization implemented correctly.

The addition of `provider_name = "xai"` and the rename to `_simple_response` align with the cross-provider standardization pattern. The unused `participant` parameter in the method signature matches the base class interface requirements.

Also applies to: 71-96
148-152: Event emissions correctly formatted.

The multi-line event emission formatting improves readability, and all events consistently include `plugin_name="xai"` for proper observability tracking.

Also applies to: 184-193, 199-205
observability/grafana/provisioning/dashboards/default.yml (2)
8-8: Verify dashboard deletion setting aligns with PR guidance.

Line 8 sets `disableDeletion: false`, allowing dashboards to be deleted via the UI. However, the PR description states "Do not modify Grafana directly; changes should be requested via AI." Consider setting this to `true` to prevent accidental deletions and enforce the file-based workflow.

If the intention is to allow UI modifications only for experimentation (not persistence), apply this diff:

```diff
-  disableDeletion: false
+  disableDeletion: true
```
1-13: Provisioning configuration is well-structured.

The dashboard provider configuration correctly points to the standard Grafana dashboards directory with folder structure preservation enabled. The 10-second update interval provides quick feedback during development.
plugins/wizper/vision_agents/plugins/wizper/stt.py (1)
60-129: LGTM! Implementation follows STT plugin pattern.

The rename to `_process_audio` is consistent with the base class design. The implementation properly handles temporary file lifecycle with cleanup in a `finally` block, and error handling emits appropriate events.

examples/01_simple_agent_example/simple_agent_example.py (2)
60-94: Observability stack integration is well-implemented.

The telemetry setup correctly configures OpenTelemetry with Prometheus metrics exposition on port 9464 and OTLP tracing to localhost:4317. The resource identification and graceful shutdown handling via `atexit` are appropriate.
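For readers reproducing this setup, a minimal OpenTelemetry configuration matching the description above might look like the following sketch (the service name is an assumption; the ports follow the comment):

```python
# Hedged sketch of the telemetry wiring described above, not the example's code.
import atexit

from opentelemetry import metrics, trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.exporter.prometheus import PrometheusMetricReader
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from prometheus_client import start_http_server

resource = Resource.create({"service.name": "simple-agent-example"})  # assumed name

# Expose metrics for Prometheus to scrape on port 9464.
start_http_server(9464)
metrics.set_meter_provider(
    MeterProvider(resource=resource, metric_readers=[PrometheusMetricReader()])
)

# Ship traces to the local OTLP endpoint (Jaeger) on port 4317.
tracer_provider = TracerProvider(resource=resource)
tracer_provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="localhost:4317", insecure=True))
)
trace.set_tracer_provider(tracer_provider)

# Flush telemetry on interpreter exit.
atexit.register(tracer_provider.shutdown)
```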
6-6: Verification complete: Migration from `vogent.TurnDetection` to `smart_turn.TurnDetection` is successful.

The verification confirms no references to `vogent.TurnDetection` remain in the codebase. The file has been correctly updated to import and use `smart_turn.TurnDetection()` on line 25, with the corresponding import statement on line 6. This aligns with the intended module organization where VogentTurnDetection remains available as a separate plugin while SmartTurnDetection becomes the default in examples.

agents-core/vision_agents/core/tts/tts.py (1)
341-343: Improved metric identification with fully qualified class paths.

The change from `self.__class__.__name__` to the fully qualified path (module.qualname) provides better metric disambiguation when multiple TTS providers might share simple class names. This aligns with the observability enhancements throughout the codebase.
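Concretely, the fully qualified identifier can be built in one line (a sketch; the example value is hypothetical):

```python
# e.g. "vision_agents.plugins.wizper.stt.STT" instead of just "STT" (illustrative)
provider = f"{self.__class__.__module__}.{self.__class__.__qualname__}"
```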
observability/grafana/init-home-dashboard.sh (1)

1-33: Script correctly configures home dashboard.

The initialization script properly waits for Grafana health, then sets the Vision Agents dashboard as the default home view for both the organization and the admin user. The dashboard UID reference aligns with the provisioned dashboard configuration.
agents-core/tests/test_timer.py (1)
18-440: Comprehensive Timer coverage

Love how these tests hammer the context-manager, decorator, inheritance, and async paths; this suite should catch regressions fast.
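As a flavor of what that suite covers, here is a hedged sketch of a context-manager test (names follow the metrics module referenced in this review; the real assertions may differ):

```python
# Sketch only: assumes Timer exposes elapsed_ms after the block exits,
# as the Timer (62-199) / elapsed_ms (143-144) references above suggest.
import time

from vision_agents.core.observability.metrics import Timer, llm_latency_ms


def test_timer_records_elapsed_time():
    with Timer(llm_latency_ms) as timer:
        timer.attributes["provider"] = "test"
        time.sleep(0.01)
    assert timer.elapsed_ms > 0
```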
Resolved review threads:
- plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py
- plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py
Actionable comments posted: 1
🧹 Nitpick comments (6)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2)
131-146: Consider warning callers when ignored parameters are used.

The method accepts `system_prompt`, `temperature`, and `max_tokens` but silently ignores them. While this appears to be for interface compliance (matching the OpenAI Realtime pattern), callers who pass these parameters may expect them to have an effect.

Consider logging a warning when these parameters are provided:

```diff
 async def _simple_response(
     self,
     text: str,
     system_prompt: Optional[str] = None,
     temperature: float = 0.7,
     max_tokens: Optional[int] = None,
     **kwargs: Any,
 ) -> str:
     """
     Internal simple response implementation required by LLM base class.

     Note: Gemini Realtime is event-driven and doesn't return responses directly.
     This implementation sends the text and returns a placeholder.
     """
+    if system_prompt or max_tokens or temperature != 0.7:
+        self.logger.warning(
+            "Parameters system_prompt, temperature, and max_tokens are not supported "
+            "by Gemini Realtime and will be ignored"
+        )
     await self.send_realtime_input(text=text)
     return ""  # Realtime API doesn't return text synchronously
```
148-164: Consider adding parameter warnings and clarifying the generator pattern.

Similar to `_simple_response`, this method silently ignores `system_prompt`, `temperature`, and `max_tokens`. Additionally, the `return` followed by `yield` pattern (lines 163-164) creates an empty generator, which is technically correct but may be confusing.

Consider adding a warning for ignored parameters and a comment explaining the pattern:

```diff
 async def _simple_response_stream(
     self,
     text: str,
     system_prompt: Optional[str] = None,
     temperature: float = 0.7,
     max_tokens: Optional[int] = None,
     **kwargs: Any,
 ):
     """
     Internal simple response stream implementation required by LLM base class.

     Note: Gemini Realtime is event-driven and doesn't stream responses in this manner.
     This implementation sends the text but yields nothing.
     """
+    if system_prompt or max_tokens or temperature != 0.7:
+        self.logger.warning(
+            "Parameters system_prompt, temperature, and max_tokens are not supported "
+            "by Gemini Realtime and will be ignored"
+        )
     await self.send_realtime_input(text=text)
+    # Return immediately to create an empty generator (yield makes this a generator function)
     return
     yield  # Make this a generator
```

plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)
138-154: Unreachable yield statement after return.

The `yield` on line 154 is unreachable because it follows the `return` statement on line 153. While this still makes the function a generator (which is the intent), the unreachable code is a code smell.

Apply this diff to make the intent clearer:

```diff
-        await self.rtc.send_text(text)
-        return
-        yield  # Make this a generator
+        await self.rtc.send_text(text)
+        # Make this a generator that yields nothing
+        if False:
+            yield
```

Or alternatively:

```diff
-        await self.rtc.send_text(text)
-        return
-        yield  # Make this a generator
+        await self.rtc.send_text(text)
+        yield  # Generator that yields nothing
```
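Both variants lean on the same Python rule: any `yield` in the body, reachable or not, turns the function into an (async) generator. A standalone illustration of the `return`-then-`yield` form:

```python
# Self-contained demo of the empty-generator idiom discussed above.
import asyncio


async def empty_stream():
    # `return` finishes the generator immediately; the unreachable `yield`
    # only exists to mark this function as an async generator.
    return
    yield


async def main():
    chunks = [chunk async for chunk in empty_stream()]
    assert chunks == []  # the loop body never runs


asyncio.run(main())
```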
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py, lines 147-163).docker-compose.yml (3)
17-17: Pin Grafana and Prometheus to specific versions for reproducibility.

Using `:latest` tags can lead to non-deterministic behavior across environments and time, potentially breaking observability setups unexpectedly. Pin these images to specific versions for consistent deployments.

```diff
-    image: prom/prometheus:latest
+    image: prom/prometheus:v2.52.0
```

```diff
-    image: grafana/grafana:latest
+    image: grafana/grafana:11.2.0
```

Check Prometheus releases and Grafana releases for appropriate versions to match your requirements.
Also applies to: 35-35
50-51: Add health checks to prevent race conditions during startup.

The `depends_on` declarations ensure service startup order but do not wait for readiness. The `grafana-init` service may attempt to initialize dashboards before Grafana is fully ready, and Grafana may start before Prometheus is accepting connections.

Consider adding health checks to ensure dependent services are truly ready before proceeding:

```diff
 prometheus:
   image: prom/prometheus:v2.52.0
   ...
+  healthcheck:
+    test: ["CMD", "wget", "--spider", "-q", "http://localhost:9090/-/healthy"]
+    interval: 5s
+    timeout: 3s
+    retries: 5

 grafana:
   ...
   depends_on:
-    - prometheus
+    prometheus:
+      condition: service_healthy
+  healthcheck:
+    test: ["CMD", "wget", "--spider", "-q", "http://localhost:3000/api/health"]
+    interval: 5s
+    timeout: 3s
+    retries: 5

 grafana-init:
   ...
   depends_on:
-    - grafana
+    grafana:
+      condition: service_healthy
```

Also applies to: 62-63
56-66: Grafana-init script robustness.

The `grafana-init` service uses a simple curl image to run a shell script. Consider adding error handling and explicit success/failure indicators in the script itself to make debugging easier if initialization fails.

Additionally, ensure the `init-home-dashboard.sh` script includes proper error handling and exits with appropriate status codes.

Please review the observability/grafana/init-home-dashboard.sh script and confirm it:
- Exits with a non-zero status on failure
- Has informative logging for debugging
- Includes reasonable retry logic or timeout handling for API calls
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
- docker-compose.yml (1 hunks)
- plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (1 hunks)
- plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1 hunks)
- tests/test_function_calling.py (10 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/python.mdc)
**/*.py: Do not modify sys.path in Python code
Docstrings must follow the Google style guide
Files:
- plugins/openai/vision_agents/plugins/openai/openai_realtime.py
- tests/test_function_calling.py
- plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py
tests/**/*.py
📄 CodeRabbit inference engine (.cursor/rules/python.mdc)
tests/**/*.py: Never use mocking utilities (e.g., unittest.mock, pytest-mock) in test files
Write tests using pytest (avoid unittest.TestCase or other frameworks)
Mark integration tests with @pytest.mark.integration
Do not use @pytest.mark.asyncio; async support is automatic
Files:
tests/test_function_calling.py
🧬 Code graph analysis (3)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (3)
- plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2): _simple_response (131-146), _simple_response_stream (148-164)
- plugins/openai/vision_agents/plugins/openai/openai_llm.py (1): _simple_response (81-109)
- plugins/openai/vision_agents/plugins/openai/rtc_manager.py (1): send_text (419-440)

tests/test_function_calling.py (3)
- agents-core/vision_agents/core/llm/llm.py (1): LLM (46-426)
- plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2): _simple_response (131-146), _simple_response_stream (148-164)
- plugins/openai/vision_agents/plugins/openai/openai_realtime.py (2): _simple_response (121-136), _simple_response_stream (138-154)

plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2)
- plugins/openai/vision_agents/plugins/openai/openai_realtime.py (2): _simple_response (121-136), _simple_response_stream (138-154)
- plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py (1): _simple_response (69-86)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: unit / Test "not integration"
🔇 Additional comments (4)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)
121-136: Implementation correctly aligns with base LLM API.

The method properly delegates to `rtc.send_text` and returns a placeholder empty string, consistent with the event-driven nature of the Realtime API. The unused parameters (`system_prompt`, `temperature`, `max_tokens`) are accepted for interface compatibility.

tests/test_function_calling.py (2)
16-41: Well-structured test LLM implementation.

The TestLLM class provides appropriate mock implementations of the required abstract methods. The `_simple_response` returns a mock string, and `_simple_response_stream` properly yields (not unreachable like in the production code).
162-162: Correctly replaces abstract class instantiation with test implementation.

Replacing `LLM()` with `TestLLM()` throughout the tests properly addresses the issue of instantiating an abstract base class. The test implementation provides the required concrete methods.

Also applies to: 176-176, 392-392, 416-416, 435-435, 481-481, 511-511, 544-544
docker-compose.yml (1)
22-22: All referenced configuration files are present and correctly mounted.

Verification confirms that all observability configuration files and scripts referenced in the docker-compose.yml file exist in the repository:

- observability/prometheus/prometheus.yml ✓
- observability/grafana/init-home-dashboard.sh ✓
- observability/grafana/provisioning/ ✓
- observability/grafana/dashboards/ ✓

No issues found.
…m.py Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Actionable comments posted: 2
🧹 Nitpick comments (1)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (1)
272-278: Replace magic numbers with defined constants

The calculation uses literals `512` and `16000`, but `CHUNK` and `RATE` constants are defined at lines 41-42 and used elsewhere in the file (e.g., lines 238, 242, 243).

Apply this diff:

```diff
 trailing_silence_ms = (
     self._silence.trailing_silence_chunks
-    * 512
-    / 16000
+    * CHUNK
+    / RATE
     * 1000
     * 5  # DTX correction
 )
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (5)
- plugins/anthropic/vision_agents/plugins/anthropic/anthropic_llm.py (15 hunks)
- plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2 hunks)
- plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1 hunks)
- plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (10 hunks)
- tests/test_function_calling.py (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- plugins/anthropic/vision_agents/plugins/anthropic/anthropic_llm.py
- plugins/openai/vision_agents/plugins/openai/openai_realtime.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/python.mdc)
**/*.py: Do not modify sys.path in Python code
Docstrings must follow the Google style guide
Files:
- plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py
- plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py
- tests/test_function_calling.py
tests/**/*.py
📄 CodeRabbit inference engine (.cursor/rules/python.mdc)
tests/**/*.py: Never use mocking utilities (e.g., unittest.mock, pytest-mock) in test files
Write tests using pytest (avoid unittest.TestCase or other frameworks)
Mark integration tests with @pytest.mark.integration
Do not use @pytest.mark.asyncio; async support is automatic
Files:
tests/test_function_calling.py
🧬 Code graph analysis (3)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (4)
- plugins/openai/vision_agents/plugins/openai/openai_realtime.py (2): _simple_response (121-137), simple_response (100-119)
- plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py (1): _simple_response (69-86)
- agents-core/vision_agents/core/llm/llm.py (1): LLMResponseEvent (35-39)
- agents-core/vision_agents/core/agents/agents.py (1): simple_response (266-279)

plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (3)
- agents-core/vision_agents/core/observability/metrics.py (1): Timer (62-199)
- agents-core/vision_agents/core/turn_detection/turn_detection.py (1): detect_turn (72-84)
- plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (4): detect_turn (182-193), _process_audio_packet (216-342), predict_speech (518-546), _predict_turn_completed (405-456)

tests/test_function_calling.py (4)
- agents-core/vision_agents/core/llm/llm.py (6): LLM (46-426), LLMResponseEvent (35-39), register_function (218-231), get_available_functions (233-235), _dedup_and_execute (379-413), _run_one_tool (280-352)
- agents-core/vision_agents/core/llm/function_registry.py (3): FunctionRegistry (35-237), get_tool_schemas (108-114), get_callable (144-160)
- agents-core/vision_agents/core/llm/events.py (2): ToolStartEvent (119-125), ToolEndEvent (129-138)
- agents-core/vision_agents/core/events/manager.py (1): wait (470-484)
🔇 Additional comments (7)
tests/test_function_calling.py (1)
16-30: LGTM: Clean test double implementation.

The TestLLM class provides a concrete implementation of the now-abstract `LLM` base class, which is necessary after `_simple_response` became abstract. This is a legitimate test double that doesn't rely on mocking frameworks.

plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2)
131-147: LGTM! Internal method aligns with Realtime API pattern.

The `_simple_response` implementation correctly follows the pattern established in openai_realtime.py. Returning an empty `LLMResponseEvent` is appropriate for event-driven APIs where actual responses arrive through the event system rather than return values.

The inline import of `LLMResponseEvent` mirrors the OpenAI implementation and likely avoids circular dependencies.

362-364: LGTM! Formatting improvement.

The multi-line formatting improves readability without changing behavior.
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (4)
16-20: LGTM: Observability imports

The metrics imports are clean and consistent with the broader instrumentation patterns across the codebase.

177-208: Excellent lifecycle management with proper cleanup

The nested try/finally structure elegantly resolves the shutdown race condition from the previous review. The outer finally block (lines 205-208) ensures both `_processing_active` and `_shutdown_event` are cleared even when cancellation occurs mid-processing, preventing hung waits and enabling clean restart cycles.

252-256: LGTM: VAD latency instrumentation

The Timer wrapping captures essential context (samples, implementation) and follows the established observability pattern seen in vogent.

294-303: LGTM: Comprehensive turn-end detection metrics

The Timer captures rich context (audio duration, samples, trailing silence, prediction probability, and turn-end decision), enabling effective observability for turn detection behavior.
| """ | ||
|
|
||
| import pytest | ||
| from unittest.mock import Mock, patch |
Remove unittest.mock usage per coding guidelines.
This file violates the coding guideline: "Never use mocking utilities (e.g., unittest.mock, pytest-mock) in test files" for tests/**/*.py. The Mock and patch imports are used extensively in TestOpenAIFunctionCalling, TestClaudeFunctionCalling, and TestGeminiFunctionCalling classes.
Consider rewriting these tests without mocking frameworks—use real instances with test configurations, in-memory fakes, or integration test markers if external dependencies are required.
As per coding guidelines
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (1)
182-193: Update call sites in vogent plugin to use `detect_turn` method.

The refactoring is incomplete. Tests and examples within the vogent plugin still call the old `process_audio` method, which will cause failures:

- plugins/vogent/tests/test_vogent_td.py lines 45, 46
- plugins/vogent/example/basic_usage.py lines 90, 166
🧹 Nitpick comments (4)
plugins/aws/vision_agents/plugins/aws/aws_llm.py (1)
164-168: Consider extracting duplicated constants.

The `max_concurrency=8` and `timeout_s=30` parameters appear identically in both tool execution loops (lines 164-168 and 408-412).

Apply this diff to extract the constants:

```diff
 def __init__(
     self,
     model: str,
     region_name: str = "us-east-1",
     aws_access_key_id: Optional[str] = None,
     aws_secret_access_key: Optional[str] = None,
     aws_session_token: Optional[str] = None,
 ):
     """
     Initialize the BedrockLLM class.

     Args:
         model: The Bedrock model ID (e.g., "anthropic.claude-3-5-sonnet-20241022-v2:0")
         region_name: AWS region name (default: "us-east-1")
         aws_access_key_id: Optional AWS access key ID
         aws_secret_access_key: Optional AWS secret access key
         aws_session_token: Optional AWS session token
     """
     super().__init__()
     self.events.register_events_from_module(events)
     self.model = model
     self._pending_tool_uses_by_index: Dict[int, Dict[str, Any]] = {}
+    self._tool_max_concurrency = 8
+    self._tool_timeout_s = 30
```

Then reference these constants in both locations:

```diff
 while current_calls and rounds < MAX_ROUNDS:
     # Execute calls concurrently with dedup
     triples, seen = await self._dedup_and_execute(
         cast(List[Dict[str, Any]], current_calls),
         seen=seen,
-        max_concurrency=8,
-        timeout_s=30,
+        max_concurrency=self._tool_max_concurrency,
+        timeout_s=self._tool_timeout_s,
     )
```

And similarly at lines 408-412.
Also applies to: 408-412
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (3)
149-153: Simplify model initialization to match smart_turn pattern.

The lambda wrappers around model constructors add unnecessary complexity. The `asyncio.to_thread` function accepts any callable, so you can pass constructors directly as shown in the smart_turn plugin.

Apply this diff to simplify `_prepare_silero_vad`:

```diff
-        self.vad = await asyncio.to_thread(  # type: ignore[func-returns-value]
-            lambda: SileroVAD(  # type: ignore[arg-type]
-                path, reset_interval_seconds=self.vad_reset_interval_seconds
-            )
+        self.vad = await asyncio.to_thread(
+            SileroVAD, path, reset_interval_seconds=self.vad_reset_interval_seconds
         )
```

Apply similar changes to `_prepare_whisper`:

```diff
-        self.whisper = await asyncio.to_thread(  # type: ignore[func-returns-value]
-            lambda: WhisperModel(
-                self.whisper_model_size, device="cpu", compute_type="int8"
-            )
+        self.whisper = await asyncio.to_thread(
+            WhisperModel, self.whisper_model_size, device="cpu", compute_type="int8"
         )
```

And to `_prepare_vogent`:

```diff
-        self.vogent = await asyncio.to_thread(  # type: ignore[func-returns-value]
-            lambda: VogentDetector(
-                compile_model=True,
-                warmup=True,
-                device=None,
-                model_name="vogent/Vogent-Turn-80M",
-                revision="main",
-            )
+        self.vogent = await asyncio.to_thread(
+            VogentDetector,
+            compile_model=True,
+            warmup=True,
+            device=None,
+            model_name="vogent/Vogent-Turn-80M",
+            revision="main",
         )
```

Note: The `type: ignore` comments may become unnecessary after this refactor. If they remain, investigate the underlying type issues rather than suppressing them.

Also applies to: 159-162, 171-179
376-403: Consider checking self.whisper before starting the timer.

The Timer currently records latency even when `self.whisper is None` and an empty string is returned immediately. This could skew your metrics if the early-exit path is taken frequently.

Apply this diff to avoid recording no-op latency:

```diff
 async def _transcribe_segment(self, pcm: PcmData) -> str:
     """
     Transcribe audio segment using faster-whisper.

     Args:
         pcm: PcmData containing audio samples

     Returns:
         Transcribed text
     """
+    if self.whisper is None:
+        return ""
+
     with Timer(vogent_whisper_latency_ms) as timer:
         # Ensure it's 16khz and f32 format
         pcm = pcm.resample(16000).to_float32()
         audio_array = pcm.samples

         timer.attributes["audio_duration_ms"] = pcm.duration_ms
         timer.attributes["samples"] = len(audio_array)

-        if self.whisper is None:
-            return ""
-
         # Run transcription in thread pool to avoid blocking
         segments, info = await asyncio.to_thread(
```
422-456: Consider checking self.vogent before starting the timer.

Similar to the transcription method, the Timer records latency even when `self.vogent is None` and `False` is returned immediately. Moving the guard clause outside the Timer context would provide more accurate metrics.

Apply this diff:

```diff
 async def _predict_turn_completed(
     self,
     pcm: PcmData,
     prev_line: str,
     curr_line: str,
 ) -> bool:
     """
     Predict whether the current turn is complete using Vogent.

     Args:
         pcm: PcmData containing audio samples
         prev_line: Previous speaker's text (for context)
         curr_line: Current speaker's text

     Returns:
         True if turn is complete, False otherwise
     """
+    if self.vogent is None:
+        return False
+
     with Timer(turn_end_detection_latency_ms) as timer:
         # Ensure it's 16khz and f32 format
         pcm = pcm.resample(16000).to_float32()

         # Truncate to 8 seconds
         audio_array = pcm.tail(8, False).samples

         timer.attributes["implementation"] = "vogent"
         timer.attributes["audio_duration_ms"] = len(audio_array) / 16000 * 1000
         timer.attributes["prev_line_length"] = len(prev_line)
         timer.attributes["curr_line_length"] = len(curr_line)

-        if self.vogent is None:
-            return False
-
         # Run vogent prediction in thread pool
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
- plugins/aws/vision_agents/plugins/aws/aws_llm.py (3 hunks)
- plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (10 hunks)
- plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (15 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/python.mdc)
**/*.py: Do not modify sys.path in Python code
Docstrings must follow the Google style guide
Files:
- plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py
- plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py
- plugins/aws/vision_agents/plugins/aws/aws_llm.py
🧬 Code graph analysis (3)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (2)
- agents-core/vision_agents/core/observability/metrics.py (1): Timer (62-199)
- plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (4): detect_turn (182-193), _process_audio_packet (216-342), predict_speech (518-546), _predict_turn_completed (405-456)

plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (3)
- agents-core/vision_agents/core/observability/metrics.py (1): Timer (62-199)
- agents-core/vision_agents/core/utils/utils.py (1): ensure_model (115-157)
- plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (3): _prepare_silero_vad (151-157), SileroVAD (429-496), detect_turn (159-171)

plugins/aws/vision_agents/plugins/aws/aws_llm.py (2)
- plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py (1): _simple_response (69-86)
- agents-core/vision_agents/core/llm/llm.py (2): _simple_response (88-98), LLMResponseEvent (35-39)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: unit / Ruff & mypy
- GitHub Check: unit / Test "not integration"
- GitHub Check: unit / Ruff & mypy
- GitHub Check: unit / Test "not integration"
🔇 Additional comments (6)
plugins/aws/vision_agents/plugins/aws/aws_llm.py (1)
82-87: LGTM - Method signature aligns with the refactoring pattern.

The change from public `simple_response` to private `_simple_response` with explicit return type annotation is consistent with the broader refactoring across LLM implementations. The signature correctly matches the base class abstract method.

plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (3)
19-24: LGTM! Clean metrics imports and definition.

The observability imports are well-organized, and the new `vogent_whisper_latency_ms` histogram follows proper naming conventions with clear metadata.

Also applies to: 31-36
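A definition along those lines might look like this sketch of the OTel meter API (the meter name and description text are assumptions):

```python
from opentelemetry import metrics

meter = metrics.get_meter("vision_agents")  # assumed meter name

# Histogram in milliseconds, recorded by the Timer around Whisper transcription.
vogent_whisper_latency_ms = meter.create_histogram(
    name="vogent_whisper_latency_ms",
    unit="ms",
    description="Latency of Whisper transcription in the Vogent turn detector",
)
```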
259-264: LGTM! VAD instrumentation follows best practices.

The Timer usage correctly wraps the VAD prediction, and the attributes provide useful context for performance analysis (sample count and implementation type).

476-476: LGTM! Punctuation stripping improves model input.

Using `rstrip(".!?")` to remove terminal punctuation before passing text to Vogent is a sensible preprocessing step that should improve turn detection accuracy.

plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (2)
178-210: Lifecycle cleanup looks solid.

Resetting `_processing_active` and `_shutdown_event` in the shutdown path prevents the cancellation hang we saw earlier and keeps the detector restartable.

253-305: Telemetry instrumentation is on point.

Wrapping the VAD and turn-completion calls in `Timer` gives us per-call latency with helpful context attributes, which will drop straight into the new dashboards.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)
108-127: Overriding the public method bypasses all metrics tracking.

This override completely defeats the observability instrumentation introduced in this PR. The base class `simple_response` (lines 72-94 in llm.py) wraps `_simple_response` with histogram tracking for `llm_latency_ms` and error attribution. By overriding the public method here, external calls bypass that wrapper entirely: no latency is recorded, no provider attribute is set, and exceptions are not tagged.

Additionally, the method lacks a return type annotation and implicitly returns `None`, violating the base class signature `-> LLMResponseEvent[Any]`.

Solution: Remove this override entirely and consolidate the implementation into `_simple_response` below. Let the base class handle the public entry point so metrics are captured for every call.

Apply this diff:

```diff
-    async def simple_response(
-        self,
-        text: str,
-        processors: Optional[List[Processor]] = None,
-        participant: Optional[Participant] = None,
-    ):
-        """Send a simple text input to the OpenAI Realtime session.
-
-        This is a convenience wrapper that forwards a text prompt upstream via
-        the underlying realtime connection. It does not stream partial deltas
-        back; callers should subscribe to the provider's events to receive
-        responses.
-
-        Args:
-            text: Text prompt to send.
-            processors: Optional processors list (not used here; included for
-                interface parity with the core `LLM` API).
-            participant: Optional participant metadata (ignored here).
-        """
-        await self.rtc.send_text(text)
-
     async def _simple_response(
         self,
         text: str,
         processors: Optional[List[Processor]] = None,
         participant: Optional[Participant] = None,
     ):
-        """
-        Internal simple response implementation required by LLM base class.
-
-        Note: OpenAI Realtime is event-driven and doesn't return responses directly.
-        This implementation sends the text via the public simple_response method.
-        """
+        """Send a simple text input to the OpenAI Realtime session.
+
+        Internal implementation required by LLM base class. OpenAI Realtime is
+        event-driven and doesn't return responses directly; callers should
+        subscribe to the provider's events to receive responses.
+
+        Args:
+            text: Text prompt to send.
+            processors: Optional processors list (not used here; included for
+                interface parity with the core `LLM` API).
+            participant: Optional participant metadata (ignored here).
+
+        Returns:
+            Empty LLMResponseEvent since Realtime API is event-driven.
+        """
         from vision_agents.core.llm.llm import LLMResponseEvent

-        await self.simple_response(text, processors, participant)
-        # Return empty LLMResponseEvent since Realtime API is event-driven
+        await self.rtc.send_text(text)
         return LLMResponseEvent(original=None, text="")
```

plugins/fish/vision_agents/plugins/fish/stt.py (1)
134-140: Restore `_emit_error_event` before re-raising.

The new `STT.process_audio` wrapper no longer emits errors on behalf of plugins. As a result, this `raise` now bypasses both the `stt_errors` counter and `STTErrorEvent` emission, so Fish STT errors stop surfacing in metrics/streams. Call `_emit_error_event` (with the participant context) before re-raising.

Apply this diff:

```diff
 except Exception as e:
     logger.error(
         "Error during Fish Audio transcription",
         exc_info=e,
     )
-    # Let the base class handle error emission
-    raise
+    self._emit_error_event(e, "Fish Audio transcription", participant)
+    raise
```
♻️ Duplicate comments (2)
plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py (1)
268-270: Fix persons_detected metric to count actual detections.

This still records `len(pose_results)`, i.e., batch entries (typically 1 per frame) rather than the number of detected people. The metric remains wrong and never reflects multiple detections or zero detections. Please sum the detections across the results before setting the metric.

Apply this diff:

```diff
-            timer.attributes["persons_detected"] = (
-                len(pose_results) if pose_results else 0
-            )
+            persons_detected = sum(
+                len(result.keypoints.data)
+                for result in pose_results
+                if getattr(result, "keypoints", None) is not None
+            )
+            timer.attributes["persons_detected"] = persons_detected
```

agents-core/vision_agents/core/llm/llm.py (1)
82-95: Increment the error counter when `_simple_response` fails.

The exception handler sets the timer's error attribute but never increments the `llm_errors` counter that the Grafana dashboard expects. This issue was flagged in a previous review and remains unresolved.

The import needs to include `llm_errors`, and the except block must call the counter's `add` method before re-raising:

```diff
-    from vision_agents.core.observability.metrics import Timer, llm_latency_ms
+    from vision_agents.core.observability.metrics import (
+        Timer,
+        llm_latency_ms,
+        llm_errors,
+    )
```

```diff
             except Exception as e:
                 timer.attributes["error"] = type(e).__name__
+                llm_errors.add(
+                    1,
+                    {
+                        "provider": timer.attributes["provider"],
+                        "error": type(e).__name__,
+                        "llm_class": timer.attributes["llm_class"],
+                    },
+                )
                 raise
```
🧹 Nitpick comments (2)
agents-core/vision_agents/core/turn_detection/turn_detection.py (1)
82-94: Consider clarifying the docstring to distinguish from the wrapper.

The docstring is identical to `process_audio` above, but this is the hook that implementations override. Adding a note like "Override this method in subclasses to implement turn detection logic" would help developers understand the wrapper pattern.

Apply this diff to clarify the distinction:

```diff
 @abstractmethod
 async def detect_turn(
     self,
     audio_data: PcmData,
     participant: Participant,
     conversation: Optional[Conversation],
 ) -> None:
-    """Process the audio and trigger turn start or turn end events
+    """Implement turn detection logic for the given audio data.
+
+    Override this method in subclasses to implement turn detection.
+    The audio processing and metrics are automatically handled by process_audio.

     Args:
         audio_data: PcmData object containing audio samples from Stream
         participant: Participant that's speaking, includes user data
         conversation: Transcription/ chat history, sometimes useful for turn detection
     """
```

plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (1)
273-279: Consider grouping operations for readability.

The multi-line arithmetic without visual grouping makes the calculation harder to parse. The DTX correction multiplier is significant and the formatting obscures the operation precedence.

Apply this diff to improve clarity:

```diff
-        trailing_silence_ms = (
-            self._silence.trailing_silence_chunks
-            * 512
-            / 16000
-            * 1000
-            * 5  # DTX correction
-        )
+        trailing_silence_ms = (
+            (self._silence.trailing_silence_chunks * 512 / 16000 * 1000)
+            * 5  # DTX correction factor
+        )
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (13)
- agents-core/vision_agents/core/llm/llm.py (1 hunks)
- agents-core/vision_agents/core/stt/stt.py (6 hunks)
- agents-core/vision_agents/core/tts/tts.py (3 hunks)
- agents-core/vision_agents/core/turn_detection/turn_detection.py (4 hunks)
- examples/01_simple_agent_example/simple_agent_example.py (2 hunks)
- plugins/aws/vision_agents/plugins/aws/aws_llm.py (1 hunks)
- plugins/fish/vision_agents/plugins/fish/stt.py (2 hunks)
- plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py (3 hunks)
- plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (1 hunks)
- plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1 hunks)
- plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (10 hunks)
- plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py (3 hunks)
- plugins/wizper/vision_agents/plugins/wizper/stt.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- agents-core/vision_agents/core/tts/tts.py
- plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py
- examples/01_simple_agent_example/simple_agent_example.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/python.mdc)
**/*.py: Do not modify sys.path in Python code
Docstrings must follow the Google style guide
Files:
- agents-core/vision_agents/core/stt/stt.py
- plugins/aws/vision_agents/plugins/aws/aws_llm.py
- plugins/openai/vision_agents/plugins/openai/openai_realtime.py
- agents-core/vision_agents/core/turn_detection/turn_detection.py
- plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py
- plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py
- plugins/wizper/vision_agents/plugins/wizper/stt.py
- agents-core/vision_agents/core/llm/llm.py
- plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py
- plugins/fish/vision_agents/plugins/fish/stt.py
🧬 Code graph analysis (10)
agents-core/vision_agents/core/stt/stt.py (7)
- agents-core/vision_agents/core/observability/metrics.py (1): `Timer` (62-199)
- agents-core/vision_agents/core/events/manager.py (1): `send` (428-472)
- agents-core/vision_agents/core/tts/tts.py (1): `send` (202-302)
- agents-core/vision_agents/core/stt/events.py (3): `STTTranscriptEvent` (16-47), `STTPartialTranscriptEvent` (51-78), `STTErrorEvent` (82-94)
- plugins/fish/vision_agents/plugins/fish/stt.py (1): `_process_audio` (53-140)
- plugins/wizper/vision_agents/plugins/wizper/stt.py (1): `_process_audio` (62-139)
- plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py (1): `_process_audio` (78-114)

plugins/aws/vision_agents/plugins/aws/aws_llm.py (1)
- agents-core/vision_agents/core/llm/llm.py (2): `_simple_response` (98-108), `LLMResponseEvent` (38-42)

plugins/openai/vision_agents/plugins/openai/openai_realtime.py (5)
- plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2): `_simple_response` (136-152), `simple_response` (119-134)
- agents-core/vision_agents/core/llm/llm.py (3): `_simple_response` (98-108), `LLMResponseEvent` (38-42), `simple_response` (73-95)
- agents-core/vision_agents/core/edge/sfu_events.py (11): `participant` (1496-1501, 1504-1507, 1545-1550, 1553-1556, 1625-1630, 1633-1636, 2100-2105, 2108-2111, 2156-2161, 2164-2167), `Participant` (229-270)
- agents-core/vision_agents/core/edge/types.py (1): `Participant` (22-24)
- agents-core/vision_agents/core/agents/agents.py (1): `simple_response` (292-305)

agents-core/vision_agents/core/turn_detection/turn_detection.py (6)
- agents-core/vision_agents/core/observability/metrics.py (1): `Timer` (62-199)
- agents-core/vision_agents/core/turn_detection/events.py (2): `TurnStartedEvent` (11-25), `TurnEndedEvent` (29-45)
- agents-core/vision_agents/core/events/manager.py (1): `send` (428-472)
- plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (1): `detect_turn` (159-171)
- plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (1): `detect_turn` (182-193)
- plugins/krisp/vision_agents/plugins/krisp/turn_detection.py (1): `detect_turn` (78-119)

plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py (2)
- plugins/aws/vision_agents/plugins/aws/aws_llm.py (1): `_simple_response` (93-113)
- agents-core/vision_agents/core/llm/llm.py (2): `_simple_response` (98-108), `LLMResponseEvent` (38-42)

plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py (1)
- agents-core/vision_agents/core/observability/metrics.py (1): `Timer` (62-199)

plugins/wizper/vision_agents/plugins/wizper/stt.py (2)
- agents-core/vision_agents/core/stt/stt.py (1): `_process_audio` (161-176)
- agents-core/vision_agents/core/edge/types.py (1): `Participant` (22-24)

agents-core/vision_agents/core/llm/llm.py (3)
- agents-core/vision_agents/core/observability/metrics.py (1): `Timer` (62-199)
- agents-core/vision_agents/core/processors/base_processor.py (1): `Processor` (35-43)
- agents-core/vision_agents/core/edge/types.py (1): `Participant` (22-24)

plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (3)
- agents-core/vision_agents/core/observability/metrics.py (1): `Timer` (62-199)
- agents-core/vision_agents/core/turn_detection/turn_detection.py (1): `detect_turn` (82-94)
- plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (5): `detect_turn` (182-193), `_process_audio_packet` (216-342), `predict_speech` (518-546), `_predict_turn_completed` (405-456), `SileroVAD` (483-546)

plugins/fish/vision_agents/plugins/fish/stt.py (3)
- agents-core/vision_agents/core/stt/stt.py (1): `_process_audio` (161-176)
- plugins/wizper/vision_agents/plugins/wizper/stt.py (1): `_process_audio` (62-139)
- plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py (1): `_process_audio` (78-114)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: unit / Ruff & mypy
- GitHub Check: unit / Test "not integration"
- GitHub Check: unit / Test "not integration"
- GitHub Check: unit / Ruff & mypy
🔇 Additional comments (13)
plugins/aws/vision_agents/plugins/aws/aws_llm.py (1)
93-113: LGTM! The refactoring to `_simple_response` aligns with the framework's new metrics-wrapped pattern. The delegation to `converse_stream` is appropriate, and the return type annotation improves type safety.

plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py (3)
62-62: LGTM! The `provider_name` attribute enables proper metrics tagging in the base class wrapper (line 88 of `llm.py`). The value correctly identifies the Gemini provider.
69-86: LGTM! The `_simple_response` implementation correctly matches the base class signature and appropriately delegates to the provider-specific `send_message` method.
88-88: LGTM! The method signature correction to `async def` aligns with its actual async implementation (it uses `await` throughout). This improves type correctness without breaking existing callers.

agents-core/vision_agents/core/llm/llm.py (1)
97-108: LGTM! The abstract `_simple_response` method establishes a clear contract for provider implementations. The signature is consistent across all providers reviewed (AWS, Gemini), and the decorator properly enforces implementation in subclasses.
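To make the contract concrete, here is a minimal sketch of what a conforming provider might look like. The import paths follow the code graph above; `MyProviderLLM` and the `_client.generate` call are hypothetical stand-ins for a real SDK:

```python
from typing import List, Optional

from vision_agents.core.edge.types import Participant
from vision_agents.core.llm.llm import LLM, LLMResponseEvent
from vision_agents.core.processors.base_processor import Processor


class MyProviderLLM(LLM):
    provider_name = "my_provider"  # tagged onto latency/error metrics by the base class

    async def _simple_response(
        self,
        text: str,
        processors: Optional[List[Processor]] = None,
        participant: Optional[Participant] = None,
    ) -> LLMResponseEvent:
        # The public simple_response() in the base class wraps this call with a
        # Timer, so implementations only need to do provider I/O here.
        raw = await self._client.generate(text)  # hypothetical SDK call
        return LLMResponseEvent(original=raw, text=raw.text)
```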
agents-core/vision_agents/core/turn_detection/turn_detection.py (2)

11-11: LGTM! Observability imports properly added. The import of `Timer` and the metrics objects supports the instrumentation pattern introduced in this PR.
67-79: Well-implemented instrumentation wrapper with error tracking. The `Timer` context manager correctly records latency with class and provider attributes, and the exception handling properly increments error metrics before re-raising. This pattern effectively separates observability concerns from implementation logic.
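For reference, the pattern under review condenses to roughly the following. This is a sketch: `Timer`, `turn_detection_latency_ms`, and `turn_detection_errors` are the names this PR's observability module appears to use, and the exact signatures may differ:

```python
from vision_agents.core.observability.metrics import (  # names assumed from this PR
    Timer,
    turn_detection_errors,
    turn_detection_latency_ms,
)


async def process_audio(self, audio_data, participant, conversation=None):
    with Timer(turn_detection_latency_ms) as timer:
        timer.attributes["class"] = type(self).__name__
        timer.attributes["provider"] = self.provider_name
        try:
            # Delegate to the subclass hook; latency is recorded when the Timer exits.
            await self.detect_turn(audio_data, participant, conversation)
        except Exception as exc:
            turn_detection_errors.add(
                1, {"provider": self.provider_name, "error_type": type(exc).__name__}
            )
            raise
```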
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (6)
15-19: LGTM! Observability metrics imported. The imports support the Timer-based instrumentation added to the VAD and turn completion prediction paths.
159-171: Correctly renamed to conform to the base class pattern. The method now implements the abstract `detect_turn` hook introduced in the base `TurnDetector` class, with `process_audio` serving as the instrumentation wrapper.
178-209: Excellent lifecycle management with proper cleanup guarantees. The structured try/except/finally ensures `_processing_active` is cleared when the queue empties, and both flags are reset on shutdown, even during cancellation. This directly addresses the past review concern about lifecycle transitions hanging.
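For readers without the source open, the lifecycle being described reduces to roughly this shape (a sketch; the queue, flag, and handler names are taken from the review text or invented, not the actual source):

```python
import asyncio


async def _process_queue(self) -> None:
    try:
        while not self._shutdown:
            try:
                chunk = await asyncio.wait_for(self._queue.get(), timeout=0.1)
            except asyncio.TimeoutError:
                self._processing_active = False  # queue drained, clear the flag
                continue
            self._processing_active = True
            await self._handle_chunk(chunk)  # hypothetical per-chunk worker
    finally:
        # Runs even on cancellation, so both flags are always reset on shutdown.
        self._processing_active = False
        self._shutdown = True
```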
253-257: Timer instrumentation captures VAD metrics effectively. Recording the sample count and implementation type provides useful observability into the VAD prediction path.
295-304: Comprehensive turn completion instrumentation. The Timer captures duration, sample count, trailing silence, prediction value, and the derived `turn_ended` boolean, which is excellent coverage for debugging and performance analysis.
432-450: Flexible model loading with optional pre-loaded bytes. The `model_bytes` parameter enables future optimizations (e.g., loading models once and sharing across instances) while maintaining backward compatibility through the default `None` fallback to file-based loading.
| timer.attributes["sample_rate"] = pcm_data.sample_rate | ||
| timer.attributes["channels"] = pcm_data.channels | ||
| timer.attributes["samples"] = ( | ||
| len(pcm_data.samples) if pcm_data.samples is not None else 0 | ||
| ) | ||
| timer.attributes["duration_ms"] = pcm_data.duration_ms | ||
|
|
🧩 Analysis chain
Confirm `PcmData` exposes `duration_ms`.

`getstream.video.rtc.track_util.PcmData` has historically exposed `.duration`, not `.duration_ms`. If `duration_ms` is missing, this attribute access will raise an `AttributeError` and the metrics wrapper will crash before the plugin runs. Please verify the attribute exists (or derive the duration in milliseconds defensively).
🏁 Script executed:

```shell
#!/bin/bash
# Inspect PcmData for a duration_ms attribute.
rg -n "duration_ms" -g 'track_util.py' -C3
```

Length of output: 148

I need to search more broadly for the PcmData class definition. Let me try a different approach:

---

🏁 Script executed:

```shell
#!/bin/bash
# Find track_util.py files in the repository
fd -t f "track_util.py"
```

Length of output: 0

🏁 Script executed:

```shell
#!/bin/bash
# Search for PcmData class definition
rg -n "class PcmData" -A 20 -B 2
```

Length of output: 0

🏁 Script executed:

```shell
#!/bin/bash
# Search for duration_ms usage in the stt.py file
rg -n "duration_ms" agents-core/vision_agents/core/stt/stt.py -C 5
```

Length of output: 552

🏁 Script executed:

```shell
#!/bin/bash
# Search for PcmData attribute access patterns
rg -n "pcm_data\." agents-core/vision_agents/core/stt/stt.py -C 2
```

Length of output: 463

🏁 Script executed:

```shell
#!/bin/bash
# Look for how duration is referenced elsewhere with pcm_data
rg -n "pcm_data.*duration" -C 3
```

Length of output: 1655

🏁 Script executed:

```shell
#!/bin/bash
# Search for PcmData class or dataclass definition anywhere
rg -n "PcmData" -B 2 -A 10 --type py | head -60
```

Length of output: 3286

🏁 Script executed:

```shell
#!/bin/bash
# Check imports in the stt.py file to understand where PcmData comes from
rg -n "import|from" agents-core/vision_agents/core/stt/stt.py | head -30
```

Length of output: 451

🏁 Script executed:

```shell
#!/bin/bash
# Search for @property definitions related to duration
rg -n "@property" -A 3 | grep -A 3 "duration"
```

Length of output: 804

🏁 Script executed:

```shell
#!/bin/bash
# Look for duration conversion or duration_ms definition
rg -n "duration_ms" -B 5 -A 2 --type py
```

Length of output: 41672

🏁 Script executed:

```shell
#!/bin/bash
# Search for any @property or method definitions that might create duration_ms
rg -n "def duration" --type py -B 2 -A 5
```

Length of output: 0

🏁 Script executed:

```shell
#!/bin/bash
# Search for any wrapper classes or extensions of PcmData in the codebase
rg -n "class.*PcmData|extends.*PcmData" --type py -B 2 -A 10
```

Length of output: 770

🏁 Script executed:

```shell
#!/bin/bash
# Check if there's a getstream library reference or version specification
fd -t f "requirements*.txt" -o "pyproject.toml" -o "setup.py" -o "poetry.lock" | head -10 | xargs grep -l "getstream" 2>/dev/null
```

Length of output: 147

🏁 Script executed:

```shell
#!/bin/bash
# Look at imports and see if PcmData is used elsewhere or extended
rg -n "from getstream" --type py
```

Length of output: 10670

🏁 Script executed:

```shell
#!/bin/bash
# Find dependency files to check getstream version
find . -maxdepth 3 -type f \( -name "pyproject.toml" -o -name "requirements*.txt" -o -name "setup.py" \) | head -20
```

Length of output: 701

🏁 Script executed:

```shell
#!/bin/bash
# Search for any dynamic property assignment or getattr usage with duration
rg -n "duration_ms|setattr|__getattr__" --type py -B 3 -A 3 | grep -A 5 -B 5 "PcmData"
```

Length of output: 3250

🏁 Script executed:

```shell
#!/bin/bash
# Look at the actual test to understand what PcmData.from_bytes returns
rg -n "from_bytes" agents-core/ --type py -A 10
```

Length of output: 966

🏁 Script executed:

```shell
#!/bin/bash
# Check the getstream dependency version in agents-core
cat ./agents-core/pyproject.toml | grep -A 20 "dependencies"
```

Length of output: 1241

🏁 Script executed:

```shell
#!/bin/bash
# Also check if there's a poetry.lock or requirements file that pins the version
cat ./agents-core/pyproject.toml | grep -i "getstream"
```

Length of output: 332
🌐 Web query:
getstream PcmData duration_ms attribute python
💡 Result:
GetStream's PcmData (Python) exposes duration in seconds as .duration — there isn't a documented .duration_ms attribute. Compute milliseconds either from .duration or from samples/sample_rate:
- Using `.duration`: `duration_ms = pcm.duration * 1000`. (getstream.io)
- From samples: `duration_ms = (len(pcm.samples) / pcm.sample_rate) * 1000` (adjust for channels/bytes-per-sample if needed). (getstream.io)
If you want, tell me which PcmData fields you see and I’ll give the exact one-line code.
Citations:
- 1: https://getstream.io/video/docs/python-ai/guides/add-mcp-client-function-calling/?utm_source=openai
- 2: https://getstream.io/video/docs/python-ai/tutorials/build-an-ai-conversation-pipeline/?utm_source=openai
`pcm_data.duration_ms` does not exist on GetStream's `PcmData` class; use `pcm_data.duration * 1000` instead.

GetStream's `PcmData` exposes duration in seconds as `.duration`, not `.duration_ms`. Line 152 will crash with an `AttributeError` when the metrics wrapper accesses this attribute. Compute milliseconds defensively:

```python
timer.attributes["duration_ms"] = pcm_data.duration * 1000 if pcm_data.duration is not None else 0
```

This same pattern appears elsewhere in the codebase (e.g., agents-core/vision_agents/core/vad/vad.py, agents-core/vision_agents/core/tts/tts.py, plugins/), all accessing the non-existent `.duration_ms` attribute.
🤖 Prompt for AI Agents
In agents-core/vision_agents/core/stt/stt.py around lines 147 to 153, the code reads a non-existent pcm_data.duration_ms attribute, which will raise AttributeError; replace it with a defensive computation using pcm_data.duration * 1000 when duration is present (otherwise 0), apply the same defensive pattern wherever else duration_ms is used in the repo, and update timer.attributes["duration_ms"] to compute milliseconds from pcm_data.duration (guarding against None).
```python
async def _simple_response(
    self,
    text: str,
    processors: Optional[List[Processor]] = None,
    participant: Optional[Participant] = None,
):
    """
    Internal simple response implementation required by LLM base class.

    Note: OpenAI Realtime is event-driven and doesn't return responses directly.
    This implementation sends the text via the public simple_response method.
    """
    from vision_agents.core.llm.llm import LLMResponseEvent

    await self.simple_response(text, processors, participant)
    # Return empty LLMResponseEvent since Realtime API is event-driven
    return LLMResponseEvent(original=None, text="")
```
🧩 Analysis chain
Implementation will be correct once the public override is removed.
After removing the public simple_response override as suggested above, this internal implementation will properly integrate with the base class metrics wrapper. The pattern of returning an empty LLMResponseEvent correctly reflects the event-driven nature of the Realtime API.
Note: The same bypass issue exists in the Gemini plugin (plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py, lines 118-151). Please verify and apply the same fix across all Realtime implementations to ensure consistent metrics coverage.
🏁 Script executed:
```shell
#!/bin/bash
# Description: Find all Realtime plugin implementations that override simple_response
# to ensure metrics bypass is fixed consistently across providers.

# Search for classes that override simple_response (public method)
ast-grep --pattern $'class $_(realtime.Realtime):
  $$$
  async def simple_response($$$):
    $$$
  $$$'
```

Length of output: 196547
Remove the public `simple_response` override to ensure metrics coverage across all Realtime implementations.

The `_simple_response` implementation is correctly structured. However, for metrics to be recorded, the public `simple_response` override (lines 108-127) must be removed so that the base class metrics wrapper is what invokes `_simple_response` and tracks each interaction.
Verified: This same pattern exists across all Realtime plugin implementations:
- Gemini Realtime has an identical public override (lines 119-134) that should be removed
- AWS Realtime also has a public override (lines 271-286) and is missing a `_simple_response` implementation entirely
All three providers need the same structural fix to ensure consistent metrics coverage.
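A sketch of the intended shape after the fix, using the hook names from this PR; the `_send_text_to_session` helper is hypothetical:

```python
from vision_agents.core.llm.llm import LLMResponseEvent


class Realtime(realtime.Realtime):
    # No public simple_response override: callers go through the base class
    # method, which records metrics and then dispatches to _simple_response.

    async def _simple_response(self, text, processors=None, participant=None):
        await self._send_text_to_session(text)  # hypothetical send helper
        # Realtime responses arrive as events, so return an empty event.
        return LLMResponseEvent(original=None, text="")
```

Note that once the override is gone, `_simple_response` should send directly rather than delegating back to `simple_response`, which would now re-enter the base class wrapper.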
🤖 Prompt for AI Agents
In plugins/openai/vision_agents/plugins/openai/openai_realtime.py around lines 129-145, delete the public simple_response override (lines ~108-127) so the base class metrics wrapper can call the protected _simple_response; keep the async _simple_response implementation, updating it to send directly rather than via the now-removed simple_response override, and ensure no public override shadows the base implementation so metrics are recorded properly.
How to play with this
```shell
uv add --dev opentelemetry-exporter-prometheus prometheus-client opentelemetry-sdk opentelemetry-exporter-otlp
```

Note: Do not change the Grafana dashboards directly; ask AI to change them for you (everything lives under `observability` and has a README)!
See turn_detection.py for an example of metrics collection.
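If you want to eyeball the metrics without the full stack, a minimal Prometheus wiring works with just the packages above (the port and metric names here are illustrative, not part of this PR):

```python
from opentelemetry import metrics
from opentelemetry.exporter.prometheus import PrometheusMetricReader
from opentelemetry.sdk.metrics import MeterProvider
from prometheus_client import start_http_server

# Expose a Prometheus scrape endpoint and route OTel metrics to it.
start_http_server(9464)
metrics.set_meter_provider(MeterProvider(metric_readers=[PrometheusMetricReader()]))

meter = metrics.get_meter("vision_agents.demo")
latency_ms = meter.create_histogram("demo_latency_ms", unit="ms")
latency_ms.record(12.5, {"provider": "demo"})  # then curl http://localhost:9464/metrics
```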
Summary by CodeRabbit
- New Features
- Documentation
- Tests
- Chores