
Conversation


@tbarbugli tbarbugli commented Oct 31, 2025

  • LLM metrics
  • STT metrics
  • Turn metrics
  • Built-in docker-compose with Grafana
  • Track end-to-end latency
  • Measure video processing latency
  • Measure latency for Realtime models
  • Measure tokens usage

How to play with this

  1. Run docker compose up
  2. uv add --dev opentelemetry-exporter-prometheus prometheus-client opentelemetry-sdk opentelemetry-exporter-otlp
  3. Run the simple agent example (its telemetry setup is sketched after this list)
  4. Grafana -> http://localhost:3000
  5. Python metrics -> http://localhost:9464/
  6. Prometheus -> http://localhost:9090/
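
For step 3, the example's telemetry setup boils down to something like this minimal sketch (standard OpenTelemetry and prometheus_client APIs; the service name here is a placeholder, and the real simple_agent_example.py also wires up OTLP tracing and atexit shutdown):

from opentelemetry import metrics
from opentelemetry.exporter.prometheus import PrometheusMetricReader
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.resources import Resource
from prometheus_client import start_http_server

# Serve /metrics for Prometheus to scrape (step 5 above).
start_http_server(port=9464)

# Route all OpenTelemetry metrics through the Prometheus reader.
reader = PrometheusMetricReader()
provider = MeterProvider(
    resource=Resource.create({"service.name": "simple-agent-example"}),  # placeholder name
    metric_readers=[reader],
)
metrics.set_meter_provider(provider)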

Note: Do not change the Grafana dashboards directly; ask AI to change them for you (everything lives under observability/ and has a README)!

See turn_detection.py for an example of metrics collection; a minimal sketch of the pattern follows.
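
A rough sketch, assuming the Timer context-manager API shown in the review snippets further down (the histogram name comes from observability/metrics.py; the detector call is hypothetical):

from vision_agents.core.observability.metrics import (
    Timer,
    turn_end_detection_latency_ms,
)

async def predict_turn_completed(detector, pcm):
    # On exit, Timer records elapsed milliseconds into the histogram,
    # along with any attributes attached while it was running.
    with Timer(turn_end_detection_latency_ms) as timer:
        timer.attributes["implementation"] = "example"
        timer.attributes["audio_duration_ms"] = pcm.duration_ms
        return await detector.run(pcm)  # hypothetical detector call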

Summary by CodeRabbit

  • New Features

    • Full observability: Prometheus, Grafana, and Jaeger stack with prebuilt dashboards and local docker deployment.
    • App-level metrics and a reusable Timer utility capturing latencies and contextual metadata across STT, TTS, LLM, turn detection, and vision components.
    • Telemetry example now exposes Prometheus metrics via an HTTP endpoint.
  • Documentation

    • Added observability quickstart, dashboards, and troubleshooting guide.
  • Tests

    • Added comprehensive tests covering timing/metrics behavior.
  • Chores

    • Added compose, provisioning, and Grafana init scripts.


coderabbitai bot commented Oct 31, 2025

Walkthrough

Adds OpenTelemetry-based observability: a tracer/meter with many metrics and a reusable Timer utility (context manager/decorator/callable) with tests; instruments the core LLM/STT/TTS/TurnDetection flows to record latency and errors; updates plugins to the new _simple_response/_process_audio/detect_turn hooks; and provides an observability stack (Prometheus/Grafana/Jaeger) with dashboards.

Changes

Cohort / File(s) | Summary
Timer & Tests
agents-core/tests/test_timer.py, agents-core/vision_agents/core/observability/metrics.py
Adds OpenTelemetry tracer/meter; many histograms, counters, and up/down counters; the Timer utility (context/decorator/callable) and a helper to infer class names; plus comprehensive Timer unit tests.
Core LLM instrumentation
agents-core/vision_agents/core/llm/llm.py
Adds abstract _simple_response hook; simple_response wrapper records llm_latency_ms (with class/provider attrs) and captures exceptions into metrics.
Core STT/Turn/TTS instrumentation
agents-core/vision_agents/core/stt/stt.py, agents-core/vision_agents/core/turn_detection/turn_detection.py, agents-core/vision_agents/core/tts/tts.py
Wraps entry points with Timer to record latency and error counters; introduces _process_audio/detect_turn abstract hooks and increments error counters with provider/error_type.
LLM plugin alignment
plugins/*/*/*_llm.py, plugins/*/*/*_realtime.py (e.g., plugins/anthropic/..., plugins/aws/..., plugins/gemini/..., plugins/openai/..., plugins/openrouter/..., plugins/xai/...)
Renames simple_response → _simple_response in many providers, adds provider_name attributes, and adds _simple_response stubs for Realtime implementations; formatting/signature refinements (a plugin-side sketch of the new hook pattern follows this table).
STT plugin alignment
plugins/deepgram/.../deepgram_stt.py, plugins/fish/.../stt.py, plugins/wizper/.../stt.py
Renames public process_audio → _process_audio (visibility change) and minor formatting tweaks.
Turn-detection plugins
plugins/krisp/.../turn_detection.py, plugins/smart_turn/.../smart_turn_detection.py, plugins/vogent/.../vogent_turn_detection.py
Renames process_audio → detect_turn (adds conversation param), adds Timer instrumentation for VAD/end-detection, introduces provider-specific histograms (e.g., vogent_whisper_latency_ms), and strengthens lifecycle/error handling.
YOLO / vision plugin
plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py
Removes default width/height constants, adds yolo_pose_inference_ms histogram and Timer around model inference; records frame dims, conf threshold, device, and persons_detected.
Examples & telemetry setup
examples/01_simple_agent_example/simple_agent_example.py
Adds Prometheus MeterProvider and starts Prometheus metric reader/server alongside existing OTLP trace setup.
Observability infra & docs
docker-compose.yml, observability/* (README, grafana dashboards/shell, provisioning, prometheus.yml, .gitignore)
Adds docker-compose observability stack, Prometheus/Grafana provisioning and dashboard JSON, init script to set home dashboard, Prometheus scrape config, README and .gitignore.
Misc plugin formatting & minor renames
multiple plugin files (e.g., plugins/gemini/.../gemini_realtime.py, plugins/openai/.../openai_realtime.py, others listed)
Adds provider_name attributes, renames/moves public entry points to private hooks, and adjusts formatting/signatures; behavior largely preserved where not instrumented.
Tests updated for new hooks
tests/test_function_calling.py
Adds TestLLM implementing _simple_response and updates tests to use it.
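
As a rough illustration of the hook pattern these rows describe, a hypothetical STT plugin now implements only the private hook, while the base class owns the public entry point and its metrics (method names come from the code-graph listings later in this review; signatures are assumptions):

from vision_agents.core.stt.stt import STT

class ExampleSTT(STT):
    provider_name = "example"  # surfaces as a metric label

    async def _process_audio(self, pcm, participant=None):
        # The base-class process_audio wrapper times this call and
        # increments stt_errors on failure; the plugin only transcribes
        # and emits the transcript event.
        text = await self._call_provider_api(pcm)  # hypothetical helper
        self._emit_transcript_event(text, participant)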

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant CoreLLM as LLM (core)
    participant Timer
    participant Meter as Meter(OTel)
    participant Impl as LLM Impl (_simple_response)

    Client->>CoreLLM: simple_response(text)
    activate CoreLLM
    CoreLLM->>Timer: start(llm_latency_ms, attrs)
    activate Timer
    CoreLLM->>Impl: _simple_response(text)
    activate Impl
    Impl-->>CoreLLM: LLMResponseEvent / raises
    deactivate Impl
    alt success
        CoreLLM->>Timer: stop()
        Timer->>Meter: record(elapsed_ms, attributes)
    else error
        CoreLLM->>Meter: llm_errors.add(1, {provider, error_type})
        CoreLLM->>Timer: stop(exception=True)
        Timer->>Meter: record(elapsed_ms, attributes_with_exception)
    end
    CoreLLM-->>Client: LLMResponseEvent or raise
    deactivate CoreLLM
sequenceDiagram
    participant Caller
    participant CoreSTT as STT (core)
    participant Timer
    participant Meter as Meter(OTel)
    participant Impl as STT Impl (_process_audio)
    participant Counter as stt_errors

    Caller->>CoreSTT: process_audio(pcm)
    activate CoreSTT
    CoreSTT->>Timer: start(stt_latency_ms, attrs)
    activate Timer
    CoreSTT->>Impl: _process_audio(pcm)
    activate Impl
    Impl-->>CoreSTT: success / raises
    deactivate Impl
    alt success
        CoreSTT->>Timer: stop()
        Timer->>Meter: record(elapsed_ms, attributes)
    else error
        CoreSTT->>Counter: add(1, {provider, error_type})
        CoreSTT->>Timer: stop(exception=True)
        Timer->>Meter: record(elapsed_ms, attributes_with_exception)
    end
    CoreSTT-->>Caller: returns / raises
    deactivate CoreSTT
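
Both flows share the same wrapper shape; in Python it looks roughly like the sketch below (metric and counter names are taken from the walkthrough; the actual Timer and base-class signatures may differ):

from vision_agents.core.observability.metrics import Timer, llm_latency_ms, llm_errors

class InstrumentedLLM:
    """Skeleton of the base-class wrapper the diagrams above describe."""

    provider_name = "example"

    async def _simple_response(self, text, processors=None, participant=None):
        raise NotImplementedError  # provided by each plugin

    async def simple_response(self, text, processors=None, participant=None):
        with Timer(llm_latency_ms) as timer:
            timer.attributes["class_name"] = type(self).__qualname__
            timer.attributes["provider"] = self.provider_name
            try:
                return await self._simple_response(
                    text, processors=processors, participant=participant
                )
            except Exception as exc:
                # Mirrors the error branch in the diagrams above.
                llm_errors.add(1, {"provider": self.provider_name,
                                   "error_type": type(exc).__name__})
                timer.attributes["error_type"] = type(exc).__name__
                raise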

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Areas to focus review on:

  • Timer implementation correctness: idempotent stop, elapsed retrieval while running, attribute merging, decorator/context/async support (a sketch of one such check follows this list).
  • Refactor correctness: ensure all implementations updated to new hooks (_simple_response, _process_audio, detect_turn) and Realtime stubs behave as intended.
  • Metric attribute consistency and naming across instrumentation sites (class_name/provider labels).
  • Turn-detection signature changes and conversation propagation across plugins.
  • Prometheus/Grafana provisioning and dashboard JSON for correctness of metric names and label usage.
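
For the first bullet, an idempotent-stop check could look like this pytest-style sketch (Timer.stop() and elapsed_ms are named in the review's code graph; their exact semantics are assumptions, and the real test_timer.py suite is far more thorough):

from vision_agents.core.observability.metrics import Timer, llm_latency_ms

def test_stop_is_idempotent():
    with Timer(llm_latency_ms) as timer:
        pass  # stop() runs on context exit
    first = timer.elapsed_ms
    timer.stop()  # a second stop should neither re-record nor change elapsed
    assert timer.elapsed_ms == first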

Suggested labels

tests

Suggested reviewers

  • d3xvn
  • dangusev
  • yarikdevcom

Poem

The meter hums — a brittle, bright incision,
timers counting the small deaths of seconds,
numbers like teeth shining in a wired throat,
providers name themselves into the ledger,
and silence learns to answer with a trace.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: docstring coverage is 73.02%, below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve coverage.
✅ Passed checks (2 passed)
  • Description Check ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed: the title 'Add more metrics' is directly related to the main changeset, which implements comprehensive metrics collection (LLM, STT, turn detection) across the codebase and adds observability infrastructure.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch more-metrics

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between c550070 and 1193533.

📒 Files selected for processing (1)
  • tests/test_function_calling.py (9 hunks)
✅ Files skipped from review due to trivial changes (1)
  • tests/test_function_calling.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: unit / Ruff & mypy
  • GitHub Check: unit / Test "not integration"
  • GitHub Check: unit / Test "not integration"
  • GitHub Check: unit / Ruff & mypy

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
plugins/krisp/vision_agents/plugins/krisp/turn_detection.py (1)

111-117: Fix type mismatch when dispatching PCM frames

process_pcm_turn_taking expects a PcmData object (it reads sample_rate and samples), but detect_turn passes a raw ndarray. The first call to pcm.sample_rate will throw an AttributeError, so turn detection never runs. Keep the PcmData wrapper when resampling.

-        resampled_pcm = audio_data.resample(16_000, 1).samples
+        resampled_pcm = audio_data.resample(16_000, 1)
 
         try:
             loop = asyncio.get_event_loop()
             await loop.run_in_executor(
                 None, self.process_pcm_turn_taking, resampled_pcm, user_id, metadata
             )
plugins/fish/vision_agents/plugins/fish/stt.py (1)

63-70: Align docstring with the actual signature

The docstring still references user_metadata and returning a list of tuples, but the method only takes participant and returns None. Please update the Args/Returns sections to reflect reality so the Google-style doc stays accurate.

🧹 Nitpick comments (3)
plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py (1)

32-35: Clean up duplicate default resolution assignments.

Line 32 redefines DEFAULT_WIDTH/DEFAULT_HEIGHT twice, leaving both 640x480 and 1920x1080 literals in place. Please keep only the intended defaults so the module state is unambiguous.

plugins/wizper/vision_agents/plugins/wizper/stt.py (1)

96-96: TODO indicates known service limitation.

Line 96 documents a bug in the fal-ai/wizper service where only "transcribe" task works. Consider tracking this with an issue if not already done, as the task and target_language constructor parameters suggest broader intended functionality.

Would you like me to create an issue to track this limitation and monitor for when the fal-ai/wizper service fixes the bug?

observability/grafana/init-home-dashboard.sh (1)

26-30: Hardcoded credentials acceptable for local development.

The use of admin:admin credentials is appropriate for the local Docker Compose development setup described in the PR. For production deployments, these credentials should be externalized via environment variables.

Based on learnings

To make this more flexible, consider this enhancement:

+GRAFANA_USER="${GRAFANA_USER:-admin}"
+GRAFANA_PASSWORD="${GRAFANA_PASSWORD:-admin}"
+
 echo ""
 echo "Setting admin user home dashboard..."
 curl -X PUT \
-  -u "admin:admin" \
+  -u "${GRAFANA_USER}:${GRAFANA_PASSWORD}" \
   -H "Content-Type: application/json" \
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between d78a4a0 and 2755b9d.

📒 Files selected for processing (27)
  • agents-core/tests/test_timer.py (1 hunks)
  • agents-core/vision_agents/core/llm/llm.py (9 hunks)
  • agents-core/vision_agents/core/observability/metrics.py (2 hunks)
  • agents-core/vision_agents/core/stt/stt.py (5 hunks)
  • agents-core/vision_agents/core/tts/tts.py (1 hunks)
  • agents-core/vision_agents/core/turn_detection/turn_detection.py (4 hunks)
  • examples/01_simple_agent_example/simple_agent_example.py (5 hunks)
  • observability/.gitignore (1 hunks)
  • observability/README.md (1 hunks)
  • observability/grafana/dashboards/vision-agents.json (1 hunks)
  • observability/grafana/init-home-dashboard.sh (1 hunks)
  • observability/grafana/provisioning/dashboards/default.yml (1 hunks)
  • observability/grafana/provisioning/datasources/prometheus.yml (1 hunks)
  • observability/prometheus/prometheus.yml (1 hunks)
  • plugins/anthropic/vision_agents/plugins/anthropic/anthropic_llm.py (15 hunks)
  • plugins/aws/vision_agents/plugins/aws/aws_llm.py (18 hunks)
  • plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py (7 hunks)
  • plugins/fish/vision_agents/plugins/fish/stt.py (2 hunks)
  • plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py (12 hunks)
  • plugins/krisp/vision_agents/plugins/krisp/turn_detection.py (1 hunks)
  • plugins/openai/vision_agents/plugins/openai/openai_llm.py (2 hunks)
  • plugins/openrouter/vision_agents/plugins/openrouter/openrouter_llm.py (4 hunks)
  • plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (10 hunks)
  • plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py (2 hunks)
  • plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (16 hunks)
  • plugins/wizper/vision_agents/plugins/wizper/stt.py (1 hunks)
  • plugins/xai/vision_agents/plugins/xai/llm.py (6 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/python.mdc)

**/*.py: Do not modify sys.path in Python code
Docstrings must follow the Google style guide

Files:

  • plugins/fish/vision_agents/plugins/fish/stt.py
  • plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py
  • plugins/wizper/vision_agents/plugins/wizper/stt.py
  • plugins/openrouter/vision_agents/plugins/openrouter/openrouter_llm.py
  • agents-core/vision_agents/core/tts/tts.py
  • plugins/xai/vision_agents/plugins/xai/llm.py
  • plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py
  • plugins/krisp/vision_agents/plugins/krisp/turn_detection.py
  • agents-core/tests/test_timer.py
  • agents-core/vision_agents/core/turn_detection/turn_detection.py
  • examples/01_simple_agent_example/simple_agent_example.py
  • agents-core/vision_agents/core/stt/stt.py
  • plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py
  • plugins/anthropic/vision_agents/plugins/anthropic/anthropic_llm.py
  • plugins/openai/vision_agents/plugins/openai/openai_llm.py
  • plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py
  • agents-core/vision_agents/core/observability/metrics.py
  • agents-core/vision_agents/core/llm/llm.py
  • plugins/aws/vision_agents/plugins/aws/aws_llm.py
  • plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py
🧠 Learnings (1)
📚 Learning: 2025-10-13T22:00:34.300Z
Learnt from: dangusev
Repo: GetStream/Vision-Agents PR: 98
File: plugins/deepgram/vision_agents/plugins/deepgram/stt.py:135-150
Timestamp: 2025-10-13T22:00:34.300Z
Learning: In the Deepgram STT plugin (plugins/deepgram/vision_agents/plugins/deepgram/stt.py), the `started()` method is designed to wait for the connection attempt to complete, not to guarantee a successful connection. It's acceptable for the connection attempt to fail, and downstream code handles the case where `self.dg_connection` is `None`. The `_connected_once` event is set in the `finally` block intentionally to signal attempt completion.

Applied to files:

  • plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py
🧬 Code graph analysis (17)
plugins/fish/vision_agents/plugins/fish/stt.py (3)
agents-core/vision_agents/core/stt/stt.py (2)
  • _process_audio (151-166)
  • _emit_transcript_event (41-63)
plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py (1)
  • _process_audio (78-114)
plugins/wizper/vision_agents/plugins/wizper/stt.py (1)
  • _process_audio (60-128)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (3)
agents-core/vision_agents/core/observability/metrics.py (1)
  • Timer (62-199)
agents-core/vision_agents/core/turn_detection/turn_detection.py (1)
  • detect_turn (72-84)
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (3)
  • detect_turn (182-193)
  • predict_speech (518-546)
  • _predict_turn_completed (405-456)
plugins/wizper/vision_agents/plugins/wizper/stt.py (3)
agents-core/vision_agents/core/stt/stt.py (1)
  • _process_audio (151-166)
plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py (1)
  • _process_audio (78-114)
plugins/fish/vision_agents/plugins/fish/stt.py (1)
  • _process_audio (52-138)
plugins/xai/vision_agents/plugins/xai/llm.py (4)
agents-core/vision_agents/core/llm/events.py (2)
  • LLMResponseChunkEvent (90-105)
  • LLMResponseCompletedEvent (109-115)
agents-core/vision_agents/core/llm/llm.py (2)
  • _simple_response (88-98)
  • LLMResponseEvent (35-39)
plugins/openai/vision_agents/plugins/openai/openai_llm.py (1)
  • _simple_response (81-109)
plugins/xai/vision_agents/plugins/xai/events.py (1)
  • XAIChunkEvent (7-10)
plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py (1)
agents-core/vision_agents/core/observability/metrics.py (1)
  • Timer (62-199)
plugins/krisp/vision_agents/plugins/krisp/turn_detection.py (3)
agents-core/vision_agents/core/turn_detection/turn_detection.py (1)
  • detect_turn (72-84)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (1)
  • detect_turn (158-170)
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (1)
  • detect_turn (182-193)
agents-core/tests/test_timer.py (2)
agents-core/vision_agents/core/observability/metrics.py (4)
  • Timer (62-199)
  • elapsed_ms (143-144)
  • stop (125-141)
  • _restart (146-150)
agents-core/vision_agents/core/stt/stt.py (1)
  • process_audio (116-148)
agents-core/vision_agents/core/turn_detection/turn_detection.py (4)
agents-core/vision_agents/core/observability/metrics.py (1)
  • Timer (62-199)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (1)
  • detect_turn (158-170)
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (1)
  • detect_turn (182-193)
plugins/krisp/vision_agents/plugins/krisp/turn_detection.py (1)
  • detect_turn (78-119)
examples/01_simple_agent_example/simple_agent_example.py (1)
plugins/krisp/vision_agents/plugins/krisp/turn_detection.py (1)
  • TurnDetection (34-202)
agents-core/vision_agents/core/stt/stt.py (4)
agents-core/vision_agents/core/observability/metrics.py (1)
  • Timer (62-199)
agents-core/vision_agents/core/tts/tts.py (1)
  • send (238-343)
agents-core/vision_agents/core/turn_detection/turn_detection.py (1)
  • process_audio (44-69)
plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py (1)
  • _process_audio (78-114)
plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py (6)
agents-core/vision_agents/core/llm/events.py (2)
  • LLMResponseCompletedEvent (109-115)
  • LLMResponseChunkEvent (90-105)
agents-core/vision_agents/core/llm/llm.py (5)
  • _simple_response (88-98)
  • LLMResponseEvent (35-39)
  • _dedup_and_execute (379-413)
  • _convert_tools_to_provider_format (139-153)
  • _extract_tool_calls_from_response (155-169)
agents-core/vision_agents/core/agents/conversation.py (1)
  • Message (14-24)
plugins/gemini/vision_agents/plugins/gemini/events.py (1)
  • GeminiResponseEvent (35-38)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2)
  • _convert_tools_to_provider_format (509-532)
  • _extract_tool_calls_from_response (534-569)
agents-core/vision_agents/core/llm/llm_types.py (2)
  • ToolSchema (64-67)
  • NormalizedToolCallItem (107-111)
plugins/anthropic/vision_agents/plugins/anthropic/anthropic_llm.py (3)
agents-core/vision_agents/core/llm/events.py (2)
  • LLMResponseChunkEvent (90-105)
  • LLMResponseCompletedEvent (109-115)
agents-core/vision_agents/core/llm/llm.py (7)
  • _simple_response (88-98)
  • _dedup_and_execute (379-413)
  • _sanitize_tool_output (415-426)
  • _convert_tools_to_provider_format (139-153)
  • _extract_tool_calls_from_response (155-169)
  • LLMResponseEvent (35-39)
  • _create_tool_result_message (187-202)
agents-core/vision_agents/core/llm/llm_types.py (2)
  • NormalizedToolCallItem (107-111)
  • ToolSchema (64-67)
plugins/openai/vision_agents/plugins/openai/openai_llm.py (5)
agents-core/vision_agents/core/llm/llm.py (1)
  • _simple_response (88-98)
plugins/anthropic/vision_agents/plugins/anthropic/anthropic_llm.py (1)
  • _simple_response (75-95)
plugins/aws/vision_agents/plugins/aws/aws_llm.py (1)
  • _simple_response (81-101)
plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py (1)
  • _simple_response (69-86)
plugins/xai/vision_agents/plugins/xai/llm.py (1)
  • _simple_response (71-96)
plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py (1)
agents-core/vision_agents/core/stt/stt.py (1)
  • _process_audio (151-166)
agents-core/vision_agents/core/llm/llm.py (5)
agents-core/vision_agents/core/observability/metrics.py (1)
  • Timer (62-199)
plugins/openai/vision_agents/plugins/openai/openai_llm.py (4)
  • _simple_response (81-109)
  • _convert_tools_to_provider_format (372-404)
  • _extract_tool_calls_from_response (406-433)
  • _create_tool_result_message (435-465)
agents-core/vision_agents/core/llm/llm_types.py (2)
  • ToolSchema (64-67)
  • NormalizedToolCallItem (107-111)
agents-core/vision_agents/core/llm/function_registry.py (1)
  • get_tool_schemas (108-114)
agents-core/vision_agents/core/llm/events.py (2)
  • ToolStartEvent (119-125)
  • ToolEndEvent (129-138)
plugins/aws/vision_agents/plugins/aws/aws_llm.py (5)
plugins/aws/tests/test_aws.py (1)
  • llm (37-44)
agents-core/vision_agents/core/llm/events.py (2)
  • LLMResponseChunkEvent (90-105)
  • LLMResponseCompletedEvent (109-115)
agents-core/vision_agents/core/llm/llm.py (4)
  • _simple_response (88-98)
  • LLMResponseEvent (35-39)
  • _dedup_and_execute (379-413)
  • _sanitize_tool_output (415-426)
agents-core/vision_agents/core/llm/llm_types.py (1)
  • NormalizedToolCallItem (107-111)
plugins/aws/vision_agents/plugins/aws/events.py (1)
  • AWSStreamEvent (7-10)
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (2)
agents-core/vision_agents/core/observability/metrics.py (1)
  • Timer (62-199)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (4)
  • _prepare_silero_vad (150-156)
  • SileroVAD (419-486)
  • detect_turn (158-170)
  • predict_speech (454-455)
🪛 Gitleaks (8.28.0)
observability/grafana/init-home-dashboard.sh

[high] 26-27: Discovered a potential basic authorization token provided in a curl command, which could compromise the curl accessed resource.

(curl-auth-user)

🪛 markdownlint-cli2 (0.18.1)
observability/README.md

19-19: Bare URL used

(MD034, no-bare-urls)


20-20: Bare URL used

(MD034, no-bare-urls)


21-21: Bare URL used

(MD034, no-bare-urls)


38-38: Bare URL used

(MD034, no-bare-urls)


51-51: Bare URL used

(MD034, no-bare-urls)


59-59: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


69-69: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


137-137: Bare URL used

(MD034, no-bare-urls)


138-138: Bare URL used

(MD034, no-bare-urls)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: unit / Ruff & mypy
  • GitHub Check: unit / Test "not integration"
🔇 Additional comments (15)
plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py (1)

78-115: LGTM! Method rename aligns with observability pattern.

The rename from process_audio to _process_audio is consistent with the broader refactoring to support metrics collection at the base class level. The docstring follows Google style, and the logic correctly handles connection readiness, resampling, and audio transmission.

plugins/anthropic/vision_agents/plugins/anthropic/anthropic_llm.py (4)

68-68: Provider name attribute added correctly.

The addition of provider_name = "anthropic" enables consistent provider identification in metrics and events across the observability stack.


75-95: LGTM! Method visibility change aligns with base class pattern.

The rename to _simple_response is consistent with the standardized approach where the base class provides a public wrapper with metrics collection, delegating to this private implementation method.


132-217: Tool calling logic is well-structured.

The multi-hop tool execution with deduplication (via _dedup_and_execute) and bounded iteration (MAX_ROUNDS=3) provides robust handling of Claude's tool use pattern. The formatting improvements enhance readability without altering behavior.


218-349: Streaming tool execution correctly implemented.

The streaming path maintains consistency with the non-streaming approach, properly accumulating tool calls across chunks and executing them in multi-hop fashion. Event emissions correctly identify the provider.

plugins/xai/vision_agents/plugins/xai/llm.py (2)

62-62: LGTM! Provider standardization implemented correctly.

The addition of provider_name = "xai" and the rename to _simple_response align with the cross-provider standardization pattern. The unused participant parameter in the method signature matches the base class interface requirements.

Also applies to: 71-96


148-152: Event emissions correctly formatted.

The multi-line event emission formatting improves readability, and all events consistently include plugin_name="xai" for proper observability tracking.

Also applies to: 184-193, 199-205

observability/grafana/provisioning/dashboards/default.yml (2)

8-8: Verify dashboard deletion setting aligns with PR guidance.

Line 8 sets disableDeletion: false, allowing dashboards to be deleted via the UI. However, the PR description states "Do not modify Grafana directly; changes should be requested via AI." Consider setting this to true to prevent accidental deletions and enforce the file-based workflow.

If the intention is to allow UI modifications only for experimentation (not persistence), apply this diff:

-    disableDeletion: false
+    disableDeletion: true

1-13: Provisioning configuration is well-structured.

The dashboard provider configuration correctly points to the standard Grafana dashboards directory with folder structure preservation enabled. The 10-second update interval provides quick feedback during development.

plugins/wizper/vision_agents/plugins/wizper/stt.py (1)

60-129: LGTM! Implementation follows STT plugin pattern.

The rename to _process_audio is consistent with the base class design. The implementation properly handles temporary file lifecycle with cleanup in a finally block, and error handling emits appropriate events.

examples/01_simple_agent_example/simple_agent_example.py (2)

60-94: Observability stack integration is well-implemented.

The telemetry setup correctly configures OpenTelemetry with Prometheus metrics exposition on port 9464 and OTLP tracing to localhost:4317. The resource identification and graceful shutdown handling via atexit are appropriate.


6-6: Verification complete: Migration from vogent.TurnDetection to smart_turn.TurnDetection is successful.

The verification confirms no references to vogent.TurnDetection remain in the codebase. The file has been correctly updated to import and use smart_turn.TurnDetection() on line 25, with the corresponding import statement on line 6. This aligns with the intended module organization where VogentTurnDetection remains available as a separate plugin while SmartTurnDetection becomes the default in examples.

agents-core/vision_agents/core/tts/tts.py (1)

341-343: Improved metric identification with fully qualified class paths.

The change from self.__class__.__name__ to the fully qualified path (module.qualname) provides better metric disambiguation when multiple TTS providers might share simple class names. This aligns with the observability enhancements throughout the codebase.
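
Concretely, the fully qualified form is built from the module and qualname, along these lines (a sketch; the exact attribute construction in tts.py may differ):

def qualified_class_name(obj) -> str:
    """Return 'module.QualName' for metric labels, per the change above."""
    cls = type(obj)
    return f"{cls.__module__}.{cls.__qualname__}"

# Example: qualified_class_name(some_tts) might yield
# "vision_agents.plugins.example.tts.ExampleTTS" (hypothetical path),
# versus the old, ambiguous type(some_tts).__name__ == "ExampleTTS".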

observability/grafana/init-home-dashboard.sh (1)

1-33: Script correctly configures home dashboard.

The initialization script properly waits for Grafana health, then sets the Vision Agents dashboard as the default home view for both organization and admin user. The dashboard UID reference aligns with the provisioned dashboard configuration.

agents-core/tests/test_timer.py (1)

18-440: Comprehensive Timer coverage

Love how these tests hammer the context-manager, decorator, inheritance, and async paths—this suite should catch regressions fast.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (6)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2)

131-146: Consider warning callers when ignored parameters are used.

The method accepts system_prompt, temperature, and max_tokens but silently ignores them. While this appears to be for interface compliance (matching the OpenAI Realtime pattern), callers who pass these parameters may expect them to have an effect.

Consider logging a warning when these parameters are provided:

 async def _simple_response(
     self,
     text: str,
     system_prompt: Optional[str] = None,
     temperature: float = 0.7,
     max_tokens: Optional[int] = None,
     **kwargs: Any,
 ) -> str:
     """
     Internal simple response implementation required by LLM base class.

     Note: Gemini Realtime is event-driven and doesn't return responses directly.
     This implementation sends the text and returns a placeholder.
     """
+    if system_prompt or max_tokens or temperature != 0.7:
+        self.logger.warning(
+            "Parameters system_prompt, temperature, and max_tokens are not supported "
+            "by Gemini Realtime and will be ignored"
+        )
     await self.send_realtime_input(text=text)
     return ""  # Realtime API doesn't return text synchronously

148-164: Consider adding parameter warnings and clarifying the generator pattern.

Similar to _simple_response, this method silently ignores system_prompt, temperature, and max_tokens. Additionally, the return followed by yield pattern (lines 163-164) creates an empty generator, which is technically correct but may be confusing.

Consider adding a warning for ignored parameters and a comment explaining the pattern:

 async def _simple_response_stream(
     self,
     text: str,
     system_prompt: Optional[str] = None,
     temperature: float = 0.7,
     max_tokens: Optional[int] = None,
     **kwargs: Any,
 ):
     """
     Internal simple response stream implementation required by LLM base class.

     Note: Gemini Realtime is event-driven and doesn't stream responses in this manner.
     This implementation sends the text but yields nothing.
     """
+    if system_prompt or max_tokens or temperature != 0.7:
+        self.logger.warning(
+            "Parameters system_prompt, temperature, and max_tokens are not supported "
+            "by Gemini Realtime and will be ignored"
+        )
     await self.send_realtime_input(text=text)
+    # Return immediately to create an empty generator (yield makes this a generator function)
     return
     yield  # Make this a generator
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)

138-154: Unreachable yield statement after return.

The yield on line 154 is unreachable because it follows the return statement on line 153. While this still makes the function a generator (which is the intent), the unreachable code is a code smell.

Apply this diff to make the intent clearer:

-        await self.rtc.send_text(text)
-        return
-        yield  # Make this a generator
+        await self.rtc.send_text(text)
+        # Make this a generator that yields nothing
+        if False:
+            yield

Or alternatively:

-        await self.rtc.send_text(text)
-        return
-        yield  # Make this a generator
+        await self.rtc.send_text(text)
+        yield  # Generator that yields nothing

Note: This pattern is also present in the Gemini Realtime implementation (plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py, lines 147-163).

docker-compose.yml (3)

17-17: Pin Grafana and Prometheus to specific versions for reproducibility.

Using :latest tags can lead to non-deterministic behavior across environments and time, potentially breaking observability setups unexpectedly. Pin these images to specific versions for consistent deployments.

-    image: prom/prometheus:latest
+    image: prom/prometheus:v2.52.0
-    image: grafana/grafana:latest
+    image: grafana/grafana:11.2.0

Check Prometheus releases and Grafana releases for appropriate versions to match your requirements.

Also applies to: 35-35


50-51: Add health checks to prevent race conditions during startup.

The depends_on declarations ensure service startup order but do not wait for readiness. The grafana-init service may attempt to initialize dashboards before Grafana is fully ready, and Grafana may start before Prometheus is accepting connections.

Consider adding health checks to ensure dependent services are truly ready before proceeding:

   prometheus:
     image: prom/prometheus:v2.52.0
     ...
+    healthcheck:
+      test: ["CMD", "wget", "--spider", "-q", "http://localhost:9090/-/healthy"]
+      interval: 5s
+      timeout: 3s
+      retries: 5
   grafana:
     ...
     depends_on:
-      - prometheus
+      prometheus:
+        condition: service_healthy
+    healthcheck:
+      test: ["CMD", "wget", "--spider", "-q", "http://localhost:3000/api/health"]
+      interval: 5s
+      timeout: 3s
+      retries: 5
   grafana-init:
     ...
     depends_on:
-      - grafana
+      grafana:
+        condition: service_healthy

Also applies to: 62-63


56-66: Grafana-init script robustness.

The grafana-init service uses a simple curl image to run a shell script. Consider adding error handling and explicit success/failure indicators in the script itself to make debugging easier if initialization fails.

Additionally, ensure the init-home-dashboard.sh script includes proper error handling and exits with appropriate status codes.

Please review the observability/grafana/init-home-dashboard.sh script and confirm it:

  • Exits with a non-zero status on failure
  • Has informative logging for debugging
  • Includes reasonable retry logic or timeout handling for API calls
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 2755b9d and 3646cd0.

📒 Files selected for processing (4)
  • docker-compose.yml (1 hunks)
  • plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (1 hunks)
  • plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1 hunks)
  • tests/test_function_calling.py (10 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/python.mdc)

**/*.py: Do not modify sys.path in Python code
Docstrings must follow the Google style guide

Files:

  • plugins/openai/vision_agents/plugins/openai/openai_realtime.py
  • tests/test_function_calling.py
  • plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py
tests/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/python.mdc)

tests/**/*.py: Never use mocking utilities (e.g., unittest.mock, pytest-mock) in test files
Write tests using pytest (avoid unittest.TestCase or other frameworks)
Mark integration tests with @pytest.mark.integration
Do not use @pytest.mark.asyncio; async support is automatic

Files:

  • tests/test_function_calling.py
🧬 Code graph analysis (3)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (3)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2)
  • _simple_response (131-146)
  • _simple_response_stream (148-164)
plugins/openai/vision_agents/plugins/openai/openai_llm.py (1)
  • _simple_response (81-109)
plugins/openai/vision_agents/plugins/openai/rtc_manager.py (1)
  • send_text (419-440)
tests/test_function_calling.py (3)
agents-core/vision_agents/core/llm/llm.py (1)
  • LLM (46-426)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2)
  • _simple_response (131-146)
  • _simple_response_stream (148-164)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (2)
  • _simple_response (121-136)
  • _simple_response_stream (138-154)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (2)
  • _simple_response (121-136)
  • _simple_response_stream (138-154)
plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py (1)
  • _simple_response (69-86)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: unit / Test "not integration"
🔇 Additional comments (4)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)

121-136: Implementation correctly aligns with base LLM API.

The method properly delegates to rtc.send_text and returns a placeholder empty string, consistent with the event-driven nature of the Realtime API. The unused parameters (system_prompt, temperature, max_tokens) are accepted for interface compatibility.

tests/test_function_calling.py (2)

16-41: Well-structured test LLM implementation.

The TestLLM class provides appropriate mock implementations of the required abstract methods. The _simple_response returns a mock string, and _simple_response_stream properly yields (not unreachable like in the production code).


162-162: Correctly replaces abstract class instantiation with test implementation.

Replacing LLM() with TestLLM() throughout the tests properly addresses the issue of instantiating an abstract base class. The test implementation provides the required concrete methods.

Also applies to: 176-176, 392-392, 416-416, 435-435, 481-481, 511-511, 544-544

docker-compose.yml (1)

22-22: All referenced configuration files are present and correctly mounted.

Verification confirms that all observability configuration files and scripts referenced in the docker-compose.yml file exist in the repository:

  • observability/prometheus/prometheus.yml
  • observability/grafana/init-home-dashboard.sh
  • observability/grafana/provisioning/
  • observability/grafana/dashboards/

No issues found.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (1)

272-278: Replace magic numbers with defined constants

The calculation uses literals 512 and 16000, but CHUNK and RATE constants are defined at lines 41-42 and used elsewhere in the file (e.g., lines 238, 242, 243).

Apply this diff:

                 trailing_silence_ms = (
                     self._silence.trailing_silence_chunks
-                    * 512
-                    / 16000
+                    * CHUNK
+                    / RATE
                     * 1000
                     * 5  # DTX correction
                 )
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 3646cd0 and f42466c.

📒 Files selected for processing (5)
  • plugins/anthropic/vision_agents/plugins/anthropic/anthropic_llm.py (15 hunks)
  • plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2 hunks)
  • plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1 hunks)
  • plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (10 hunks)
  • tests/test_function_calling.py (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • plugins/anthropic/vision_agents/plugins/anthropic/anthropic_llm.py
  • plugins/openai/vision_agents/plugins/openai/openai_realtime.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/python.mdc)

**/*.py: Do not modify sys.path in Python code
Docstrings must follow the Google style guide

Files:

  • plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py
  • plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py
  • tests/test_function_calling.py
tests/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/python.mdc)

tests/**/*.py: Never use mocking utilities (e.g., unittest.mock, pytest-mock) in test files
Write tests using pytest (avoid unittest.TestCase or other frameworks)
Mark integration tests with @pytest.mark.integration
Do not use @pytest.mark.asyncio; async support is automatic

Files:

  • tests/test_function_calling.py
🧬 Code graph analysis (3)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (4)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (2)
  • _simple_response (121-137)
  • simple_response (100-119)
plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py (1)
  • _simple_response (69-86)
agents-core/vision_agents/core/llm/llm.py (1)
  • LLMResponseEvent (35-39)
agents-core/vision_agents/core/agents/agents.py (1)
  • simple_response (266-279)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (3)
agents-core/vision_agents/core/observability/metrics.py (1)
  • Timer (62-199)
agents-core/vision_agents/core/turn_detection/turn_detection.py (1)
  • detect_turn (72-84)
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (4)
  • detect_turn (182-193)
  • _process_audio_packet (216-342)
  • predict_speech (518-546)
  • _predict_turn_completed (405-456)
tests/test_function_calling.py (4)
agents-core/vision_agents/core/llm/llm.py (6)
  • LLM (46-426)
  • LLMResponseEvent (35-39)
  • register_function (218-231)
  • get_available_functions (233-235)
  • _dedup_and_execute (379-413)
  • _run_one_tool (280-352)
agents-core/vision_agents/core/llm/function_registry.py (3)
  • FunctionRegistry (35-237)
  • get_tool_schemas (108-114)
  • get_callable (144-160)
agents-core/vision_agents/core/llm/events.py (2)
  • ToolStartEvent (119-125)
  • ToolEndEvent (129-138)
agents-core/vision_agents/core/events/manager.py (1)
  • wait (470-484)
🔇 Additional comments (7)
tests/test_function_calling.py (1)

16-30: LGTM: Clean test double implementation.

The TestLLM class provides a concrete implementation of the now-abstract LLM base class, which is necessary after _simple_response became abstract. This is a legitimate test double that doesn't rely on mocking frameworks.

plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2)

131-147: LGTM! Internal method aligns with Realtime API pattern.

The _simple_response implementation correctly follows the pattern established in openai_realtime.py. Returning an empty LLMResponseEvent is appropriate for event-driven APIs where actual responses arrive through the event system rather than return values.

The inline import of LLMResponseEvent mirrors the OpenAI implementation and likely avoids circular dependencies.


362-364: LGTM! Formatting improvement.

The multi-line formatting improves readability without changing behavior.

plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (4)

16-20: LGTM: Observability imports

The metrics imports are clean and consistent with the broader instrumentation patterns across the codebase.


177-208: Excellent lifecycle management with proper cleanup

The nested try/finally structure elegantly resolves the shutdown race condition from the previous review. The outer finally block (lines 205-208) ensures both _processing_active and _shutdown_event are cleared even when cancellation occurs mid-processing, preventing hung waits and enabling clean restart cycles.


252-256: LGTM: VAD latency instrumentation

The Timer wrapping captures essential context (samples, implementation) and follows the established observability pattern seen in vogent.


294-303: LGTM: Comprehensive turn-end detection metrics

The Timer captures rich context (audio duration, samples, trailing silence, prediction probability, and turn-end decision), enabling effective observability for turn detection behavior.

"""

import pytest
from unittest.mock import Mock, patch

⚠️ Potential issue | 🔴 Critical

Remove unittest.mock usage per coding guidelines.

This file violates the coding guideline: "Never use mocking utilities (e.g., unittest.mock, pytest-mock) in test files" for tests/**/*.py. The Mock and patch imports are used extensively in TestOpenAIFunctionCalling, TestClaudeFunctionCalling, and TestGeminiFunctionCalling classes.

Consider rewriting these tests without mocking frameworks—use real instances with test configurations, in-memory fakes, or integration test markers if external dependencies are required.

As per coding guidelines
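
One mock-free direction, loosely modeled on the TestLLM double this PR already adds in tests/test_function_calling.py (the constructor and LLMResponseEvent signature here are assumptions):

from vision_agents.core.llm.llm import LLM, LLMResponseEvent

class EchoLLM(LLM):
    """In-memory fake: a real instance, no mocking framework."""

    provider_name = "echo"

    async def _simple_response(self, text, processors=None, participant=None):
        return LLMResponseEvent(text=f"echo: {text}")  # assumed constructor

    async def _simple_response_stream(self, text, **kwargs):
        yield f"echo: {text}"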


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (1)

182-193: Update call sites in vogent plugin to use detect_turn method.

The refactoring is incomplete. Tests and examples within the vogent plugin still call the old process_audio method, which will cause failures:

  • plugins/vogent/tests/test_vogent_td.py lines 45, 46
  • plugins/vogent/example/basic_usage.py lines 90, 166
🧹 Nitpick comments (4)
plugins/aws/vision_agents/plugins/aws/aws_llm.py (1)

164-168: Consider extracting duplicated constants.

The max_concurrency=8 and timeout_s=30 parameters appear identically in both tool execution loops (lines 164-168 and 408-412).

Apply this diff to extract the constants:

     def __init__(
         self,
         model: str,
         region_name: str = "us-east-1",
         aws_access_key_id: Optional[str] = None,
         aws_secret_access_key: Optional[str] = None,
         aws_session_token: Optional[str] = None,
     ):
         """
         Initialize the BedrockLLM class.

         Args:
             model: The Bedrock model ID (e.g., "anthropic.claude-3-5-sonnet-20241022-v2:0")
             region_name: AWS region name (default: "us-east-1")
             aws_access_key_id: Optional AWS access key ID
             aws_secret_access_key: Optional AWS secret access key
             aws_session_token: Optional AWS session token
         """
         super().__init__()
         self.events.register_events_from_module(events)
         self.model = model
         self._pending_tool_uses_by_index: Dict[int, Dict[str, Any]] = {}
+        self._tool_max_concurrency = 8
+        self._tool_timeout_s = 30

Then reference these constants in both locations:

                 while current_calls and rounds < MAX_ROUNDS:
                     # Execute calls concurrently with dedup
                     triples, seen = await self._dedup_and_execute(
                         cast(List[Dict[str, Any]], current_calls),
                         seen=seen,
-                        max_concurrency=8,
-                        timeout_s=30,
+                        max_concurrency=self._tool_max_concurrency,
+                        timeout_s=self._tool_timeout_s,
                     )

And similarly at lines 408-412.

Also applies to: 408-412

plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (3)

149-153: Simplify model initialization to match smart_turn pattern.

The lambda wrappers around model constructors add unnecessary complexity. The asyncio.to_thread function accepts any callable, so you can pass constructors directly as shown in the smart_turn plugin.

Apply this diff to simplify _prepare_silero_vad:

-        self.vad = await asyncio.to_thread(  # type: ignore[func-returns-value]
-            lambda: SileroVAD(  # type: ignore[arg-type]
-                path, reset_interval_seconds=self.vad_reset_interval_seconds
-            )
+        self.vad = await asyncio.to_thread(
+            SileroVAD, path, reset_interval_seconds=self.vad_reset_interval_seconds
         )

Apply similar changes to _prepare_whisper:

-        self.whisper = await asyncio.to_thread(  # type: ignore[func-returns-value]
-            lambda: WhisperModel(
-                self.whisper_model_size, device="cpu", compute_type="int8"
-            )
+        self.whisper = await asyncio.to_thread(
+            WhisperModel, self.whisper_model_size, device="cpu", compute_type="int8"
         )

And to _prepare_vogent:

-        self.vogent = await asyncio.to_thread(  # type: ignore[func-returns-value]
-            lambda: VogentDetector(
-                compile_model=True,
-                warmup=True,
-                device=None,
-                model_name="vogent/Vogent-Turn-80M",
-                revision="main",
-            )
+        self.vogent = await asyncio.to_thread(
+            VogentDetector,
+            compile_model=True,
+            warmup=True,
+            device=None,
+            model_name="vogent/Vogent-Turn-80M",
+            revision="main",
         )

Note: The type: ignore comments may become unnecessary after this refactor. If they remain, investigate the underlying type issues rather than suppressing them.

Also applies to: 159-162, 171-179


376-403: Consider checking self.whisper before starting the timer.

The Timer currently records latency even when self.whisper is None and an empty string is returned immediately. This could skew your metrics if the early-exit path is taken frequently.

Apply this diff to avoid recording no-op latency:

     async def _transcribe_segment(self, pcm: PcmData) -> str:
         """
         Transcribe audio segment using faster-whisper.
 
         Args:
             pcm: PcmData containing audio samples
 
         Returns:
             Transcribed text
         """
+        if self.whisper is None:
+            return ""
+
         with Timer(vogent_whisper_latency_ms) as timer:
             # Ensure it's 16khz and f32 format
             pcm = pcm.resample(16000).to_float32()
             audio_array = pcm.samples
             timer.attributes["audio_duration_ms"] = pcm.duration_ms
             timer.attributes["samples"] = len(audio_array)
 
-            if self.whisper is None:
-                return ""
-
             # Run transcription in thread pool to avoid blocking
             segments, info = await asyncio.to_thread(

422-456: Consider checking self.vogent before starting the timer.

Similar to the transcription method, the Timer records latency even when self.vogent is None and False is returned immediately. Moving the guard clause outside the Timer context would provide more accurate metrics.

Apply this diff:

     async def _predict_turn_completed(
         self,
         pcm: PcmData,
         prev_line: str,
         curr_line: str,
     ) -> bool:
         """
         Predict whether the current turn is complete using Vogent.
 
         Args:
             pcm: PcmData containing audio samples
             prev_line: Previous speaker's text (for context)
             curr_line: Current speaker's text
 
         Returns:
             True if turn is complete, False otherwise
         """
+        if self.vogent is None:
+            return False
+
         with Timer(turn_end_detection_latency_ms) as timer:
             # Ensure it's 16khz and f32 format
             pcm = pcm.resample(16000).to_float32()
 
             # Truncate to 8 seconds
             audio_array = pcm.tail(8, False).samples
             timer.attributes["implementation"] = "vogent"
             timer.attributes["audio_duration_ms"] = len(audio_array) / 16000 * 1000
             timer.attributes["prev_line_length"] = len(prev_line)
             timer.attributes["curr_line_length"] = len(curr_line)
 
-            if self.vogent is None:
-                return False
-
             # Run vogent prediction in thread pool
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 06aea31 and 8869d65.

📒 Files selected for processing (3)
  • plugins/aws/vision_agents/plugins/aws/aws_llm.py (3 hunks)
  • plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (10 hunks)
  • plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (15 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/python.mdc)

**/*.py: Do not modify sys.path in Python code
Docstrings must follow the Google style guide

Files:

  • plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py
  • plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py
  • plugins/aws/vision_agents/plugins/aws/aws_llm.py
🧬 Code graph analysis (3)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (2)
agents-core/vision_agents/core/observability/metrics.py (1)
  • Timer (62-199)
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (4)
  • detect_turn (182-193)
  • _process_audio_packet (216-342)
  • predict_speech (518-546)
  • _predict_turn_completed (405-456)
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (3)
agents-core/vision_agents/core/observability/metrics.py (1)
  • Timer (62-199)
agents-core/vision_agents/core/utils/utils.py (1)
  • ensure_model (115-157)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (3)
  • _prepare_silero_vad (151-157)
  • SileroVAD (429-496)
  • detect_turn (159-171)
plugins/aws/vision_agents/plugins/aws/aws_llm.py (2)
plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py (1)
  • _simple_response (69-86)
agents-core/vision_agents/core/llm/llm.py (2)
  • _simple_response (88-98)
  • LLMResponseEvent (35-39)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: unit / Ruff & mypy
  • GitHub Check: unit / Test "not integration"
  • GitHub Check: unit / Ruff & mypy
  • GitHub Check: unit / Test "not integration"
🔇 Additional comments (6)
plugins/aws/vision_agents/plugins/aws/aws_llm.py (1)

82-87: LGTM - Method signature aligns with the refactoring pattern.

The change from public simple_response to private _simple_response with explicit return type annotation is consistent with the broader refactoring across LLM implementations. The signature correctly matches the base class abstract method.

plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (3)

19-24: LGTM! Clean metrics imports and definition.

The observability imports are well-organized, and the new vogent_whisper_latency_ms histogram follows proper naming conventions with clear metadata.

Also applies to: 31-36
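For readers unfamiliar with the metrics module: histograms like this come from the OpenTelemetry meter. A minimal sketch of the definition, assuming metrics.py follows the standard OTel API (the meter name here is assumed):

```python
from opentelemetry import metrics

meter = metrics.get_meter("vision_agents")

# Latency histogram for the Whisper transcription step inside Vogent.
vogent_whisper_latency_ms = meter.create_histogram(
    name="vogent_whisper_latency_ms",
    unit="ms",
    description="Latency of Vogent's Whisper transcription step",
)
```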


259-264: LGTM! VAD instrumentation follows best practices.

The Timer usage correctly wraps the VAD prediction, and the attributes provide useful context for performance analysis (sample count and implementation type).
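As a usage illustration, the Timer records the elapsed time into the histogram when the block exits, carrying whatever attributes were attached inside it. A sketch based on the instrumented call sites in this PR (the audio array and the model call are stand-ins):

```python
import numpy as np

from vision_agents.core.observability.metrics import Timer, turn_end_detection_latency_ms

audio_array = np.zeros(16000, dtype=np.float32)  # one second of silence, purely illustrative

with Timer(turn_end_detection_latency_ms) as timer:
    timer.attributes["implementation"] = "silero"   # tag which backend produced the number
    timer.attributes["samples"] = len(audio_array)  # context for later analysis
    # ... run the VAD / turn-completion model on audio_array here ...
# Leaving the block records the elapsed milliseconds with the attributes attached.
```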


476-476: LGTM! Punctuation stripping improves model input.

Using rstrip(".!?") to remove terminal punctuation before passing text to Vogent is a sensible preprocessing step that should improve turn detection accuracy.
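As a quick illustration: str.rstrip strips any trailing run of the listed characters, so `"Are you there?!".rstrip(".!?")` yields `"Are you there"`, while punctuation inside the text is left untouched.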

plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (2)

178-210: Lifecycle cleanup looks solid.

Resetting _processing_active and _shutdown_event in the shutdown path prevents the cancellation hang we saw earlier and keeps the detector restartable.


253-305: Telemetry instrumentation is on point.

Wrapping the VAD and turn-completion calls in Timer gives us per-call latency with helpful context attributes, which will drop straight into the new dashboards.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)

108-127: Overriding the public method bypasses all metrics tracking.

This override completely defeats the observability instrumentation introduced in this PR. The base class simple_response (lines 72-94 in llm.py) wraps _simple_response with histogram tracking for llm_latency_ms and error attribution. By overriding the public method here, external calls bypass that wrapper entirely—no latency is recorded, no provider attribute is set, and exceptions are not tagged.

Additionally, the method lacks a return type annotation and implicitly returns None, violating the base class signature -> LLMResponseEvent[Any].

Solution:
Remove this override entirely and consolidate the implementation into _simple_response below. Let the base class handle the public entry point so metrics are captured for every call.

Apply this diff:

-    async def simple_response(
-        self,
-        text: str,
-        processors: Optional[List[Processor]] = None,
-        participant: Optional[Participant] = None,
-    ):
-        """Send a simple text input to the OpenAI Realtime session.
-
-        This is a convenience wrapper that forwards a text prompt upstream via
-        the underlying realtime connection. It does not stream partial deltas
-        back; callers should subscribe to the provider's events to receive
-        responses.
-
-        Args:
-            text: Text prompt to send.
-            processors: Optional processors list (not used here; included for
-                interface parity with the core `LLM` API).
-            participant: Optional participant metadata (ignored here).
-        """
-        await self.rtc.send_text(text)
-
     async def _simple_response(
         self,
         text: str,
         processors: Optional[List[Processor]] = None,
         participant: Optional[Participant] = None,
     ):
-        """
-        Internal simple response implementation required by LLM base class.
-
-        Note: OpenAI Realtime is event-driven and doesn't return responses directly.
-        This implementation sends the text via the public simple_response method.
-        """
+        """Send a simple text input to the OpenAI Realtime session.
+
+        Internal implementation required by LLM base class. OpenAI Realtime is 
+        event-driven and doesn't return responses directly; callers should 
+        subscribe to the provider's events to receive responses.
+
+        Args:
+            text: Text prompt to send.
+            processors: Optional processors list (not used here; included for
+                interface parity with the core `LLM` API).
+            participant: Optional participant metadata (ignored here).
+
+        Returns:
+            Empty LLMResponseEvent since Realtime API is event-driven.
+        """
         from vision_agents.core.llm.llm import LLMResponseEvent
 
-        await self.simple_response(text, processors, participant)
-        # Return empty LLMResponseEvent since Realtime API is event-driven
+        await self.rtc.send_text(text)
         return LLMResponseEvent(original=None, text="")
plugins/fish/vision_agents/plugins/fish/stt.py (1)

134-140: Restore _emit_error_event before re-raising.

The new STT.process_audio wrapper no longer emits errors on behalf of plugins. As a result, this raise now bypasses both the stt_errors counter and STTErrorEvent emission, so Fish STT errors stop surfacing in metrics/streams. Call _emit_error_event (with the participant context) before re-raising.

Apply this diff:

         except Exception as e:
             logger.error(
                 "Error during Fish Audio transcription",
                 exc_info=e,
             )
-            # Let the base class handle error emission
-            raise
+            self._emit_error_event(e, "Fish Audio transcription", participant)
+            raise
♻️ Duplicate comments (2)
plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py (1)

268-270: Fix persons_detected metric to count actual detections.

This still records len(pose_results), i.e., batch entries (typically 1 per frame) rather than the number of detected people. The metric remains wrong and never reflects multiple detections or zero detections. Please sum the detections across the results before setting the metric.

Apply this diff:

-            timer.attributes["persons_detected"] = (
-                len(pose_results) if pose_results else 0
-            )
+            persons_detected = sum(
+                len(result.keypoints.data)
+                for result in pose_results
+                if getattr(result, "keypoints", None) is not None
+            )
+            timer.attributes["persons_detected"] = persons_detected
agents-core/vision_agents/core/llm/llm.py (1)

82-95: Increment the error counter when _simple_response fails.

The exception handler sets the timer's error attribute but never increments the llm_errors counter that the Grafana dashboard expects. This issue was flagged in a previous review and remains unresolved.

The import needs to include llm_errors, and the except block must call the counter's add method before re-raising:

-        from vision_agents.core.observability.metrics import Timer, llm_latency_ms
+        from vision_agents.core.observability.metrics import (
+            Timer,
+            llm_latency_ms,
+            llm_errors,
+        )
             except Exception as e:
                 timer.attributes["error"] = type(e).__name__
+                llm_errors.add(
+                    1,
+                    {
+                        "provider": timer.attributes["provider"],
+                        "error": type(e).__name__,
+                        "llm_class": timer.attributes["llm_class"],
+                    },
+                )
                 raise
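For completeness, counters like llm_errors come from the same OTel meter as the histograms. A sketch of the definition and the recording call, assuming metrics.py follows the standard API (attribute values here are illustrative; the keys mirror the diff above):

```python
from opentelemetry import metrics

meter = metrics.get_meter("vision_agents")
llm_errors = meter.create_counter(
    name="llm_errors",
    unit="1",
    description="LLM call failures, tagged by provider and error type",
)

# Recording a failure:
llm_errors.add(1, {"provider": "gemini", "error": "TimeoutError", "llm_class": "GeminiLLM"})
```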
🧹 Nitpick comments (2)
agents-core/vision_agents/core/turn_detection/turn_detection.py (1)

82-94: Consider clarifying the docstring to distinguish from the wrapper.

The docstring is identical to process_audio above, but this is the hook that implementations override. Adding a note like "Override this method in subclasses to implement turn detection logic" would help developers understand the wrapper pattern.

Apply this diff to clarify the distinction:

     @abstractmethod
     async def detect_turn(
         self,
         audio_data: PcmData,
         participant: Participant,
         conversation: Optional[Conversation],
     ) -> None:
-        """Process the audio and trigger turn start or turn end events
+        """Implement turn detection logic for the given audio data.
+
+        Override this method in subclasses to implement turn detection.
+        The audio processing and metrics are automatically handled by process_audio.
 
         Args:
             audio_data: PcmData object containing audio samples from Stream
             participant: Participant that's speaking, includes user data
             conversation: Transcription/ chat history, sometimes useful for turn detection
         """
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (1)

273-279: Consider grouping operations for readability.

The multi-line arithmetic without visual grouping makes the calculation harder to parse; the DTX correction multiplier is significant and deserves to stand apart from the unit conversion.

Apply this diff to improve clarity:

-                trailing_silence_ms = (
-                    self._silence.trailing_silence_chunks
-                    * 512
-                    / 16000
-                    * 1000
-                    * 5  # DTX correction
-                )
+                trailing_silence_ms = (
+                    (self._silence.trailing_silence_chunks * 512 / 16000 * 1000)
+                    * 5  # DTX correction factor
+                )
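As a unit sanity check: one chunk is 512 samples at 16 kHz, i.e. 512 / 16000 * 1000 = 32 ms, so for example 10 trailing silence chunks give 320 ms, which the DTX correction factor of 5 scales to 1600 ms.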
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 8869d65 and c550070.

📒 Files selected for processing (13)
  • agents-core/vision_agents/core/llm/llm.py (1 hunks)
  • agents-core/vision_agents/core/stt/stt.py (6 hunks)
  • agents-core/vision_agents/core/tts/tts.py (3 hunks)
  • agents-core/vision_agents/core/turn_detection/turn_detection.py (4 hunks)
  • examples/01_simple_agent_example/simple_agent_example.py (2 hunks)
  • plugins/aws/vision_agents/plugins/aws/aws_llm.py (1 hunks)
  • plugins/fish/vision_agents/plugins/fish/stt.py (2 hunks)
  • plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py (3 hunks)
  • plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (1 hunks)
  • plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1 hunks)
  • plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (10 hunks)
  • plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py (3 hunks)
  • plugins/wizper/vision_agents/plugins/wizper/stt.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • agents-core/vision_agents/core/tts/tts.py
  • plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py
  • examples/01_simple_agent_example/simple_agent_example.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/python.mdc)

**/*.py: Do not modify sys.path in Python code
Docstrings must follow the Google style guide

Files:

  • agents-core/vision_agents/core/stt/stt.py
  • plugins/aws/vision_agents/plugins/aws/aws_llm.py
  • plugins/openai/vision_agents/plugins/openai/openai_realtime.py
  • agents-core/vision_agents/core/turn_detection/turn_detection.py
  • plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py
  • plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py
  • plugins/wizper/vision_agents/plugins/wizper/stt.py
  • agents-core/vision_agents/core/llm/llm.py
  • plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py
  • plugins/fish/vision_agents/plugins/fish/stt.py
🧬 Code graph analysis (10)
agents-core/vision_agents/core/stt/stt.py (7)
agents-core/vision_agents/core/observability/metrics.py (1)
  • Timer (62-199)
agents-core/vision_agents/core/events/manager.py (1)
  • send (428-472)
agents-core/vision_agents/core/tts/tts.py (1)
  • send (202-302)
agents-core/vision_agents/core/stt/events.py (3)
  • STTTranscriptEvent (16-47)
  • STTPartialTranscriptEvent (51-78)
  • STTErrorEvent (82-94)
plugins/fish/vision_agents/plugins/fish/stt.py (1)
  • _process_audio (53-140)
plugins/wizper/vision_agents/plugins/wizper/stt.py (1)
  • _process_audio (62-139)
plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py (1)
  • _process_audio (78-114)
plugins/aws/vision_agents/plugins/aws/aws_llm.py (1)
agents-core/vision_agents/core/llm/llm.py (2)
  • _simple_response (98-108)
  • LLMResponseEvent (38-42)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (5)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2)
  • _simple_response (136-152)
  • simple_response (119-134)
agents-core/vision_agents/core/llm/llm.py (3)
  • _simple_response (98-108)
  • LLMResponseEvent (38-42)
  • simple_response (73-95)
agents-core/vision_agents/core/edge/sfu_events.py (11)
  • participant (1496-1501)
  • participant (1504-1507)
  • participant (1545-1550)
  • participant (1553-1556)
  • participant (1625-1630)
  • participant (1633-1636)
  • participant (2100-2105)
  • participant (2108-2111)
  • participant (2156-2161)
  • participant (2164-2167)
  • Participant (229-270)
agents-core/vision_agents/core/edge/types.py (1)
  • Participant (22-24)
agents-core/vision_agents/core/agents/agents.py (1)
  • simple_response (292-305)
agents-core/vision_agents/core/turn_detection/turn_detection.py (6)
agents-core/vision_agents/core/observability/metrics.py (1)
  • Timer (62-199)
agents-core/vision_agents/core/turn_detection/events.py (2)
  • TurnStartedEvent (11-25)
  • TurnEndedEvent (29-45)
agents-core/vision_agents/core/events/manager.py (1)
  • send (428-472)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (1)
  • detect_turn (159-171)
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (1)
  • detect_turn (182-193)
plugins/krisp/vision_agents/plugins/krisp/turn_detection.py (1)
  • detect_turn (78-119)
plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py (2)
plugins/aws/vision_agents/plugins/aws/aws_llm.py (1)
  • _simple_response (93-113)
agents-core/vision_agents/core/llm/llm.py (2)
  • _simple_response (98-108)
  • LLMResponseEvent (38-42)
plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py (1)
agents-core/vision_agents/core/observability/metrics.py (1)
  • Timer (62-199)
plugins/wizper/vision_agents/plugins/wizper/stt.py (2)
agents-core/vision_agents/core/stt/stt.py (1)
  • _process_audio (161-176)
agents-core/vision_agents/core/edge/types.py (1)
  • Participant (22-24)
agents-core/vision_agents/core/llm/llm.py (3)
agents-core/vision_agents/core/observability/metrics.py (1)
  • Timer (62-199)
agents-core/vision_agents/core/processors/base_processor.py (1)
  • Processor (35-43)
agents-core/vision_agents/core/edge/types.py (1)
  • Participant (22-24)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (3)
agents-core/vision_agents/core/observability/metrics.py (1)
  • Timer (62-199)
agents-core/vision_agents/core/turn_detection/turn_detection.py (1)
  • detect_turn (82-94)
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (5)
  • detect_turn (182-193)
  • _process_audio_packet (216-342)
  • predict_speech (518-546)
  • _predict_turn_completed (405-456)
  • SileroVAD (483-546)
plugins/fish/vision_agents/plugins/fish/stt.py (3)
agents-core/vision_agents/core/stt/stt.py (1)
  • _process_audio (161-176)
plugins/wizper/vision_agents/plugins/wizper/stt.py (1)
  • _process_audio (62-139)
plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py (1)
  • _process_audio (78-114)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: unit / Ruff & mypy
  • GitHub Check: unit / Test "not integration"
  • GitHub Check: unit / Test "not integration"
  • GitHub Check: unit / Ruff & mypy
🔇 Additional comments (13)
plugins/aws/vision_agents/plugins/aws/aws_llm.py (1)

93-113: LGTM!

The refactoring to _simple_response aligns with the framework's new metrics-wrapped pattern. The delegation to converse_stream is appropriate, and the return type annotation improves type safety.

plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py (3)

62-62: LGTM!

The provider_name attribute enables proper metrics tagging in the base class wrapper (line 88 of llm.py). The value correctly identifies the Gemini provider.


69-86: LGTM!

The _simple_response implementation correctly matches the base class signature and appropriately delegates to the provider-specific send_message method.


88-88: LGTM!

The method signature correction to async def aligns with its actual async implementation (uses await throughout). This improves type correctness without breaking existing callers.

agents-core/vision_agents/core/llm/llm.py (1)

97-108: LGTM!

The abstract _simple_response method establishes a clear contract for provider implementations. The signature is consistent across all providers reviewed (AWS, Gemini), and the decorator properly enforces implementation in subclasses.
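For readers following along, a compressed sketch of that contract. The signatures are simplified (the real ones also take processors and participant), but the wrapper shape matches what the review describes at lines 73-95:

```python
from abc import ABC, abstractmethod

from vision_agents.core.observability.metrics import Timer, llm_latency_ms

class LLM(ABC):
    provider_name: str = "unknown"

    async def simple_response(self, text: str):
        # Public entry point: wraps the provider hook with latency and error tracking.
        with Timer(llm_latency_ms) as timer:
            timer.attributes["provider"] = self.provider_name
            timer.attributes["llm_class"] = type(self).__name__
            try:
                return await self._simple_response(text)
            except Exception as e:
                timer.attributes["error"] = type(e).__name__
                raise

    @abstractmethod
    async def _simple_response(self, text: str):
        """Provider-specific implementation supplied by each plugin."""
```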

agents-core/vision_agents/core/turn_detection/turn_detection.py (2)

11-11: LGTM! Observability imports properly added.

The import of Timer and metrics objects supports the instrumentation pattern introduced in this PR.


67-79: Well-implemented instrumentation wrapper with error tracking.

The Timer context manager correctly records latency with class and provider attributes, and the exception handling properly increments error metrics before re-raising. This pattern effectively separates observability concerns from implementation logic.
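The distinguishing detail versus the LLM wrapper is the explicit error counter. A sketch of the shape (the histogram and counter names here are assumed, not copied from metrics.py):

```python
async def process_audio(self, audio_data, participant, conversation=None) -> None:
    with Timer(turn_detection_latency_ms) as timer:  # histogram name assumed
        timer.attributes["provider"] = self.provider_name
        try:
            await self.detect_turn(audio_data, participant, conversation)
        except Exception as e:
            timer.attributes["error"] = type(e).__name__
            turn_detection_errors.add(  # counter name assumed
                1,
                {"provider": self.provider_name, "error_type": type(e).__name__},
            )
            raise
```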

plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (6)

15-19: LGTM! Observability metrics imported.

The imports support the Timer-based instrumentation added to VAD and turn completion prediction paths.


159-171: Correctly renamed to conform to base class pattern.

The method now implements the abstract detect_turn hook introduced in the base TurnDetector class, with process_audio serving as the instrumentation wrapper.


178-209: Excellent lifecycle management with proper cleanup guarantees.

The structured try/except/finally ensures _processing_active is cleared when the queue empties and both flags are reset on shutdown, even during cancellation. This directly addresses the past review concern about lifecycle transitions hanging.
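A condensed sketch of that lifecycle shape (illustrative only; the real consumer loop carries more state):

```python
import asyncio

async def _process_queue(self) -> None:  # sketch of the shape, not the exact code
    try:
        while not self._shutdown_event.is_set():
            item = await self._queue.get()
            try:
                await self._handle(item)
            finally:
                self._queue.task_done()
                if self._queue.empty():
                    self._processing_active = False
    except asyncio.CancelledError:
        pass
    finally:
        # Reset both flags so the detector can be restarted cleanly.
        self._processing_active = False
        self._shutdown_event.clear()
```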


253-257: Timer instrumentation captures VAD metrics effectively.

Recording sample count and implementation type provides useful observability into the VAD prediction path.


295-304: Comprehensive turn completion instrumentation.

The Timer captures duration, sample count, trailing silence, prediction value, and the derived turn_ended boolean—excellent coverage for debugging and performance analysis.


432-450: Flexible model loading with optional pre-loaded bytes.

The model_bytes parameter enables future optimizations (e.g., loading models once and sharing across instances) while maintaining backward compatibility through the default None fallback to file-based loading.
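In miniature, the pattern looks like this (a sketch; the real loader is ONNX/Silero-specific):

```python
from pathlib import Path
from typing import Optional

def load_model_bytes(path: str, model_bytes: Optional[bytes] = None) -> bytes:
    # Prefer pre-loaded bytes (e.g., shared across instances); fall back to disk.
    if model_bytes is not None:
        return model_bytes
    return Path(path).read_bytes()
```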

Comment on lines +147 to +153
timer.attributes["sample_rate"] = pcm_data.sample_rate
timer.attributes["channels"] = pcm_data.channels
timer.attributes["samples"] = (
len(pcm_data.samples) if pcm_data.samples is not None else 0
)
timer.attributes["duration_ms"] = pcm_data.duration_ms


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Confirm PcmData exposes duration_ms.

getstream.video.rtc.track_util.PcmData has historically exposed .duration, not .duration_ms. If duration_ms is missing, this attribute access will raise and the metrics wrapper will crash before the plugin runs. Please verify the attribute exists (or derive the duration in milliseconds defensively).


🏁 Script executed:

#!/bin/bash
# Inspect PcmData for a duration_ms attribute.
rg -n "duration_ms" -g 'track_util.py' -C3

Length of output: 148


I need to search more broadly for the PcmData class definition. Let me try a different approach:


---

🏁 Script executed:

```shell
#!/bin/bash
# Find track_util.py files in the repository
fd -t f "track_util.py"
```

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Search for PcmData class definition
rg -n "class PcmData" -A 20 -B 2

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Search for duration_ms usage in the stt.py file
rg -n "duration_ms" agents-core/vision_agents/core/stt/stt.py -C 5

Length of output: 552


🏁 Script executed:

#!/bin/bash
# Search for PcmData attribute access patterns
rg -n "pcm_data\." agents-core/vision_agents/core/stt/stt.py -C 2

Length of output: 463


🏁 Script executed:

#!/bin/bash
# Look for how duration is referenced elsewhere with pcm_data
rg -n "pcm_data.*duration" -C 3

Length of output: 1655


🏁 Script executed:

#!/bin/bash
# Search for PcmData class or dataclass definition anywhere
rg -n "PcmData" -B 2 -A 10 --type py | head -60

Length of output: 3286


🏁 Script executed:

#!/bin/bash
# Check imports in the stt.py file to understand where PcmData comes from
rg -n "import|from" agents-core/vision_agents/core/stt/stt.py | head -30

Length of output: 451


🏁 Script executed:

#!/bin/bash
# Search for @property definitions related to duration
rg -n "@property" -A 3 | grep -A 3 "duration"

Length of output: 804


🏁 Script executed:

#!/bin/bash
# Look for duration conversion or duration_ms definition
rg -n "duration_ms" -B 5 -A 2 --type py

Length of output: 41672


🏁 Script executed:

#!/bin/bash
# Search for any @property or method definitions that might create duration_ms
rg -n "def duration" --type py -B 2 -A 5

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Search for any wrapper classes or extensions of PcmData in the codebase
rg -n "class.*PcmData|extends.*PcmData" --type py -B 2 -A 10

Length of output: 770


🏁 Script executed:

#!/bin/bash
# Check if there's a getstream library reference or version specification
fd -t f "requirements*.txt" -o "pyproject.toml" -o "setup.py" -o "poetry.lock" | head -10 | xargs grep -l "getstream" 2>/dev/null

Length of output: 147


🏁 Script executed:

#!/bin/bash
# Look at imports and see if PcmData is used elsewhere or extended
rg -n "from getstream" --type py

Length of output: 10670


🏁 Script executed:

#!/bin/bash
# Find dependency files to check getstream version
find . -maxdepth 3 -type f \( -name "pyproject.toml" -o -name "requirements*.txt" -o -name "setup.py" \) | head -20

Length of output: 701


🏁 Script executed:

#!/bin/bash
# Search for any dynamic property assignment or getattr usage with duration
rg -n "duration_ms|setattr|__getattr__" --type py -B 3 -A 3 | grep -A 5 -B 5 "PcmData"

Length of output: 3250


🏁 Script executed:

#!/bin/bash
# Look at the actual test to understand what PcmData.from_bytes returns
rg -n "from_bytes" agents-core/ --type py -A 10

Length of output: 966


🏁 Script executed:

#!/bin/bash
# Check the getstream dependency version in agents-core
cat ./agents-core/pyproject.toml | grep -A 20 "dependencies"

Length of output: 1241


🏁 Script executed:

#!/bin/bash
# Also check if there's a poetry.lock or requirements file that pins the version
cat ./agents-core/pyproject.toml | grep -i "getstream"

Length of output: 332


🌐 Web query:

getstream PcmData duration_ms attribute python

💡 Result:

GetStream's PcmData (Python) exposes duration in seconds as .duration — there isn't a documented .duration_ms attribute. Compute milliseconds either from .duration or from samples/sample_rate:

  • Using .duration: duration_ms = pcm.duration * 1000. (getstream.io)

  • From samples: duration_ms = (len(pcm.samples) / pcm.sample_rate) * 1000 (adjust for channels/bytes-per-sample if needed). (getstream.io)

If you want, tell me which PcmData fields you see and I’ll give the exact one-line code.



pcm_data.duration_ms does not exist on GetStream's PcmData class; use pcm_data.duration * 1000 instead.

GetStream's PcmData exposes duration in seconds as .duration, not .duration_ms. Line 152 will crash with AttributeError when the metrics wrapper accesses this attribute. Compute milliseconds defensively:

timer.attributes["duration_ms"] = pcm_data.duration * 1000 if pcm_data.duration is not None else 0

This same pattern appears elsewhere in the codebase (e.g., agents-core/vision_agents/core/vad/vad.py, agents-core/vision_agents/core/tts/tts.py, plugins/), all accessing the non-existent .duration_ms attribute.
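One way to centralize the fix is a small defensive helper (a sketch; attribute availability should be verified against the installed getstream version):

```python
def pcm_duration_ms(pcm) -> float:
    duration_s = getattr(pcm, "duration", None)  # documented attribute, in seconds
    if duration_s is not None:
        return duration_s * 1000
    # Fall back to deriving the duration from the raw samples.
    return len(pcm.samples) / pcm.sample_rate * 1000
```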

🤖 Prompt for AI Agents
In agents-core/vision_agents/core/stt/stt.py around lines 147 to 153, the code
reads a non-existent pcm_data.duration_ms attribute, which will raise
AttributeError. Replace it with a defensive computation using pcm_data.duration
* 1000 when duration is present (otherwise 0), update
timer.attributes["duration_ms"] accordingly, and apply the same defensive
pattern wherever else duration_ms is used in the repo.

Comment on lines +129 to +145
async def _simple_response(
self,
text: str,
processors: Optional[List[Processor]] = None,
participant: Optional[Participant] = None,
):
"""
Internal simple response implementation required by LLM base class.
Note: OpenAI Realtime is event-driven and doesn't return responses directly.
This implementation sends the text via the public simple_response method.
"""
from vision_agents.core.llm.llm import LLMResponseEvent

await self.simple_response(text, processors, participant)
# Return empty LLMResponseEvent since Realtime API is event-driven
return LLMResponseEvent(original=None, text="")

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Implementation will be correct once the public override is removed.

After removing the public simple_response override as suggested above, this internal implementation will properly integrate with the base class metrics wrapper. The pattern of returning an empty LLMResponseEvent correctly reflects the event-driven nature of the Realtime API.

Note: The same bypass issue exists in the Gemini plugin (plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py, lines 118-151). Please verify and apply the same fix across all Realtime implementations to ensure consistent metrics coverage.


🏁 Script executed:

#!/bin/bash
# Description: Find all Realtime plugin implementations that override simple_response
# to ensure metrics bypass is fixed consistently across providers.

# Search for classes that override simple_response (public method)
ast-grep --pattern $'class $_(realtime.Realtime):
  $$$
  async def simple_response($$$):
    $$$
  $$$'

Length of output: 196547


Remove public simple_response override to ensure metrics coverage across all Realtime implementations.

The _simple_response implementation is correctly structured. However, the public simple_response override (lines 108-127) must still be removed so the base class metrics wrapper can invoke _simple_response and track every call.

Verified: This same pattern exists across all Realtime plugin implementations:

  • Gemini Realtime has an identical public override (lines 119-134) that should be removed
  • AWS Realtime also has a public override (lines 271-286) and is missing a _simple_response implementation entirely

All three providers need the same structural fix to ensure consistent metrics coverage.

🤖 Prompt for AI Agents
In plugins/openai/vision_agents/plugins/openai/openai_realtime.py around lines
129-145 (and remove the public simple_response override at lines ~108-127),
delete the public simple_response override so the base class metrics wrapper can
call the protected _simple_response; keep the existing async _simple_response
implementation as-is (which sends via simple_response internally) and ensure no
public override shadows the base implementation so metrics are recorded
properly.
