
feat(ollama): add meter STTG to ollama instrumentation#3053

Merged
nirga merged 23 commits into traceloop:main from minimAluminiumalism:main
Jun 28, 2025

Conversation


@minimAluminiumalism minimAluminiumalism commented Jun 28, 2025

  • I have added tests that cover my changes.
  • If adding a new instrumentation or changing an existing one, I've added screenshots from some observability platform showing the change.
  • PR name follows conventional commits format: feat(instrumentation): ... or fix(instrumentation): ....
  • (If applicable) I have updated the documentation accordingly.

fixes #3048


Important

Adds streaming_time_to_generate metric to Ollama instrumentation for measuring time from first token to completion in streaming responses, with corresponding tests and updates.

  • Behavior:
    • Adds streaming_time_to_generate metric to measure time from first token to completion in streaming responses in __init__.py.
    • Updates _accumulate_streaming_response() and _aaccumulate_streaming_response() to record streaming_time_to_generate.
    • Modifies _wrap() and _awrap() to handle new metric.
  • Tests:
    • Adds test_ollama_streaming_time_to_generate_metrics() in test_ollama_metrics.py to verify new metric.
    • Includes VCR cassette test_ollama_streaming_time_to_generate_metrics.yaml for HTTP interaction recording.
  • Misc:
    • Updates Meters.LLM_STREAMING_TIME_TO_GENERATE in semconv_ai/__init__.py to reflect new metric naming.

This description was created by Ellipsis for e96286b.
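The two streaming metrics described above can be sketched in isolation. This is a minimal stand-in, not the instrumentation's real code: `StubHistogram` and the `"gen_ai.system"` attribute key are hypothetical placeholders for the OpenTelemetry histogram and `SpanAttributes.LLM_SYSTEM`.

```python
import time

class StubHistogram:
    """Hypothetical stand-in for an OpenTelemetry histogram instrument."""
    def __init__(self):
        self.records = []

    def record(self, value, attributes=None):
        self.records.append((value, attributes or {}))

def consume_stream(chunks, time_to_first_token, time_to_generate):
    """Mirror the accumulation logic: time-to-first-token is measured from
    the request start; time-to-generate from the first token to the last."""
    start_time = time.perf_counter()
    first_token_time = None
    accumulated = ""
    for chunk in chunks:
        if first_token_time is None:
            first_token_time = time.perf_counter()
            time_to_first_token.record(
                first_token_time - start_time,
                attributes={"gen_ai.system": "Ollama"},
            )
        accumulated += chunk
    # Record only after the stream is fully consumed
    if first_token_time is not None:
        time_to_generate.record(
            time.perf_counter() - first_token_time,
            attributes={"gen_ai.system": "Ollama"},
        )
    return accumulated
```

A one-chunk stream records both histograms exactly once; an empty stream records neither, which matches the "first token to completion" semantics of the new metric.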

minimAluminiumalism and others added 21 commits April 17, 2025 16:24

@ellipsis-dev ellipsis-dev bot left a comment


Caution

Changes requested ❌

Reviewed everything up to e96286b in 1 minute and 15 seconds.
  • Reviewed 395 lines of code in 4 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
1. packages/opentelemetry-instrumentation-ollama/opentelemetry/instrumentation/ollama/__init__.py:291
  • Draft comment:
    Consider capturing the final response explicitly instead of relying on the if 'res' in locals() check for obtaining the model attribute. This would improve clarity and maintainability.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable (usefulness confidence = 0% vs. threshold = 50%). The comment identifies a real code clarity issue: using locals() to check variable existence is hacky. However, the suggested fix is wrong. Simply removing the check would cause errors if the loop never executes, since 'res' would not be defined; the code needs the safety check, and the current implementation, while not ideal, is functionally correct. There may be better ways to handle this edge case than using locals(), but the comment should be deleted because its suggested fix would introduce bugs even though it correctly identifies a clarity issue.
2. packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py:11
  • Draft comment:
    Verify that the updated metric name 'llm.chat_completions.streaming_time_to_generate' aligns with the intended semantic conventions for Ollama instrumentation. The prefix change from 'llm.openai...' may be intentional but warrants confirmation.
  • Reason this comment was not posted:
    Comment did not seem useful (confidence useful = 0% <= threshold 50%). The comment asks the PR author to confirm their intention regarding the metric name change, which violates the rule against asking for confirmation of intention, and it does not provide a specific code suggestion or ask for a specific test to be written.

Workflow ID: wflow_toUpoPR3hoAbh7XH



@cursor cursor bot left a comment


Bug: Metric Dependency Issue in Streaming Code

The streaming_time_to_generate metric is not recorded when streaming_time_to_first_token is disabled. This occurs because first_token_time, which is required for streaming_time_to_generate, is only set if streaming_time_to_first_token is enabled and start_time is not None. This creates an unintended dependency between these two independent streaming metrics, affecting both synchronous and asynchronous code paths. first_token_time should be set if either metric is enabled.

packages/opentelemetry-instrumentation-ollama/opentelemetry/instrumentation/ollama/__init__.py#L277-L303

if first_token and streaming_time_to_first_token and start_time is not None:
    first_token_time = time.perf_counter()
    streaming_time_to_first_token.record(
        first_token_time - start_time,
        attributes={SpanAttributes.LLM_SYSTEM: "Ollama"},
    )
    first_token = False
yield res
if llm_request_type == LLMRequestTypeValues.CHAT:
    accumulated_response["message"]["content"] += res["message"]["content"]
    accumulated_response["message"]["role"] = res["message"]["role"]
elif llm_request_type == LLMRequestTypeValues.COMPLETION:
    text = res.get("response", "")
    accumulated_response["response"] += text
# Record streaming time to generate after the response is complete
if streaming_time_to_generate and first_token_time is not None:
    model_name = last_response.get("model") if last_response else None
    streaming_time_to_generate.record(
        time.perf_counter() - first_token_time,
        attributes={
            SpanAttributes.LLM_SYSTEM: "Ollama",
            SpanAttributes.LLM_RESPONSE_MODEL: model_name,
        },
    )

packages/opentelemetry-instrumentation-ollama/opentelemetry/instrumentation/ollama/__init__.py#L335-L342

if first_token and streaming_time_to_first_token and start_time is not None:
    first_token_time = time.perf_counter()
    streaming_time_to_first_token.record(
        first_token_time - start_time,
        attributes={SpanAttributes.LLM_SYSTEM: "Ollama"},
    )
    first_token = False
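The fix the report proposes (capture first_token_time when either metric is enabled) could look like the following sketch. The helper name `handle_first_token` and the `"gen_ai.system"` attribute key are hypothetical; the real code inlines this logic in the accumulation generators.

```python
import time

def handle_first_token(first_token, start_time,
                       streaming_time_to_first_token,
                       streaming_time_to_generate):
    """Process the first streamed chunk; return (first_token_time, first_token).

    Unlike the merged code, first_token_time is captured when *either* metric
    is enabled, so streaming_time_to_generate no longer silently depends on
    streaming_time_to_first_token being enabled.
    """
    first_token_time = None
    if first_token and start_time is not None and (
        streaming_time_to_first_token or streaming_time_to_generate
    ):
        first_token_time = time.perf_counter()
        if streaming_time_to_first_token:
            streaming_time_to_first_token.record(
                first_token_time - start_time,
                attributes={"gen_ai.system": "Ollama"},
            )
        first_token = False
    return first_token_time, first_token
```

With this shape, disabling the time-to-first-token histogram (passing None) still yields a usable first_token_time for the time-to-generate measurement.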


Bug: Empty Response Data Not Processed

The if response_data: condition prevents _set_response_attributes from being called when response_data is an empty dictionary {}. This affects both synchronous and asynchronous paths. Since empty dictionaries are falsy, valid empty responses are not processed, resulting in missing span attributes. The condition should be if response_data is not None:.

packages/opentelemetry-instrumentation-ollama/opentelemetry/instrumentation/ollama/__init__.py#L309-L311

)
if response_data:
    _set_response_attributes(span, token_histogram, llm_request_type, response_data | accumulated_response)

packages/opentelemetry-instrumentation-ollama/opentelemetry/instrumentation/ollama/__init__.py#L367-L369

)
if response_data:
    _set_response_attributes(span, token_histogram, llm_request_type, response_data | accumulated_response)
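The truthiness bug and the suggested fix can be demonstrated directly. The predicate name `should_set_response_attributes` and the `"llama3"` value are hypothetical; the `|` operator is Python 3.9+ dict union, where the right operand wins on key collisions.

```python
def should_set_response_attributes(response_data):
    """`if response_data:` is falsy for {}, so a valid but empty response
    dict is skipped; the report's fix is to test for None explicitly."""
    return response_data is not None

# dict union as used in the call above: later (right-hand) values win,
# so accumulated_response overrides response_data on shared keys
merged = {"model": "llama3", "done": False} | {"done": True}
```

Under the original `if response_data:` check, `{}` and `None` are treated identically; the explicit `is not None` check distinguishes "empty response" from "no response".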



@nirga nirga merged commit 622d1e4 into traceloop:main Jun 28, 2025
10 checks passed
amitalokbera pushed a commit to amitalokbera/openllmetry that referenced this pull request Jul 15, 2025
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>
nina-kollman pushed a commit that referenced this pull request Aug 11, 2025
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>


Development

Successfully merging this pull request may close these issues.

🚀 Feature: Add LLM_STREAMING_TIME_TO_GENERATE to ollama instrumentation

2 participants