
feat(ollama): add meter STTG to ollama instrumentation#3053

Merged
nirga merged 23 commits into traceloop:main from minimAluminiumalism:main
Jun 28, 2025

Conversation


@minimAluminiumalism minimAluminiumalism commented Jun 28, 2025

  • I have added tests that cover my changes.
  • If adding a new instrumentation or changing an existing one, I've added screenshots from some observability platform showing the change.
  • PR name follows conventional commits format: feat(instrumentation): ... or fix(instrumentation): ....
  • (If applicable) I have updated the documentation accordingly.

fixes #3048


Important

Adds streaming_time_to_generate metric to Ollama instrumentation for measuring time from first token to completion in streaming responses, with corresponding tests and updates.

  • Behavior:
    • Adds streaming_time_to_generate metric to measure time from first token to completion in streaming responses in __init__.py.
    • Updates _accumulate_streaming_response() and _aaccumulate_streaming_response() to record streaming_time_to_generate.
    • Modifies _wrap() and _awrap() to handle new metric.
  • Tests:
    • Adds test_ollama_streaming_time_to_generate_metrics() in test_ollama_metrics.py to verify new metric.
    • Includes VCR cassette test_ollama_streaming_time_to_generate_metrics.yaml for HTTP interaction recording.
  • Misc:
    • Updates Meters.LLM_STREAMING_TIME_TO_GENERATE in semconv_ai/__init__.py to reflect new metric naming.

This description was created by Ellipsis for e96286b.
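The two streaming metrics described above can be sketched in isolation. This is a minimal stand-in, not the instrumentation's real code: `StubHistogram` and the `"gen_ai.system"` attribute key are hypothetical placeholders for the OpenTelemetry histogram and `SpanAttributes.LLM_SYSTEM`.

```python
import time

class StubHistogram:
    """Hypothetical stand-in for an OpenTelemetry histogram instrument."""
    def __init__(self):
        self.records = []

    def record(self, value, attributes=None):
        self.records.append((value, attributes or {}))

def consume_stream(chunks, time_to_first_token, time_to_generate):
    """Mirror the accumulation logic: time-to-first-token is measured from
    the request start; time-to-generate from the first token to the last."""
    start_time = time.perf_counter()
    first_token_time = None
    accumulated = ""
    for chunk in chunks:
        if first_token_time is None:
            first_token_time = time.perf_counter()
            time_to_first_token.record(
                first_token_time - start_time,
                attributes={"gen_ai.system": "Ollama"},
            )
        accumulated += chunk
    # Record only after the stream is fully consumed
    if first_token_time is not None:
        time_to_generate.record(
            time.perf_counter() - first_token_time,
            attributes={"gen_ai.system": "Ollama"},
        )
    return accumulated
```

A one-chunk stream records both histograms exactly once; an empty stream records neither, which matches the "first token to completion" semantics of the new metric.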

minimAluminiumalism and others added 21 commits April 17, 2025 16:24

@ellipsis-dev ellipsis-dev bot left a comment


Caution

Changes requested ❌

Reviewed everything up to e96286b in 1 minute and 15 seconds.
  • Reviewed 395 lines of code in 4 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
1. packages/opentelemetry-instrumentation-ollama/opentelemetry/instrumentation/ollama/__init__.py:291
  • Draft comment:
    Consider capturing the final response explicitly instead of relying on the if 'res' in locals() check for obtaining the model attribute. This would improve clarity and maintainability.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable (usefulness confidence = 0% vs. threshold = 50%). The comment identifies a real code clarity issue: using locals() to check variable existence is hacky. However, the suggested fix is wrong. Simply removing the check would cause errors if the loop never executes, since 'res' would not be defined; the code needs the safety check, and the current implementation, while not ideal, is functionally correct. There may be better ways to handle this edge case than using locals(), but the comment should be deleted because its suggested fix would introduce bugs even though it correctly identifies a clarity issue.
2. packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py:11
  • Draft comment:
    Verify that the updated metric name 'llm.chat_completions.streaming_time_to_generate' aligns with the intended semantic conventions for Ollama instrumentation. The prefix change from 'llm.openai...' may be intentional but warrants confirmation.
  • Reason this comment was not posted:
    Comment did not seem useful (confidence useful = 0% <= threshold 50%). The comment asks the PR author to confirm their intention regarding the metric name change, which violates the rule against asking for confirmation of intention, and it does not provide a specific code suggestion or ask for a specific test to be written.

Workflow ID: wflow_toUpoPR3hoAbh7XH



@cursor cursor bot left a comment


Bug: Metric Dependency Issue in Streaming Code

The streaming_time_to_generate metric is not recorded when streaming_time_to_first_token is disabled. This occurs because first_token_time, which is required for streaming_time_to_generate, is only set if streaming_time_to_first_token is enabled and start_time is not None. This creates an unintended dependency between these two independent streaming metrics, affecting both synchronous and asynchronous code paths. first_token_time should be set if either metric is enabled.

packages/opentelemetry-instrumentation-ollama/opentelemetry/instrumentation/ollama/__init__.py#L277-L303

if first_token and streaming_time_to_first_token and start_time is not None:
    first_token_time = time.perf_counter()
    streaming_time_to_first_token.record(
        first_token_time - start_time,
        attributes={SpanAttributes.LLM_SYSTEM: "Ollama"},
    )
    first_token = False
yield res
if llm_request_type == LLMRequestTypeValues.CHAT:
    accumulated_response["message"]["content"] += res["message"]["content"]
    accumulated_response["message"]["role"] = res["message"]["role"]
elif llm_request_type == LLMRequestTypeValues.COMPLETION:
    text = res.get("response", "")
    accumulated_response["response"] += text
# Record streaming time to generate after the response is complete
if streaming_time_to_generate and first_token_time is not None:
    model_name = last_response.get("model") if last_response else None
    streaming_time_to_generate.record(
        time.perf_counter() - first_token_time,
        attributes={
            SpanAttributes.LLM_SYSTEM: "Ollama",
            SpanAttributes.LLM_RESPONSE_MODEL: model_name,
        },
    )

packages/opentelemetry-instrumentation-ollama/opentelemetry/instrumentation/ollama/__init__.py#L335-L342

if first_token and streaming_time_to_first_token and start_time is not None:
    first_token_time = time.perf_counter()
    streaming_time_to_first_token.record(
        first_token_time - start_time,
        attributes={SpanAttributes.LLM_SYSTEM: "Ollama"},
    )
    first_token = False
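The fix the report proposes (capture first_token_time when either metric is enabled) could look like the following sketch. The helper name `handle_first_token` and the `"gen_ai.system"` attribute key are hypothetical; the real code inlines this logic in the accumulation generators.

```python
import time

def handle_first_token(first_token, start_time,
                       streaming_time_to_first_token,
                       streaming_time_to_generate):
    """Process the first streamed chunk; return (first_token_time, first_token).

    Unlike the merged code, first_token_time is captured when *either* metric
    is enabled, so streaming_time_to_generate no longer silently depends on
    streaming_time_to_first_token being enabled.
    """
    first_token_time = None
    if first_token and start_time is not None and (
        streaming_time_to_first_token or streaming_time_to_generate
    ):
        first_token_time = time.perf_counter()
        if streaming_time_to_first_token:
            streaming_time_to_first_token.record(
                first_token_time - start_time,
                attributes={"gen_ai.system": "Ollama"},
            )
        first_token = False
    return first_token_time, first_token
```

With this shape, disabling the time-to-first-token histogram (passing None) still yields a usable first_token_time for the time-to-generate measurement.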


Bug: Empty Response Data Not Processed

The if response_data: condition prevents _set_response_attributes from being called when response_data is an empty dictionary {}. This affects both synchronous and asynchronous paths. Since empty dictionaries are falsy, valid empty responses are not processed, resulting in missing span attributes. The condition should be if response_data is not None:.

packages/opentelemetry-instrumentation-ollama/opentelemetry/instrumentation/ollama/__init__.py#L309-L311

)
if response_data:
    _set_response_attributes(span, token_histogram, llm_request_type, response_data | accumulated_response)

packages/opentelemetry-instrumentation-ollama/opentelemetry/instrumentation/ollama/__init__.py#L367-L369

)
if response_data:
    _set_response_attributes(span, token_histogram, llm_request_type, response_data | accumulated_response)
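The truthiness bug and the suggested fix can be demonstrated directly. The predicate name `should_set_response_attributes` and the `"llama3"` value are hypothetical; the `|` operator is Python 3.9+ dict union, where the right operand wins on key collisions.

```python
def should_set_response_attributes(response_data):
    """`if response_data:` is falsy for {}, so a valid but empty response
    dict is skipped; the report's fix is to test for None explicitly."""
    return response_data is not None

# dict union as used in the call above: later (right-hand) values win,
# so accumulated_response overrides response_data on shared keys
merged = {"model": "llama3", "done": False} | {"done": True}
```

Under the original `if response_data:` check, `{}` and `None` are treated identically; the explicit `is not None` check distinguishes "empty response" from "no response".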



@nirga nirga merged commit 622d1e4 into traceloop:main Jun 28, 2025
10 checks passed
amitalokbera pushed a commit to amitalokbera/openllmetry that referenced this pull request Jul 15, 2025
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>
nina-kollman pushed a commit that referenced this pull request Aug 11, 2025
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>


Development

Successfully merging this pull request may close these issues.

🚀 Feature: Add LLM_STREAMING_TIME_TO_GENERATE to ollama instrumentation

2 participants