
Conversation

Contributor
@SrdjanLL SrdjanLL commented Apr 21, 2025

What does this pull request do?

  • Resolves the issue of missing openai instrumentation spans in our "dogfood" chatbot-rag-app.
    • Solution inspired by the amazing deep dive from @xrmx, with some minor additions to resolve the issue for async calls too.
  • Updates StreamWrapper to close the span only once, either when the context manager exits or when the end of the stream is reached.
    • The same logic is applied for both sync and async.

Testing

  1. After reproducing the issue by following these steps, I applied the patch mentioned in the same link.
  2. Confirmed the patch resolves the issue.
  3. Wrote unit tests for the chat_stream* methods for cases when the context manager is used, and confirmed that without the patch these tests fail and emit WARNING opentelemetry.sdk.trace:__init__.py:943 Calling end() on an ended span.
  4. Applied the patch and expanded it to cover the async scenarios too.
  5. Confirmed the unit tests pass.
[screenshot: unit tests passing]

Related issues

Next steps

  1. Release 1.1.0 of openai instrumentation after the PR is merged.
  2. Update our chatbot-rag-app to use the newest version.

self.choices = []
self.usage = None
self.service_tier = None
self.in_context_manager = False
Contributor

How about adding self.ended instead and guarding both calls with it? I think it's better to end when the stream finishes when possible, and only use the context-manager hook as a fallback, since the context manager could potentially be left open for a very long time, even for unrelated processing, depending on user code.

Contributor Author

Thanks for the suggestion - this was a good catch and a helpful pointer while I'm ramping up. And yeah, it seems better as it avoids creating misleading spans. I'll update this and move the guarding into def end(...), given it's called in many places, as you surfaced elsewhere.

self.process_chunk(chunk)
yield chunk
except Exception as exc:
self.end(exc)
Contributor

I think we need to guard this similarly

self.process_chunk(chunk)
yield chunk
except Exception as exc:
self.end(exc)
Contributor

Ditto

except Exception as exc:
self.end(exc)
raise
self.end()
Contributor

@anuraaga anuraaga commented Apr 21, 2025

To confirm: without the changes, this doesn't get called in the new test cases? That's unexpected to me, since they seem to consume the stream in a for loop, so I'm wondering if we know why it would be a problem.

A case that does intuitively seem to require this change is one that breaks during iteration

with response:
  for chunk in response:
    first = chunk
    break

Contributor

Also if that is a fail case, we may need to consume the stream in the exit methods to get the stop / token info.
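As a sketch of that suggestion (hypothetical names, not the instrumentation's actual code), __exit__ could drain whatever the caller didn't consume, so the final chunk's stop/token info still flows through the processing hook even when iteration breaks early:

```python
class DrainingStreamWrapper:
    """Sketch only: drain un-consumed chunks on context-manager exit so the
    end-of-stream stop/token info still passes through the processing hook."""

    def __init__(self, stream):
        self.__wrapped__ = stream
        self.chunks_seen = 0  # stand-in for real chunk processing state

    def __iter__(self):
        for chunk in self.__wrapped__:
            self.chunks_seen += 1  # stand-in for process_chunk(chunk)
            yield chunk

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            # Consume the remainder so the last chunk (stop reason / usage)
            # is still observed even if the caller broke out of the loop.
            for _ in self:
                pass
        return False
```

Note the trade-off: draining forces the full response to be fetched on exit, which may not be what the user intended when they broke out of iteration.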

Contributor Author

@SrdjanLL SrdjanLL commented Apr 22, 2025

Thanks for probing into this!

TL;DR While digging further I think I found the most likely root cause of the missing spans in our chatbot-rag-app, and it's not multiple attempts to end a span, but the default ObjectProxy behaviour when the proxied object is used as a context manager. The fix inspired by the initial patch and refactor still applies, but it sort of solved the problem by chance.

Sorry for the verbose root-causing below, but I had to leave this somewhere (led by Riccardo's example 😄)


All right, so I dug a bit further into it and found that in the unit test this does get called (when not guarded): the unit tests capture the regression in terms of the span count, while we also get the warning for trying to close an already ended span. While a good regression to catch, it's not what I set out to do with this change, so thanks for drilling into it.

Also if that is a fail case, we may need to consume the stream in the exit methods to get the stop / token info.

That's an interesting one, and it raises the question of whether we'd want to capture the tokens consumed before the exit, the token info the application chooses to deal with, or the token information from the full stream response. That seems like a weird edge case, though, and may not be the right thing to resolve as part of this PR.


Back to the original issue. Thanks to your comment I got to the point in debugging where I found that __iter__ doesn't get invoked unless __enter__ and __exit__ methods are added to StreamWrapper. Counterintuitively, there are no obvious clues on our end as to why these two different concepts have a dependency.

My assumption is that somewhere downstream (langchain/openai) the proxied object is used as a context manager, and without enter/exit overrides on the StreamWrapper proxy, the proxied object's own behaviour is used, which does not result in generator invocation on the proxy. The closest thing I was able to find related to this for wrapt.ObjectProxy can be found here; it concerns proxying aiohttp.request, which uses an (async) context manager and seems to be used by openai/langchain as well. I found relevant code in BaseChatOpenAI that seems the most likely cause in our use case. Implementing context-manager enter/exit on our StreamWrapper doesn't affect the behaviour of the instrumented code, but it solves the problem of generator invocation on the proxy.
I then added a check on the span count that fails with 0 spans when the context manager is not implemented, which should catch regressions on this in the future.

Seems like the initial patch from @xrmx fixed the span-closure warnings and also resolved this issue, but it's good to know that what I explained above was causing the missing traces, not the re-attempted span closure.

Either way, I will proceed with your suggestion to add an ended flag and guard span ending in a single place instead, as that seems to cover both the warning about ending an already ended span and this weird regression.
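The delegation behaviour described above can be illustrated without wrapt at all — DelegatingProxy below mimics an ObjectProxy-style __enter__ that hands back the wrapped stream, while FixedProxy returns the proxy itself (all names are illustrative, not the instrumentation's actual classes):

```python
class RawStream:
    """Stand-in for an SDK streaming response that is also a context manager."""

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        return False

    def __iter__(self):
        yield from ("a", "b")


class DelegatingProxy:
    """Mimics default context-manager delegation: __enter__ returns the
    *wrapped* stream, so 'with proxy as s' bypasses the proxy's __iter__."""

    def __init__(self, wrapped):
        self.__wrapped__ = wrapped
        self.chunks_processed = 0

    def __iter__(self):
        for chunk in self.__wrapped__:
            self.chunks_processed += 1  # instrumentation hook would run here
            yield chunk

    def __enter__(self):
        return self.__wrapped__.__enter__()  # returns the raw stream!

    def __exit__(self, *exc):
        return self.__wrapped__.__exit__(*exc)


class FixedProxy(DelegatingProxy):
    def __enter__(self):
        self.__wrapped__.__enter__()
        return self  # return the proxy, so iteration goes through __iter__
```

With DelegatingProxy, `with proxy as s: list(s)` iterates the raw stream and the instrumentation hook never fires; with FixedProxy it does.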

@SrdjanLL SrdjanLL requested a review from anuraaga April 22, 2025 20:42
@codefromthecrypt
Contributor

@SrdjanLL want to verify this fix end-to-end?

here https://github.com/elastic/elasticsearch-labs/blob/main/example-apps/chatbot-rag-app/requirements.txt#L536

I think you can substitute with this

elastic-opentelemetry-instrumentation-openai @ git+https://github.com/SrdjanLL/elastic-otel-python-instrumentations.git@bugfix/span-closure#subdirectory=instrumentation/openai

then run chatbot-rag-app with python, or add git temporarily to its Dockerfile so it can resolve that requirement like this:

RUN apt-get update \
    && apt-get install -y --no-install-recommends git \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

@SrdjanLL
Contributor Author

@codefromthecrypt - thanks for the suggestion!
I've already made myself a nice feedback loop by using an editable installation (pip install -e) of my local version of instrumentation/openai and adding it to the virtualenv I use to run chatbot-rag-app with Python. This way I was able to confirm the traces are complete and the Gen AI dashboard has data.

Do you think there's anything missing in that feedback loop?

I will follow your suggestion to make sure it works as well (and as part of setting up my feedback loops).

Contributor

@anuraaga anuraaga left a comment

Code lgtm, thanks!

@codefromthecrypt
Contributor

@SrdjanLL Never mind me, as your approach works fine. Just sometimes I've had my eyes crossed on my venvs and want something Docker can prove is repeatable.

@SrdjanLL SrdjanLL merged commit bee16d4 into elastic:main Apr 23, 2025
14 checks passed