openai: Fix span closure when using context manager #80
Conversation
…med completion uses context manager or not
```python
self.choices = []
self.usage = None
self.service_tier = None
self.in_context_manager = False
```
How about adding `self.ended` instead of this, and guarding both calls with it? I think it's better to end the span when the stream finishes where possible and only use the context manager hook as a fallback, since the context manager could potentially be left open for a very long time, even for unrelated processing, depending on user code.
Thanks for the suggestion - this was a good catch and a helpful pointer while ramping up. And yes, it seems better since it avoids creating misleading spans. I'll update this and move the guarding into `end(...)`, given it's called in many places, as you surfaced elsewhere.
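A minimal sketch of that guard, assuming a simplified stand-in for `StreamWrapper` (names and the `end_calls` counter are illustrative, not the actual implementation):

```python
# Illustrative sketch only: an `ended` flag makes end() idempotent, so the
# end-of-stream path and the context-manager exit path can both call it safely.
class StreamWrapper:
    def __init__(self):
        self.ended = False
        self.end_calls = 0  # counts *effective* ends, for demonstration only

    def end(self, exc=None):
        if self.ended:      # guard: any later call becomes a no-op
            return
        self.ended = True
        self.end_calls += 1
        # a real implementation would record `exc` and end the span here

wrapper = StreamWrapper()
wrapper.end()  # ends the span
wrapper.end()  # no-op: the span was already ended
```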
```python
    self.process_chunk(chunk)
    yield chunk
except Exception as exc:
    self.end(exc)
```
I think we need to guard this similarly
```python
    self.process_chunk(chunk)
    yield chunk
except Exception as exc:
    self.end(exc)
```
Ditto
```python
except Exception as exc:
    self.end(exc)
    raise
self.end()
```
To confirm: without the changes, this doesn't get called in the new test cases? That's unexpected to me, since they seem to consume the stream in a for loop, so I'm wondering if we know why it would be a problem.
A case that intuitively does seem to require this change is one that breaks during iteration:
```python
with response:
    for chunk in response:
        first = chunk
        break
```
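For illustration, a minimal sketch (assuming a generator-based wrapper, not the real code) of why the early `break` matters: the statements after the yield loop never run, so span closure has to come from closing the generator (which is what a context manager exit would trigger):

```python
# Hypothetical sketch: `events` records which cleanup paths actually run
# when the consumer breaks out of iteration early.
events = []

def wrapped_stream(chunks):
    try:
        for chunk in chunks:
            yield chunk
        events.append("end-of-stream")  # skipped when the caller breaks early
    finally:
        events.append("closed")         # still runs when the generator is closed

gen = wrapped_stream(["a", "b", "c"])
for chunk in gen:
    break        # consume only the first chunk, as in the case above
gen.close()      # what a context-manager exit would effectively do
```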
Also if that is a fail case, we may need to consume the stream in the exit methods to get the stop / token info.
Thanks for probing into this!
TL;DR: While digging further, I think I found the most likely root cause of missing spans in our chatbot-rag-app, and it's not multiple attempts to end a span, but default `ObjectProxy` behaviour when the proxied object gets used as a context manager. The fix inspired by the initial patch and refactor still applies, but it sort of solved the problem by chance.
Sorry for the verbose root-causing below, but I had to leave this somewhere (led by Riccardo's example 😄)
All right, so I dug a bit further into it and found that in the unit test this does get called (when not guarded), and the unit tests capture the regression in terms of span count, while we also get the warning for trying to close an already ended span. While that's a good regression to catch, it's not what I set out to do with this change, so thanks for drilling into it.
> Also if that is a fail case, we may need to consume the stream in the exit methods to get the stop / token info.
That's an interesting one, and it raises the question of whether we'd want to capture the tokens seen before the exit, the token info the application actually chose to consume, or the token information from the full stream response. It seems like a weird edge case, though, and may not be the right thing to resolve as part of this PR.
Back to the original issue. Thanks to your comment, I got to the point in debugging where I found that `__iter__` doesn't get invoked unless `__enter__` and `__exit__` methods are added to `StreamWrapper`. Counterintuitively, there are no obvious clues on our end as to why these two different concepts depend on each other.
My assumption is that somewhere downstream (langchain/openai) the proxied object is used with a context manager, and without enter/exit overrides the `StreamWrapper` proxy falls back to the proxied object's behaviour, so the generator on the proxy is never invoked. The closest related discussion I was able to find for `wrapt.ObjectProxy` can be found here; it concerns proxying `aiohttp.request`, which uses an (async) context manager and seems to be used by openai/langchain as well. I also found relevant code in `BaseChatOpenAI` that seems the most likely cause in our use case. Implementing the context manager enter/exit on our `StreamWrapper` doesn't affect the behaviour of the instrumented code, but it solves the problem of generator invocation.
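A toy sketch of the behaviour described above (a simplified stand-in for the proxy, not `wrapt.ObjectProxy` itself, with illustrative names): when `__enter__` merely delegates, the `with` body receives the raw stream and the proxy's `__iter__` is bypassed; returning the proxy from `__enter__` keeps the instrumentation in the loop:

```python
class Stream:
    """Stand-in for the SDK stream: a context manager that is also iterable."""
    def __enter__(self):
        return self                 # hands back the raw stream
    def __exit__(self, *exc):
        return False
    def __iter__(self):
        yield from ["chunk-1", "chunk-2"]

class NaiveProxy:
    """Delegates __enter__ like a generic proxy: the `with` body then holds
    the RAW stream, so this proxy's __iter__ is never called."""
    def __init__(self, wrapped):
        self.wrapped = wrapped
        self.iterated_via_proxy = False
    def __enter__(self):
        return self.wrapped.__enter__()   # returns the raw Stream
    def __exit__(self, *exc):
        return self.wrapped.__exit__(*exc)
    def __iter__(self):
        self.iterated_via_proxy = True    # instrumentation hook ran
        yield from self.wrapped

class FixedProxy(NaiveProxy):
    """Overrides __enter__ to return the proxy itself, so iteration inside
    the `with` block goes through the proxy's __iter__."""
    def __enter__(self):
        self.wrapped.__enter__()
        return self

naive = NaiveProxy(Stream())
with naive as s:
    list(s)                # iterates the raw stream; proxy hook is bypassed

fixed = FixedProxy(Stream())
with fixed as s:
    list(s)                # iterates via the proxy; hook runs
```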
The check I added verifies the span count and fails with 0 spans when the context manager is not implemented, which should catch regressions on this in the future.
It seems the initial patch from @xrmx fixed the span closure warnings and also resolved this issue, but it's good to know that what I explained above was causing the missing traces, not re-attempts at span closure.
Either way, I will proceed with your suggestion to add an `ended` flag and guard span ending in a single place, as that seems to cover both the warning about ending an already ended span and this weird regression.
@SrdjanLL want to verify this fix end-to-end? I think you can substitute with this, then run chatbot-rag-app with python, or temporarily add git to its Dockerfile so it can resolve that requirement, like this:

```dockerfile
RUN apt-get update \
    && apt-get install -y --no-install-recommends git \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*
```
@codefromthecrypt - thanks for the suggestion! Do you think there's anything missing in that feedback loop? I will follow your suggestion to make sure it works as well (and as part of setting up my feedback loops).
Code lgtm, thanks!
@SrdjanLL never mind me, as your approach works fine. It's just that sometimes I've had my eyes crossed on my venvs and want something Docker can prove is repeatable.
What does this pull request do?

Updates `StreamWrapper` to close the span only once, either when the context manager exits or when the end of the stream is reached.

Testing

Previously, tests produced: `WARNING opentelemetry.sdk.trace:__init__.py:943 Calling end() on an ended span.`
Related issues

Next steps

Release `1.1.0` of the openai instrumentation after the PR is merged.