openai: Fix span closure when using context manager #80
Merged
Changes from all commits (3 commits):

- fc16998 openai: Add conditional span closure logic depending on whether strea… (SrdjanLL)
- 47e8356 openai: Add conditional span closure logic for async streaming comple… (SrdjanLL)
- a5e5000 openai: Guard span ending based on the flag rather than the context m… (SrdjanLL)
...lemetry-instrumentation-openai/tests/cassettes/test_chat_stream_with_context_manager.yaml (114 additions, 0 deletions)
```yaml
interactions:
- request:
    body: |-
      {
        "messages": [
          {
            "role": "user",
            "content": "Answer in up to 3 words: Which ocean contains Bouvet Island?"
          }
        ],
        "model": "gpt-4o-mini",
        "stream": true
      }
    headers:
      accept:
      - application/json
      accept-encoding:
      - gzip, deflate
      authorization:
      - Bearer test_openai_api_key
      connection:
      - keep-alive
      content-length:
      - '147'
      content-type:
      - application/json
      host:
      - api.openai.com
      user-agent:
      - OpenAI/Python 1.66.5
      x-stainless-arch:
      - arm64
      x-stainless-async:
      - 'false'
      x-stainless-lang:
      - python
      x-stainless-os:
      - MacOS
      x-stainless-package-version:
      - 1.66.5
      x-stainless-read-timeout:
      - '600'
      x-stainless-retry-count:
      - '0'
      x-stainless-runtime:
      - CPython
      x-stainless-runtime-version:
      - 3.12.6
    method: POST
    uri: https://api.openai.com/v1/chat/completions
  response:
    body:
      string: |+
        data: {"id":"chatcmpl-BOja7e365tj5upRjLFinadEB8ZoDL","object":"chat.completion.chunk","created":1745234787,"model":"gpt-4o-mini-2024-07-18","service_tier":"default","system_fingerprint":"fp_dbaca60df0","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null},"logprobs":null,"finish_reason":null}]}

        data: {"id":"chatcmpl-BOja7e365tj5upRjLFinadEB8ZoDL","object":"chat.completion.chunk","created":1745234787,"model":"gpt-4o-mini-2024-07-18","service_tier":"default","system_fingerprint":"fp_dbaca60df0","choices":[{"index":0,"delta":{"content":"South"},"logprobs":null,"finish_reason":null}]}

        data: {"id":"chatcmpl-BOja7e365tj5upRjLFinadEB8ZoDL","object":"chat.completion.chunk","created":1745234787,"model":"gpt-4o-mini-2024-07-18","service_tier":"default","system_fingerprint":"fp_dbaca60df0","choices":[{"index":0,"delta":{"content":" Atlantic"},"logprobs":null,"finish_reason":null}]}

        data: {"id":"chatcmpl-BOja7e365tj5upRjLFinadEB8ZoDL","object":"chat.completion.chunk","created":1745234787,"model":"gpt-4o-mini-2024-07-18","service_tier":"default","system_fingerprint":"fp_dbaca60df0","choices":[{"index":0,"delta":{"content":" Ocean"},"logprobs":null,"finish_reason":null}]}

        data: {"id":"chatcmpl-BOja7e365tj5upRjLFinadEB8ZoDL","object":"chat.completion.chunk","created":1745234787,"model":"gpt-4o-mini-2024-07-18","service_tier":"default","system_fingerprint":"fp_dbaca60df0","choices":[{"index":0,"delta":{"content":"."},"logprobs":null,"finish_reason":null}]}

        data: {"id":"chatcmpl-BOja7e365tj5upRjLFinadEB8ZoDL","object":"chat.completion.chunk","created":1745234787,"model":"gpt-4o-mini-2024-07-18","service_tier":"default","system_fingerprint":"fp_dbaca60df0","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}]}

        data: [DONE]

    headers:
      CF-RAY:
      - 933c86cb9ae5773e-LHR
      Connection:
      - keep-alive
      Content-Type:
      - text/event-stream; charset=utf-8
      Date:
      - Mon, 21 Apr 2025 11:26:28 GMT
      Server:
      - cloudflare
      Set-Cookie: test_set_cookie
      Transfer-Encoding:
      - chunked
      X-Content-Type-Options:
      - nosniff
      access-control-expose-headers:
      - X-Request-ID
      alt-svc:
      - h3=":443"; ma=86400
      cf-cache-status:
      - DYNAMIC
      openai-organization: test_openai_org_id
      openai-processing-ms:
      - '460'
      openai-version:
      - '2020-10-01'
      strict-transport-security:
      - max-age=31536000; includeSubDomains; preload
      x-ratelimit-limit-requests:
      - '200'
      x-ratelimit-limit-tokens:
      - '100000'
      x-ratelimit-remaining-requests:
      - '199'
      x-ratelimit-remaining-tokens:
      - '88447'
      x-ratelimit-reset-requests:
      - 7m12s
      x-ratelimit-reset-tokens:
      - 83h10m39.057s
      x-request-id:
      - req_a39b652ec0ccb45a968e10112e569bf4
    status:
      code: 200
      message: OK
version: 1
```
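For reference, the streamed deltas in this cassette reassemble into the final completion text. A minimal, stdlib-only sketch of SSE delta accumulation (the payloads below are simplified stand-ins for the full chunks above, not the library's actual parsing code):

```python
import json

# Simplified SSE lines mirroring the deltas recorded in the cassette.
sse_lines = [
    'data: {"choices":[{"delta":{"role":"assistant","content":""}}]}',
    'data: {"choices":[{"delta":{"content":"South"}}]}',
    'data: {"choices":[{"delta":{"content":" Atlantic"}}]}',
    'data: {"choices":[{"delta":{"content":" Ocean"}}]}',
    'data: {"choices":[{"delta":{"content":"."}}]}',
    'data: {"choices":[{"delta":{},"finish_reason":"stop"}]}',
    'data: [DONE]',
]

text = ""
for line in sse_lines:
    payload = line.removeprefix("data: ")
    if payload == "[DONE]":
        break  # sentinel that terminates the stream
    delta = json.loads(payload)["choices"][0]["delta"]
    text += delta.get("content", "")

print(text)  # South Atlantic Ocean.
```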
To confirm: without the changes, this doesn't get called in the new test cases? That's unexpected to me, since they seem to consume the stream in a for loop, so I'm wondering if we know why it would be a problem.
A case that does intuitively seem to require this change is one that breaks during iteration.
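A quick illustration of that break-during-iteration case with a bare generator (hypothetical `wrapped`/`on_end` names, not the instrumentation's own code): breaking out of the loop leaves the generator suspended, so `finally`-based cleanup doesn't run until the generator is explicitly closed (or garbage collected), which is why an explicit `__exit__`/close path matters:

```python
def wrapped(stream, on_end):
    try:
        for chunk in stream:
            yield chunk
    finally:
        # Cleanup hook standing in for span.end().
        on_end()

events = []
gen = wrapped(iter(["a", "b", "c"]), lambda: events.append("ended"))

for chunk in gen:
    break  # early exit: generator stays suspended, finally has NOT run yet

print(events)  # [] — the "span" would still be open here
gen.close()    # closing (what __exit__ would do) triggers the finally
print(events)  # ['ended']
```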
Also, if that is a failure case, we may need to consume the stream in the exit methods to get the stop/token info.
Thanks for probing into this!
TL;DR: While digging further, I think I found the most likely root cause of the missing spans in our chatbot-rag-app, and it's not multiple attempts to end a span, but the default ObjectProxy behaviour when the proxied object gets used as a context manager. The fix inspired by the initial patch and refactor still applies, but it sort of solved the problem by chance.
Sorry for the verbose root-causing below, but I had to leave this somewhere (led by Riccardo's example 😄).
All right, so I dug a bit further into it and found that in the unit test this does get called (when not guarded), and the unit tests capture the regression in terms of the span count, while we also get the warning for trying to close an already ended span. While a good regression to catch, it's not what I set out to do with this change, so thanks for drilling into this.
That's an interesting one, and it raises the question of whether we'd want to capture the token info from whatever portion of the stream the application actually consumed before exit, or the token information from the full stream response. That seems like a weird edge case, though, and may not be the right thing to resolve as part of this PR.
Back to the original issue. Thanks to your comment I got to the point in debugging where I found that `__iter__` doesn't get invoked unless `__enter__` and `__exit__` methods are added to `StreamWrapper`. Counterintuitively, there are no obvious clues on our end why these two different concepts have a dependency. My assumption is that somewhere downstream (langchain/openai) the proxied object is used with a context manager, and not implementing the enter/exit overrides on the `StreamWrapper` proxy means the proxied object's behaviour is used, which does not result in the generator invocation on the proxy. The closest thing I was able to find related to this for `wrapt.ObjectProxy` can be found here; it concerns proxying `aiohttp.request`, which uses an (async) context manager and seems to be used by openai/langchain as well. I found relevant code on `BaseChatOpenAI` that seems the most likely cause in our use case. Implementing the context manager enter/exit on our `StreamWrapper` doesn't affect the behaviour of the instrumented code, but it solves the problem of generator invocation. I then added a check on the span count that fails with 0 spans when the context manager is not implemented, which should catch regressions on this in the future.
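A guess at the delegation mechanics described above, using a plain `__getattr__`-forwarding proxy instead of `wrapt.ObjectProxy` (the `Stream`/`Proxy` classes are hypothetical, for illustration only): if downstream code obtains `__enter__` through ordinary attribute access, it gets the wrapped stream's method, whose return value is the bare stream, so iteration bypasses the proxy's `__iter__` entirely:

```python
class Stream:
    """Stand-in for the SDK's stream object: a context manager + iterable."""
    def __enter__(self):
        return self
    def __exit__(self, *exc):
        return False
    def __iter__(self):
        yield from ["South", " Atlantic"]

class Proxy:
    """Naive transparent proxy without its own __enter__/__exit__."""
    def __init__(self, wrapped):
        self._wrapped = wrapped
    def __getattr__(self, name):
        # Unknown attributes (including __enter__ via getattr) are forwarded.
        return getattr(self._wrapped, name)
    def __iter__(self):
        print("proxy iter")  # instrumentation hook would live here
        yield from self._wrapped

proxy = Proxy(Stream())
entered = proxy.__enter__()  # forwarded: returns the bare Stream, not the proxy
print(entered is proxy)      # False — the proxy is lost here
chunks = list(entered)       # iterates the raw stream; "proxy iter" never prints
```

Defining `__enter__` on the proxy itself (returning the proxy) keeps iteration going through the wrapper, which is what the fix does.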
Seems like the initial patch from @xrmx fixed the span closure warnings and also resolved this issue, but it's good to know that what I explained above was causing the missing traces, not the re-attempted span closure.
Either way, I will proceed with your suggestion to add an `ended` flag and guard span ending in a single place instead, as that seems to cover both the warning about ending an already ended span and this weird regression.
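The guarded approach can be sketched roughly like this (simplified stand-in names such as `FakeSpan`, not the actual instrumentation code): the span is ended in exactly one place, whether the stream is exhausted, the context manager exits, or both happen:

```python
class FakeSpan:
    """Stand-in for an OpenTelemetry span; counts end() calls."""
    def __init__(self):
        self.end_calls = 0
    def end(self):
        self.end_calls += 1

class StreamWrapper:
    def __init__(self, stream, span):
        self.stream = stream
        self.span = span
        self.ended = False

    def end(self):
        # Single guarded place where the span is ended.
        if not self.ended:
            self.ended = True
            self.span.end()

    def __enter__(self):
        # Return the wrapper (not the raw stream) so iteration stays proxied.
        return self

    def __exit__(self, exc_type, exc, tb):
        self.end()
        return False

    def __iter__(self):
        try:
            for chunk in self.stream:
                yield chunk
        finally:
            # Runs on normal exhaustion and on early close.
            self.end()

span = FakeSpan()
with StreamWrapper(iter(["South", " Atlantic", " Ocean", "."]), span) as s:
    chunks = list(s)  # exhausts the stream, then __exit__ fires as well

print(span.end_calls)  # 1 — ended exactly once despite both paths running
```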