openai: Fix span closure when using context manager #80
Changes from commits: fc16998, 47e8356, a5e5000
```diff
@@ -67,6 +67,7 @@ def __init__(
         self.choices = []
         self.usage = None
         self.service_tier = None
+        self.in_context_manager = False

     def end(self, exc=None):
         if exc is not None:
@@ -111,6 +112,14 @@ def process_chunk(self, chunk):
         if hasattr(chunk, "service_tier"):
             self.service_tier = chunk.service_tier

+    def __enter__(self):
+        # flag we are inside a context manager and want to follow its lifetime
+        self.in_context_manager = True
+        return self
+
+    def __exit__(self, exc_type, exc_value, traceback):
+        self.end(exc_value)
+
     def __iter__(self):
         stream = self.__wrapped__
         try:
```
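Taken together, the sync-side changes give the wrapper two possible lifetimes: end the span when the stream is exhausted, or defer to the context manager's exit. A condensed sketch (names simplified; `span_ended` stands in for the real span bookkeeping, and `__iter__` is reduced to a generator) of how the flag decides who ends the span:

```python
class StreamWrapper:
    """Simplified model of the instrumented stream wrapper."""

    def __init__(self, chunks):
        self._chunks = chunks
        self.in_context_manager = False
        self.span_ended = False

    def end(self, exc=None):
        self.span_ended = True  # stands in for span.end()

    def __enter__(self):
        self.in_context_manager = True
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.end(exc_value)

    def __iter__(self):
        yield from self._chunks
        if not self.in_context_manager:
            self.end()  # only end here when used outside a with-block

# Plain for-loop use: the span ends once the stream is exhausted.
w1 = StreamWrapper(["a", "b"])
list(w1)
print(w1.span_ended)  # True

# Context-manager use: the span survives an early break and ends on __exit__.
w2 = StreamWrapper(["a", "b"])
with w2 as resp:
    for chunk in resp:
        break
print(w2.span_ended)  # True (ended by __exit__, not by __iter__)
```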
```diff
@@ -122,7 +131,16 @@ def __iter__(self):
         except Exception as exc:
             self.end(exc)
             raise
-        self.end()
+        # if we are inside a context manager we'll end when exiting it
+        if not self.in_context_manager:
+            self.end()
+
+    async def __aenter__(self):
+        # No difference in behavior between sync and async context manager
+        return self.__enter__()
+
+    async def __aexit__(self, exc_type, exc_value, traceback):
+        self.__exit__(exc_type, exc_value, traceback)

     async def __aiter__(self):
         stream = self.__wrapped__
@@ -135,4 +153,6 @@ async def __aiter__(self):
         except Exception as exc:
             self.end(exc)
             raise
-        self.end()
+        # if we are inside a context manager we'll end when exiting it
+        if not self.in_context_manager:
+            self.end()
```

Reviewer, on `self.end(exc)` in `__iter__`: I think we need to guard this similarly.

Reviewer, on `self.end(exc)` in `__aiter__`: Ditto.

Reviewer, on the guarded `self.end()`: To confirm: without the changes, this doesn't get called in the new test cases? That's unexpected to me, since they seem to consume the stream in a `for` loop, so I'm wondering if we know why it would be a problem. A case that does intuitively seem to require this change is one that breaks during iteration:

```python
with response:
    for chunk in response:
        first = chunk
        break
```

Also, if that is a fail case, we may need to consume the stream in the exit methods to get the stop/token info.

Author: Thanks for probing into this! TL;DR: while digging further I think I found the most likely root cause of the missing spans in our chatbot-rag-app, and it's not multiple attempts to end a span but the default ObjectProxy behaviour when the proxied object gets used as a context manager. The fix inspired by the initial patch and the refactor still apply, but they sort of solved the problem by chance. Sorry for the verbose root-causing below, but I had to leave this somewhere (led by Riccardo's example 😄).

All right, so I dug a bit further into it and found that in the unit tests this does get called (when not guarded): the unit tests capture the regression in terms of span count, and we also get the warning for trying to close an already ended span. While that's a good regression to catch, it's not what I set out to do with this change, so thanks for drilling into it.

On the early-break case: that's an interesting one, and it raises the question of whether we'd want to capture the token info the application actually consumed before exiting or the token information from the full stream response. It seems like a weird edge case, though, and may not be the right thing to resolve as part of this PR.

Back to the original issue. Thanks to your comment I got far enough into debugging to find the likely cause. My assumption is that somewhere downstream (langchain/openai) the proxied object is used as a context manager, and without enter/exit overrides on the wrapper the default ObjectProxy behaviour takes over. Seems like the initial patch from @xrmx fixed the span-closure warnings and also resolved this issue, but it's good to know that the missing traces were caused by what I explained above, not by re-attempting the span closure. Either way, I will proceed with your suggestion to add the guard.
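The root cause described above can be illustrated without the real instrumentation. In the minimal sketch below, `ObjectProxy` mimics wrapt's attribute delegation and `Stream` is a hypothetical stand-in for the library's stream class: without `__enter__`/`__exit__` defined on the wrapper, entering the context manager hands the caller the wrapped stream, so the instrumented wrapper drops out of the picture and no span is recorded.

```python
class ObjectProxy:
    """Minimal stand-in for wrapt.ObjectProxy's attribute delegation."""

    def __init__(self, wrapped):
        self.__wrapped__ = wrapped

    def __getattr__(self, name):
        # any attribute not defined on the proxy is fetched from the
        # wrapped object, including __enter__ when looked up by name
        return getattr(self.__wrapped__, name)

class Stream:
    """Hypothetical stand-in for the library's stream object."""

    def __enter__(self):
        return self  # returns the raw stream, not the proxy

    def __exit__(self, exc_type, exc_value, traceback):
        return False

proxy = ObjectProxy(Stream())
entered = proxy.__enter__()  # delegated to the wrapped Stream
print(entered is proxy)      # False: iteration now bypasses the wrapper
```

Adding `__enter__`/`__exit__` overrides that return the proxy itself, as this patch does, keeps the instrumented wrapper in the loop for the whole `with` block.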
New test cassette (@@ -0,0 +1,114 @@):

```yaml
interactions:
- request:
    body: |-
      {
        "messages": [
          {
            "role": "user",
            "content": "Answer in up to 3 words: Which ocean contains Bouvet Island?"
          }
        ],
        "model": "gpt-4o-mini",
        "stream": true
      }
    headers:
      accept:
      - application/json
      accept-encoding:
      - gzip, deflate
      authorization:
      - Bearer test_openai_api_key
      connection:
      - keep-alive
      content-length:
      - '147'
      content-type:
      - application/json
      host:
      - api.openai.com
      user-agent:
      - OpenAI/Python 1.66.5
      x-stainless-arch:
      - arm64
      x-stainless-async:
      - 'false'
      x-stainless-lang:
      - python
      x-stainless-os:
      - MacOS
      x-stainless-package-version:
      - 1.66.5
      x-stainless-read-timeout:
      - '600'
      x-stainless-retry-count:
      - '0'
      x-stainless-runtime:
      - CPython
      x-stainless-runtime-version:
      - 3.12.6
    method: POST
    uri: https://api.openai.com/v1/chat/completions
  response:
    body:
      string: |+
        data: {"id":"chatcmpl-BOja7e365tj5upRjLFinadEB8ZoDL","object":"chat.completion.chunk","created":1745234787,"model":"gpt-4o-mini-2024-07-18","service_tier":"default","system_fingerprint":"fp_dbaca60df0","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null},"logprobs":null,"finish_reason":null}]}

        data: {"id":"chatcmpl-BOja7e365tj5upRjLFinadEB8ZoDL","object":"chat.completion.chunk","created":1745234787,"model":"gpt-4o-mini-2024-07-18","service_tier":"default","system_fingerprint":"fp_dbaca60df0","choices":[{"index":0,"delta":{"content":"South"},"logprobs":null,"finish_reason":null}]}

        data: {"id":"chatcmpl-BOja7e365tj5upRjLFinadEB8ZoDL","object":"chat.completion.chunk","created":1745234787,"model":"gpt-4o-mini-2024-07-18","service_tier":"default","system_fingerprint":"fp_dbaca60df0","choices":[{"index":0,"delta":{"content":" Atlantic"},"logprobs":null,"finish_reason":null}]}

        data: {"id":"chatcmpl-BOja7e365tj5upRjLFinadEB8ZoDL","object":"chat.completion.chunk","created":1745234787,"model":"gpt-4o-mini-2024-07-18","service_tier":"default","system_fingerprint":"fp_dbaca60df0","choices":[{"index":0,"delta":{"content":" Ocean"},"logprobs":null,"finish_reason":null}]}

        data: {"id":"chatcmpl-BOja7e365tj5upRjLFinadEB8ZoDL","object":"chat.completion.chunk","created":1745234787,"model":"gpt-4o-mini-2024-07-18","service_tier":"default","system_fingerprint":"fp_dbaca60df0","choices":[{"index":0,"delta":{"content":"."},"logprobs":null,"finish_reason":null}]}

        data: {"id":"chatcmpl-BOja7e365tj5upRjLFinadEB8ZoDL","object":"chat.completion.chunk","created":1745234787,"model":"gpt-4o-mini-2024-07-18","service_tier":"default","system_fingerprint":"fp_dbaca60df0","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}]}

        data: [DONE]

    headers:
      CF-RAY:
      - 933c86cb9ae5773e-LHR
      Connection:
      - keep-alive
      Content-Type:
      - text/event-stream; charset=utf-8
      Date:
      - Mon, 21 Apr 2025 11:26:28 GMT
      Server:
      - cloudflare
      Set-Cookie: test_set_cookie
      Transfer-Encoding:
      - chunked
      X-Content-Type-Options:
      - nosniff
      access-control-expose-headers:
      - X-Request-ID
      alt-svc:
      - h3=":443"; ma=86400
      cf-cache-status:
      - DYNAMIC
      openai-organization: test_openai_org_id
      openai-processing-ms:
      - '460'
      openai-version:
      - '2020-10-01'
      strict-transport-security:
      - max-age=31536000; includeSubDomains; preload
      x-ratelimit-limit-requests:
      - '200'
      x-ratelimit-limit-tokens:
      - '100000'
      x-ratelimit-remaining-requests:
      - '199'
      x-ratelimit-remaining-tokens:
      - '88447'
      x-ratelimit-reset-requests:
      - 7m12s
      x-ratelimit-reset-tokens:
      - 83h10m39.057s
      x-request-id:
      - req_a39b652ec0ccb45a968e10112e569bf4
    status:
      code: 200
      message: OK
version: 1
```
Reviewer: How about instead of this we add `self.ended` and just guard both calls with it? I think it's better to end when the stream is finished when possible, and only use the context-manager hook as a fallback, since the context manager could potentially be left open for a very long time, even for unrelated processing, depending on user code.

Author: Thanks for the suggestion. This was a good catch and a helpful pointer while ramping up. And yeah, it seems better as it avoids creating misleading spans. I'll update this and move the guarding into `end(...)`, given it's called in many places, as you surfaced elsewhere.
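The suggested guard might look roughly like this (a sketch with hypothetical names; `FakeSpan` is only a demo double, whereas the real `end()` records usage and exceptions on an OTel span). Making `end()` idempotent lets the stream-exhaustion path end the span eagerly, with the context-manager exit as a harmless fallback:

```python
class FakeSpan:
    """Demo double for an OTel span; counts end() calls."""

    def __init__(self):
        self.end_calls = 0

    def record_exception(self, exc):
        pass

    def end(self):
        self.end_calls += 1

class StreamWrapper:
    def __init__(self, span):
        self.span = span
        self.ended = False

    def end(self, exc=None):
        if self.ended:  # idempotent: a second call is a no-op
            return
        self.ended = True
        if exc is not None:
            self.span.record_exception(exc)
        self.span.end()

    def __exit__(self, exc_type, exc_value, traceback):
        self.end(exc_value)  # fallback; harmless if iteration already ended

span = FakeSpan()
w = StreamWrapper(span)
w.end()                       # stream exhausted: span ends eagerly
w.__exit__(None, None, None)  # context-manager exit: guarded, no double end
print(span.end_calls)         # 1
```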