src/agents/models/chatcmpl_stream_handler.py: 75 changes (64 additions, 11 deletions)
@@ -48,6 +48,9 @@ async def handle_stream(
         usage: CompletionUsage | None = None
         state = StreamingState()
 
+        is_reasoning_model = False
+        emit_reasoning_content = False
+        emit_content = False
         async for chunk in stream:
             if not state.started:
                 state.started = True
@@ -62,9 +65,16 @@
                 continue
 
             delta = chunk.choices[0].delta
+            reasoning_content = None
+            content = None
+            if hasattr(delta, "reasoning_content"):
Collaborator: when would this be true?

@Ddper (Contributor, author), Apr 23, 2025: When using a reasoning model like deepseek-reasoner.
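A runnable sketch of the situation described in this thread: a DeepSeek-style streamed delta carries an extra `reasoning_content` attribute that standard OpenAI chat deltas do not, which is why the diff probes with `hasattr`. `SimpleNamespace` stands in for the real chunk delta object; the attribute names follow the diff above.

```python
from types import SimpleNamespace


def split_delta(delta):
    """Mirror the hasattr checks in the diff: return (reasoning_content, content),
    either of which may be None when the attribute is absent on this delta."""
    reasoning = delta.reasoning_content if hasattr(delta, "reasoning_content") else None
    content = delta.content if hasattr(delta, "content") else None
    return reasoning, content


# A reasoning-phase chunk from deepseek-reasoner: only reasoning_content is set.
reasoning_delta = SimpleNamespace(reasoning_content="Okay, I need to...", content=None)
# A final-answer chunk from a standard model: no reasoning_content attribute at all.
answer_delta = SimpleNamespace(content="Function calls itself,")

print(split_delta(reasoning_delta))  # ('Okay, I need to...', None)
print(split_delta(answer_delta))     # (None, 'Function calls itself,')
```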

+                reasoning_content = delta.reasoning_content
+                is_reasoning_model = True
+            if hasattr(delta, "content"):
+                content = delta.content
 
             # Handle text
-            if delta.content:
+            if reasoning_content or content:
                 if not state.text_content_index_and_output:
                     # Initialize a content tracker for streaming text
                     state.text_content_index_and_output = (
@@ -100,16 +110,59 @@
                     ),
                     type="response.content_part.added",
                 )
-                # Emit the delta for this segment of content
-                yield ResponseTextDeltaEvent(
-                    content_index=state.text_content_index_and_output[0],
-                    delta=delta.content,
-                    item_id=FAKE_RESPONSES_ID,
-                    output_index=0,
-                    type="response.output_text.delta",
-                )
-                # Accumulate the text into the response part
-                state.text_content_index_and_output[1].text += delta.content
+
+                if reasoning_content is not None:
+                    if not emit_reasoning_content:
+                        emit_reasoning_content = True
+                        reasoning_content_title = "# reasoning content\n\n"
Collaborator: this doesn't seem right - why hardcode?

@Ddper (Contributor, author), Apr 23, 2025: It's a markdown title used to split the content from the reasoning content. It's a constant value, so it has to be hardcoded.

The whole output looks like this:

# reasoning content

Okay, I need to write a haiku about recursion in programming. Let's start by recalling what a haiku is. It's a three-line poem with a 5-7-5 syllable structure. The first line has five syllables, the second seven, and the third five again. The theme here is recursion in programming, so I should focus on elements that capture the essence of recursion.
.....

# content

**Haiku on Recursion:**

Function calls itself,
Base case breaks the looping chain—
Stack grows, then falls back.

Another option is to use <think></think> tags, in which case the whole output looks like this:

<think>
Okay, I need to write a haiku about recursion in programming. Let's start by recalling what a haiku is. It's a three-line poem with a 5-7-5 syllable structure. The first line has five syllables, the second seven, and the third five again. The theme here is recursion in programming, so I should focus on elements that capture the essence of recursion.
.....
</think>

**Haiku on Recursion:**

Function calls itself,
Base case breaks the looping chain—
Stack grows, then falls back.

Which way do you prefer?

Collaborator: I think neither? IMO it would be better to emit a separate item for reasoning. For example, I was trying something like this in #581. What do you think?

@Ddper (Contributor, author), Apr 25, 2025: Yeah, I agree with you. I just found out openai-python 1.7.6 already added the types needed to emit reasoning content:

from openai.types.responses import (
    ResponseReasoningItem,
    ResponseReasoningSummaryTextDeltaEvent,
    ResponseReasoningSummaryPartAddedEvent,
    ResponseReasoningSummaryPartDoneEvent,
    ResponseReasoningSummaryTextDoneEvent,
)

So we can emit ResponseReasoningSummaryTextDeltaEvent for the reasoning content, or create a class like ResponseReasoningTextDeltaEvent in this repo. What do you think?
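A sketch of the "separate reasoning item" approach discussed in this thread. `ReasoningTextDeltaEvent` and `OutputTextDeltaEvent` here are hypothetical local stand-ins, not the actual openai-python classes, whose fields may differ; the field names mirror `ResponseTextDeltaEvent` as used in the diff.

```python
from dataclasses import dataclass


@dataclass
class ReasoningTextDeltaEvent:
    # Hypothetical event type for reasoning text, as proposed above.
    delta: str
    item_id: str
    output_index: int = 0
    type: str = "response.reasoning_text.delta"


@dataclass
class OutputTextDeltaEvent:
    # Stand-in for the regular output-text delta event.
    delta: str
    item_id: str
    output_index: int = 0
    type: str = "response.output_text.delta"


def route_delta(reasoning_content, content, item_id="__fake_id__"):
    """Route reasoning and answer text to distinct event types instead of
    concatenating them into one stream under hardcoded markdown titles."""
    events = []
    if reasoning_content:
        events.append(ReasoningTextDeltaEvent(delta=reasoning_content, item_id=item_id))
    if content:
        events.append(OutputTextDeltaEvent(delta=content, item_id=item_id))
    return events
```

Consumers can then decide how to render (or skip) reasoning text by filtering on `type`, which is the main advantage over in-band markdown titles.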

+                        # Emit the reasoning content title
+                        yield ResponseTextDeltaEvent(
+                            content_index=state.text_content_index_and_output[0],
+                            delta=reasoning_content_title,
+                            item_id=FAKE_RESPONSES_ID,
+                            output_index=0,
+                            type="response.output_text.delta",
+                        )
+                        # Accumulate the text into the response part
+                        state.text_content_index_and_output[1].text += reasoning_content_title
+
+                    # Emit the delta for this segment of content
+                    yield ResponseTextDeltaEvent(
+                        content_index=state.text_content_index_and_output[0],
+                        delta=reasoning_content,
+                        item_id=FAKE_RESPONSES_ID,
+                        output_index=0,
+                        type="response.output_text.delta",
+                    )
+                    # Accumulate the text into the response part
+                    state.text_content_index_and_output[1].text += reasoning_content
+
+                if content is not None:
+                    if not emit_content and is_reasoning_model:
+                        emit_content = True
+                        content_title = "\n\n# content\n\n"

Collaborator: same here?

+                        # Emit the content title
+                        yield ResponseTextDeltaEvent(
+                            content_index=state.text_content_index_and_output[0],
+                            delta=content_title,
+                            item_id=FAKE_RESPONSES_ID,
+                            output_index=0,
+                            type="response.output_text.delta",
+                        )
+                        # Accumulate the text into the response part
+                        state.text_content_index_and_output[1].text += content_title
+
+                    # Emit the delta for this segment of content
+                    yield ResponseTextDeltaEvent(
+                        content_index=state.text_content_index_and_output[0],
+                        delta=content,
+                        item_id=FAKE_RESPONSES_ID,
+                        output_index=0,
+                        type="response.output_text.delta",
+                    )
+                    # Accumulate the text into the response part
+                    state.text_content_index_and_output[1].text += content
 
             # Handle refusals (model declines to answer)
             if delta.refusal: