@ihower (Contributor) commented on Sep 8, 2025

Issue

When using structured outputs, if the model produces both a message and a function call in the same turn,
the agent would terminate prematurely.

This behavior has become much easier to reproduce with GPT-5 tool calling + preambles.
Earlier models could occasionally produce such outputs too. For example, with GPT-4.1, a prompt that instructed the model to think before calling a tool could trigger it, causing seemingly random failures whenever output_type was specified.

I think this fix should also resolve openai/openai-agents-python#1061.

Reproducible Example

from dataclasses import dataclass
from agents import Agent, Runner, function_tool

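@dataclass
class QueryResult:
    # Structured output schema the agent must return
    content: str

@function_tool
def get_weather(city: str) -> str:
    # Stub tool: always reports "hot" so the run is deterministic
    return "hot"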

agent = Agent(
    name="Test preamble issue",
    instructions="""You are a helpful agent. <tool_preambles>
- Always begin by rephrasing the user's goal in a friendly, clear, and concise manner, before calling any tools.
- Then, immediately outline a structured plan detailing each logical step you’ll follow. 
- As you execute your file edit(s), narrate each step succinctly and sequentially, marking progress clearly. 
- Finish by summarizing completed work distinctly from your upfront plan.
</tool_preambles>
For the final answer, no plan is needed - just return the result.
""",
    model="gpt-5",
    tools=[get_weather],
    output_type=QueryResult,
)

result = Runner.run_sync(agent, input="What's the weather in Hsinchu?")

print(result.to_input_list())
print("--------------------------------")
print(result.final_output)

Example Output (before this fix)

The to_input_list() looks like:

[
  {"content": "What's the weather in Hsinchu?", "role": "user"},
  {"id": "rs_...", "summary": [], "type": "reasoning"},
  {"id": "msg_...", "content": [...{"text": "Your goal: Get the current weather for Hsinchu.\nPlan:\n1) Query the weather service for Hsinchu.\n2) Return the current conditions and temperature briefly.\nProgress:\n- Step 1: Querying the weather service for Hsinchu now."}], "role": "assistant", "type": "message"},
  {"arguments": '{"city":"Hsinchu"}', "call_id": "call_...", "name": "get_weather", "type": "function_call"},
  {"call_id": "call_...", "output": "hot", "type": "function_call_output"}
]

At this point, the list ends with the tool call output. The run loop does not pass this tool output back to the LLM for another turn. As a result, the agent finalizes prematurely, treating the preamble message as the structured output instead of producing a final answer that incorporates the tool result:

# final_output
QueryResult(content='Your goal: Get the current weather for Hsinchu.\nPlan:\n1) Query the weather service for Hsinchu. \n2) Return the current conditions and temperature briefly. \nProgress:\n- Step 1: Querying the weather service for Hsinchu now.')
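The symptom above can be checked mechanically: if the item list ends with a function_call_output, the tool result was never sent back to the model. The following is a minimal sketch, assuming items shaped like the to_input_list() entries shown above; the helper name ended_on_tool_output is hypothetical and not part of the SDK.

```python
def ended_on_tool_output(items: list[dict]) -> bool:
    """Return True if the run stopped right after a tool result.

    Hypothetical helper: `items` is assumed to look like the
    to_input_list() output shown above (dicts with a "type" key).
    """
    return bool(items) and items[-1].get("type") == "function_call_output"


# Abbreviated version of the buggy run: the list ends with the tool output.
buggy_items = [
    {"content": "What's the weather in Hsinchu?", "role": "user"},
    {"type": "message", "role": "assistant"},
    {"type": "function_call", "name": "get_weather"},
    {"type": "function_call_output", "output": "hot"},
]

# Abbreviated version of a healthy run: a message follows the tool output.
fixed_items = buggy_items + [{"type": "message", "role": "assistant"}]
```

With these sample lists, ended_on_tool_output(buggy_items) is True and ended_on_tool_output(fixed_items) is False.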

Behavior Change

Before:
When a structured output schema was used, if the model produced both a preamble message and a function call in the same turn, the agent would treat the preamble message as the final output, even though tool calls were still pending.
This caused the run loop to terminate early.

After:
Structured output only triggers a final output when there are no pending tool calls or approvals.
If the model emits a tool call, we process it, append the result, and re-run the loop.

I think this aligns with the default tool_use_behavior=run_llm_again, which is more intuitive.
If developers prefer to treat function call output as the final output, they can explicitly configure tool_use_behavior to stop_on_first_tool or StopAtTools.
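The rule described under "After" can be sketched as a small decision function. This is an illustration only, not the SDK's actual implementation: TurnResult, its fields, and next_step are hypothetical names, and the sketch assumes the default tool_use_behavior.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TurnResult:
    """Hypothetical summary of one model turn (not an SDK type)."""
    message_text: Optional[str] = None  # candidate structured output, if any
    pending_tool_calls: list = field(default_factory=list)
    pending_approvals: list = field(default_factory=list)


def next_step(turn: TurnResult) -> str:
    """Decide whether a turn finalizes the run or the loop runs again."""
    has_pending_work = bool(turn.pending_tool_calls or turn.pending_approvals)
    # Finalize only when a message exists and nothing is still pending.
    if turn.message_text is not None and not has_pending_work:
        return "final_output"
    # Otherwise process tool calls, append their results, and re-run.
    return "run_again"


# Preamble message plus a tool call in the same turn: keep going.
preamble_turn = TurnResult(message_text="Plan: ...", pending_tool_calls=["get_weather"])
# Message only, nothing pending: this is the final output.
answer_turn = TurnResult(message_text="It's currently hot in Hsinchu.")
```

Here next_step(preamble_turn) returns "run_again" and next_step(answer_turn) returns "final_output", which mirrors the before/after difference described above.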

✅ All tests are passing

Example Output (after this fix)

The list now continues after the tool call output, feeding the tool result back to the LLM and producing the proper final answer:

[
  {"content": "What's the weather in Hsinchu?", "role": "user"},
  {"id": "rs_...", "summary": [], "type": "reasoning"},
  {"id": "msg_...", "content": [...{"text": "Goal: You want to know the current weather in Hsinchu.\n\nPlan:\n1) Query the weather service for Hsinchu. 2) Wait for the response. 3) Present the current weather details succinctly.\n\nProgress:\n- Step 1: Sending request to the weather service for Hsinchu now."}], "role": "assistant", "type": "message"},
  {"arguments": '{"city":"Hsinchu"}', "name": "get_weather", "type": "function_call"},
  {"call_id": "...", "output": "hot", "type": "function_call_output"},
  {"id": "msg_...", "content": [...{"text": "Progress:\n- Step 2: Received the weather data.\n- Step 3: Preparing the concise result for you now."}], "role": "assistant", "type": "message"},
  {"id": "msg_...", "content": [...{"text": "It’s currently hot in Hsinchu."}], "role": "assistant", "type": "message"}
]

Note: after a function call output, the model may produce a post-message (e.g. explaining that it received the weather data) before generating the final output message in the same turn.

And the final output now correctly incorporates the tool result:

# final_output
QueryResult(content='It’s currently hot in Hsinchu.')

tests: update runner/streamed expectations to reflect deferred finalization
@seratch added the feature:core and bug labels on Sep 8, 2025
@seratch (Member) left a comment

Thanks for sending this PR. Looks good to me.

@seratch merged commit 5a9cab8 into openai:main on Sep 9, 2025
5 checks passed
@seratch (Member) commented on Sep 9, 2025

@codex Can you check whether this PR needs the same change as openai/openai-agents-js#452?

@chatgpt-codex-connector commented

For now, I can only help with PRs you've created.

@seratch (Member) commented on Sep 9, 2025

I've checked the above possibility, and the Python SDK doesn't need any further changes here.

@javi-horizon commented

When will this be available in the Python OpenAI Agents SDK? I have encountered many errors.

Development

Successfully merging this pull request may close these issues.

Agent run with previous_response_id fails - No tool output found for function call call_WdnUUKXKvwy3jk....
