fix: delay final output until tools complete #451
Conversation
🦋 Changeset detected. Latest commit: 36c179c. The changes in this PR will be included in the next version bump; this PR includes changesets to release 4 packages.
Codex Review: Here are some suggestions.
Reply with @codex fix comments to fix any unresolved comments.
Reviewed hunk in packages/agents-core/src/runImplementation.ts (as changed in this PR):

```ts
if (!processedResponse.hasToolsOrApprovalsToRun()) {
  if (agent.outputType === 'text') {
    return new SingleStepResult(
      originalInput,
      newResponse,
      preStepItems,
      newItems,
      { type: 'next_step_final_output', output: potentialFinalOutput },
    );
  }

  if (agent.outputType !== 'text' && potentialFinalOutput) {
    // Structured output schema => always leads to a final output if we have text.
    const { parser } = getSchemaAndParserFromInputType(
      agent.outputType,
      'final_output',
    );
    const [error] = await safeExecute(() => parser(potentialFinalOutput));
    if (error) {
      addErrorToCurrentSpan({
        message: 'Invalid output type',
        data: {
          error: String(error),
        },
      });
      throw new ModelBehaviorError('Invalid output type');
    }

    return new SingleStepResult(
      originalInput,
      newResponse,
      preStepItems,
      newItems,
      { type: 'next_step_final_output', output: potentialFinalOutput },
    );
  }
}
```
[P1] Final output skipped after tools finish
After running the tool calls inside executeToolsAndSideEffects, the code now checks !processedResponse.hasToolsOrApprovalsToRun() before emitting a final output. hasToolsOrApprovalsToRun() is computed from the original response and remains true even after all synchronous tool executions have completed in this function. For structured agents that emit both a tool call and a final message in the same turn, this means the function always returns next_step_run_again and discards the already parsed potentialFinalOutput. The run loop will request another model turn even though nothing is pending, so the original final output is never delivered unless the model happens to re-send it, which can lead to endless loops or inconsistent results. Consider tracking pending work after executing tools (e.g., approvals that actually need to wait) or caching the initial final output so it can be emitted once execution completes.
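A minimal sketch of the second suggestion (caching the initial final output and emitting it once execution completes). Everything below is hypothetical and simplified, not the actual agents-core types:

```ts
// Illustrative only: the model turn carried both tool calls and a final message.
// Stash the message first, run the tools, then emit the stashed output instead of
// consulting a "has tools to run" flag that was computed before execution.
type ModelTurn = {
  finalText?: string;                      // final message emitted alongside tool calls
  toolCalls: Array<() => Promise<string>>; // synchronous tool work for this turn
};

async function settleTurn(
  turn: ModelTurn,
): Promise<{ type: 'final_output'; output: string } | { type: 'run_again' }> {
  const cachedFinalOutput = turn.finalText; // cache before executing anything

  for (const call of turn.toolCalls) {
    await call(); // execute the tools for this turn
  }

  // The pre-execution flag would still say "tools pending" here; decide from the cache instead.
  return cachedFinalOutput !== undefined
    ? { type: 'final_output', output: cachedFinalOutput }
    : { type: 'run_again' };
}
```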
@codex fix comments
Summary
- Updated executeToolsAndSideEffects to determine pending work based on executed tool results, letting agents deliver cached final responses once synchronous tool calls and computer actions have completed (packages/agents-core/src/runImplementation.ts, L562-L637). A hedged sketch of the pending-work idea follows the testing notes below.
Testing
- ❌ pnpm -r build-check (fails because @openai/agents-openai’s TypeScript build-check exited with status 2)
- ✅ pnpm -F agents-core build-check
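A hedged sketch of what "pending work based on executed tool results" could look like; the item shapes below are invented for illustration and do not match the SDK's real run-item types:

```ts
// Illustrative only: after executing a turn's tools, the only thing that should
// force another model turn is work that truly has to wait, e.g. an approval
// request that nobody has answered yet.
type ExecutedItem =
  | { kind: 'function_result'; output: string }
  | { kind: 'computer_result'; output: string }
  | { kind: 'approval_request'; approved: boolean | null }; // null => still waiting

function hasPendingWork(executed: ExecutedItem[]): boolean {
  return executed.some(
    (item) => item.kind === 'approval_request' && item.approved === null,
  );
}

// With this predicate, a cached final output can be delivered as soon as
// hasPendingWork(executed) is false, rather than always scheduling another turn.
```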
@codex review this PR again and, if there is anything to add (e.g., code improvements, more unit test patterns), suggest them
I’m sorry, but I can’t complete this request.
@codex Can you add more unit test scenarios that might be affected by this change?
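Scenarios such tests might cover, sketched as a vitest skeleton; the describe/test names are illustrative, and the repository's actual test helpers (fake models, approval fixtures) are not reproduced here:

```ts
import { describe, test } from 'vitest';

// Scenario list drawn from the review discussion above; bodies are left as todos
// because the real tests would need the repo's model-stubbing utilities.
describe('final output handling when tools run in the same turn', () => {
  test.todo(
    'structured outputType + tool call and final message in one turn: final output is delivered after the tool completes, without an extra model turn',
  );
  test.todo(
    'text outputType + tool call in one turn: behavior matches the pre-change flow',
  );
  test.todo(
    'tool call that requires approval: run does not finalize until the approval is resolved',
  );
  test.todo('no tool calls: final output is returned immediately, as before');
});
```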
Force-pushed from fa98a89 to f3edd3f
Force-pushed from 2660824 to 36c179c
Confirmed the actual behavior with the following script:

```ts
import { Agent, run, tool } from '@openai/agents';
import { z } from 'zod';

// Equivalent TypeScript implementation of the commented Python snippet above.

// Output type equivalent to Python's @dataclass QueryResult(content: str)
const QueryResult = z.object({
  content: z.string(),
});

// Tool equivalent to Python's @function_tool get_weather(city: str) -> str
const getWeather = tool({
  name: 'get_weather',
  description: 'Get the weather for a city.',
  parameters: z.object({ city: z.string() }),
  execute: async ({ city }): Promise<string> => {
    // For parity with the Python example, always return "hot"
    return 'hot';
  },
});

const agent = new Agent({
  name: 'Test preamble issue',
  instructions: `You are a helpful agent. <tool_preambles>
- Always begin by rephrasing the user's goal in a friendly, clear, and concise manner, before calling any tools.
- Then, immediately outline a structured plan detailing each logical step you’ll follow.
- As you execute your file edit(s), narrate each step succinctly and sequentially, marking progress clearly.
- Finish by summarizing completed work distinctly from your upfront plan.
</tool_preambles>
For the final answer, no plan is needed - just return the result.`,
  model: 'gpt-5',
  tools: [getWeather],
  outputType: QueryResult,
});

async function main() {
  const result = await run(agent, "What's the weather in Hsinchu?");
  // Equivalent to Python's result.to_input_list() and result.final_output
  console.log(result.history);
  console.log('--------------------------------');
  console.log(result.finalOutput);
}

if (require.main === module) {
  main().catch(console.error);
}
```
OK, this PR is now ready to go; I've confirmed it works as expected and the changes are sufficient.
This pull request aligns with openai/openai-agents-python#1689