feat(agent): make structured output part of the agent loop #670
Conversation
I think this retry mechanism is a good idea. I've seen something similar used in a different context, and it was very effective.
    "gen_ai.choice",
    attributes={"message": json.dumps(user.model_dump())},
)
# Verify agent-level tracing was called
the assertions have been weakened here. it is preferred to keep a high bar on assertions.
what assertion would you like to call here?
events = self.model.structured_output(output_model, temp_messages, system_prompt=self.system_prompt)
async for event in events:
    if "callback" in event:
        self.callback_handler(**cast(dict, event["callback"]))
structured_output_span.add_event(
    "gen_ai.choice", attributes={"message": serialize(event["output"].model_dump())}
i am concerned that the tracing behavior has changed and hasn't been manually or unit tested
ok we can assert it further
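To illustrate the kind of stronger assertion being asked for, here is a hedged sketch (names and payload are illustrative, not the actual test from this PR): instead of only checking that tracing "was called", assert the exact event name and serialized attributes on the span.

```python
import json
from unittest.mock import MagicMock

def test_structured_output_emits_choice_event():
    # Stand-in span; in the real suite this would be the mocked tracer span.
    span = MagicMock()
    payload = {"name": "Jane Doe", "age": 30}

    # Simulate the code under review recording the model's choice on the span.
    span.add_event("gen_ai.choice", attributes={"message": json.dumps(payload)})

    # Keep the bar high: assert event name AND exact serialized message,
    # not merely that add_event was invoked.
    span.add_event.assert_called_once_with(
        "gen_ai.choice", attributes={"message": json.dumps(payload)}
    )
```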
with ThreadPoolExecutor() as executor:
    future = executor.submit(execute)
    return future.result()
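The executor snippet above is a sync-over-async bridge: the coroutine runs on a fresh event loop in a worker thread so a synchronous caller (possibly already inside a running loop) can block on the result. A minimal self-contained sketch of the pattern, with placeholder names for the real agent call:

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

async def _structured_output_async() -> dict:
    # Placeholder for the real async extraction; returns a trivial result.
    return {"ok": True}

def structured_output() -> dict:
    def execute() -> dict:
        # asyncio.run creates an isolated event loop, which is safe inside a
        # worker thread that has no loop of its own.
        return asyncio.run(_structured_output_async())

    with ThreadPoolExecutor() as executor:
        future = executor.submit(execute)
        return future.result()
```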
def _register_structured_output_tool(self, output_model: type[BaseModel]) -> Any:
it's unclear to me why this tool is needed. the current structured output implementation within strands does utilize a tool to generate structured output (which is subject to change). however, i am hesitant to retry the entire agent loop in the pattern suggested in this PR. another way to implement this is returning the schema validation failures as tool failures to the model.
structured output is currently an open roadmap item that we are planning to redesign. i am not comfortable merging this PR in its current state given that the underlying implementation is subject to change and due to the points raised above. currently, the approach of using a tool to generate structured output is an anti-pattern since most model providers natively support structured output responses. these native approaches improve structured output performance substantially.
reference:
- https://docs.litellm.ai/docs/completion/json_mode
- https://platform.openai.com/docs/guides/structured-outputs
given these natively supported features, retries can be simply implemented on the actual model API request, a full retry of the agent loop is not required.
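The retry-at-the-model-request level that this comment describes can be sketched roughly as follows. This is a hedged illustration, not the strands implementation: `call_model` is a hypothetical provider call returning raw JSON text (e.g. via a native JSON mode), and only that call is retried when validation fails, rather than the whole agent loop.

```python
import json
from pydantic import BaseModel, ValidationError

class UserProfile(BaseModel):
    name: str
    age: int

def extract(call_model, output_model: type[BaseModel], max_retries: int = 3) -> BaseModel:
    last_error = None
    for _attempt in range(max_retries):
        raw = call_model()  # hypothetical native structured-output request
        try:
            return output_model.model_validate(json.loads(raw))
        except (ValidationError, json.JSONDecodeError) as exc:
            # The failure could be folded into the next request's prompt.
            last_error = exc
    raise RuntimeError(f"validation failed after {max_retries} attempts") from last_error
```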
JSON output and tool use at the model level are likely identical implementations that force the tokens into the JSON schema space. The performance of the models greatly improves in the agent loop context, as the model has the context to evaluate its own work. Please see the Anthropic docs.
This is an additional feature on top of native support. Think of it this way: tool/JSON output at the model level forces the model to write in cursive, but doesn't actually check the contents of the writing. With a Pydantic model you can add logical validations that check the content.
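The content-versus-shape distinction can be made concrete with a small sketch (the model and fields here are illustrative, not from this PR): a JSON schema can force a string field to exist, but a Pydantic validator can additionally reject values that are structurally valid yet logically wrong.

```python
from pydantic import BaseModel, field_validator

class Invoice(BaseModel):
    total: float
    currency: str

    @field_validator("currency")
    @classmethod
    def currency_is_iso(cls, v: str) -> str:
        # Beyond the schema: the value must look like a 3-letter ISO code.
        if len(v) != 3 or not v.isupper():
            raise ValueError("currency must be a 3-letter ISO code like 'USD'")
        return v
```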
Please explain why it is an anti-pattern to use a tool for structured output. It re-uses the execution guardrails and testing of tool execution, where failures are fed back to the model. You would be reimplementing the same logic at the model structured output function level if you tried to feed back any validation errors.
Under this setup the agent has the chance to perform all the actions it would under the run command, and then respond with a structured output, which is very useful for interacting with the output programmatically.
however, i am hesitant to retry the entire agent loop in the pattern suggested in this PR. another way to implement this is returning the schema validation failures as tool failures to the model.
How is what you describe here different from rerunning the agent loop? If you feed back the schema failure as a tool result and the agent then responds, that is effectively an agent loop with extra steps.
@@ -417,20 +426,33 @@ def structured_output(self, output_model: Type[T], prompt: Optional[Union[str, l
    output_model: The output model (a JSON schema written as a Pydantic BaseModel)
        that the agent will use when responding.
    prompt: The prompt to use for the agent (will not be added to conversation history).
    preserve_conversation: If False (default), restores original conversation state after execution.
the existence (or lack thereof) of the prompt parameter determines this behavior, so i'm not sure we want to modify this interface and add another parameter here.
Description
Title:
feat: Enable agent loop retry mechanism for structured output extraction
Description:
Summary
Modifies the structured_output_async method to leverage the agent's event loop retry mechanism, allowing the model to attempt structured data extraction multiple times until successful. Previously, the method used direct model inference without retry capabilities.
Key Change: Agent Loop Integration
Before: Direct model call → single attempt → return or fail
After: Agent loop → retry on validation failure → validated result
Why This Matters
Implementation Approach
Behavioral Improvements
Example Scenario
Agent may now iterate like this internally:
1st attempt: Model provides incomplete data → validation fails → retry
2nd attempt: Model uses tools to gather missing info → validation fails → retry
3rd attempt: Model provides complete valid data → success!
profile = await agent.structured_output_async(UserProfile, prompt)  # returns a validated result after however many attempts were needed
Files Modified
Backward Compatibility
Hybrid Approach: Structured output now attempts tool-based execution first, then falls back to the original model.structured_output() method for backward compatibility.
Structured output execution modifies conversation history; the preserve_conversation parameter controls whether the original state is restored afterwards.
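A minimal sketch of the hybrid fallback described above, assuming hypothetical helper names (`run_with_structured_output_tool` is illustrative, not an actual strands API): try the tool-based agent-loop path first, and fall back to the direct model call so existing behavior is preserved.

```python
from pydantic import BaseModel

def structured_output(agent, output_model: type[BaseModel], prompt: str) -> BaseModel:
    try:
        # New path: run the agent loop with a registered structured-output tool.
        return agent.run_with_structured_output_tool(output_model, prompt)
    except Exception:
        # Fallback: original single-shot model call, preserving old behavior.
        return agent.model.structured_output(output_model, prompt)
```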
Implementation Details:
Test Updates:
preserve_conversation=False
Backward Compatibility: All existing structured output functionality is preserved while adding new tool-based capabilities for future extensibility.
Function signature unchanged - existing code continues to work
Return type unchanged - still returns validated Pydantic models
Error handling improved - more robust failure modes
Related Issues
#348
Documentation PR
Type of Change
New feature
Testing
How have you tested the change? Verify that the changes do not break functionality or introduce warnings in consuming repositories: agents-docs, agents-tools, agents-cli
hatch run prepare
Checklist
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.