-
Notifications
You must be signed in to change notification settings - Fork 220
Fix duplicate tool_result error from Anthropic API #2256
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
Changes from 2 commits
Commits
Show all changes
8 commits
Select commit
Hold shift + click to select a range
3a88685
Fix duplicate tool_result error from Anthropic API
openhands-agent a48436a
Merge AgentErrorEvent content into ObservationEvent instead of select…
openhands-agent a50cbf6
Address PR review comments for tool result uniqueness
openhands-agent 029ebc4
Limit tool result uniqueness handling to consecutive duplicates only
openhands-agent d66d74d
Merge branch 'main' into fix-duplicate-tool-result-error
xingyaoww b7b1e6b
Simplify: move duplicate observation deduplication to events_to_messages
openhands-agent b75c6e3
Remove unnecessary logger from base.py and document SDK logger prefer…
openhands-agent b99b0bd
Merge duplicate observation content instead of selecting best
openhands-agent File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
188 changes: 188 additions & 0 deletions
188
openhands-sdk/openhands/sdk/context/view/properties/tool_result_uniqueness.py
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,188 @@ | ||
| """Property to ensure each tool_call_id has exactly one tool result. | ||
|
|
||
| LLM APIs (especially Anthropic) require that each tool_use has exactly one | ||
| corresponding tool_result. This property handles the edge case where multiple | ||
| observation events (e.g., AgentErrorEvent and ObservationEvent) are created | ||
| for the same tool_call_id, typically due to restarts or race conditions. | ||
|
|
||
| When both AgentErrorEvent and ObservationEvent exist for the same tool_call_id, | ||
| the error context is merged into the observation to preserve both pieces of | ||
| information for the LLM. | ||
| """ | ||
|
|
||
| from collections import defaultdict | ||
| from collections.abc import Sequence | ||
|
|
||
| from openhands.sdk.context.view.manipulation_indices import ManipulationIndices | ||
| from openhands.sdk.context.view.properties.base import ViewPropertyBase | ||
| from openhands.sdk.event import ( | ||
| AgentErrorEvent, | ||
| Event, | ||
| EventID, | ||
| LLMConvertibleEvent, | ||
| ObservationBaseEvent, | ||
| ObservationEvent, | ||
| ToolCallID, | ||
| ) | ||
| from openhands.sdk.llm import TextContent | ||
|
|
||
|
|
||
| def _create_merged_observation( | ||
| obs_event: ObservationEvent, | ||
| error_events: list[AgentErrorEvent], | ||
| ) -> ObservationEvent: | ||
| """Create a new ObservationEvent with error context merged into the observation. | ||
|
|
||
| The error messages from AgentErrorEvents are prepended to the observation content, | ||
| giving the LLM context about any issues that occurred during tool execution. | ||
|
|
||
| Args: | ||
| obs_event: The original ObservationEvent with the actual tool result. | ||
| error_events: List of AgentErrorEvents to merge (typically from restarts). | ||
|
|
||
| Returns: | ||
| A new ObservationEvent with merged content and a new unique ID. | ||
| """ | ||
| # Collect error messages | ||
| error_texts = [f"[Note: {error.error}]" for error in error_events] | ||
| error_prefix = "\n".join(error_texts) + "\n\n" | ||
|
|
||
| # Create new content list with error context prepended | ||
| original_content = list(obs_event.observation.content) | ||
| merged_content: list[TextContent] = [TextContent(text=error_prefix)] | ||
| merged_content.extend(original_content) # type: ignore[arg-type] | ||
|
xingyaoww marked this conversation as resolved.
Outdated
|
||
|
|
||
| # Create a new observation with merged content | ||
| # We need to preserve all fields from the original observation | ||
| obs_data = obs_event.observation.model_dump() | ||
| obs_data["content"] = merged_content | ||
|
|
||
| # Create the new observation using the same class as the original | ||
| merged_observation = obs_event.observation.__class__(**obs_data) | ||
|
xingyaoww marked this conversation as resolved.
Outdated
|
||
|
|
||
| # Create new ObservationEvent with a unique ID | ||
| # ID format: "{original_id}-merged" to ensure uniqueness | ||
| return ObservationEvent( | ||
| id=f"{obs_event.id}-merged", | ||
|
xingyaoww marked this conversation as resolved.
Outdated
|
||
| tool_name=obs_event.tool_name, | ||
| tool_call_id=obs_event.tool_call_id, | ||
| observation=merged_observation, | ||
| action_id=obs_event.action_id, | ||
| source=obs_event.source, | ||
| ) | ||
|
|
||
|
|
||
| class ToolResultUniquenessProperty(ViewPropertyBase): | ||
| """Each tool_call_id must have exactly one tool result. | ||
|
|
||
| When multiple observations exist for the same tool_call_id, this property: | ||
| 1. Merges AgentErrorEvent content into ObservationEvent (if both exist) | ||
| 2. Keeps only the merged/primary event and removes duplicates | ||
| 3. Prefers ObservationEvent > other observations > AgentErrorEvent | ||
| """ | ||
|
|
||
| def transform( | ||
| self, | ||
| current_view_events: list[LLMConvertibleEvent], | ||
| all_events: Sequence[Event], # noqa: ARG002 | ||
| ) -> dict[EventID, LLMConvertibleEvent]: | ||
| """Merge AgentErrorEvent content into ObservationEvent when both exist. | ||
|
|
||
| When an AgentErrorEvent and ObservationEvent share the same tool_call_id | ||
| (typically from a restart scenario), merge the error context into the | ||
| observation so the LLM has full context about what happened. | ||
| """ | ||
| # Group observations by tool_call_id | ||
| observations_by_tool_call: dict[ToolCallID, list[ObservationBaseEvent]] = ( | ||
| defaultdict(list) | ||
| ) | ||
|
|
||
| for event in current_view_events: | ||
| if isinstance(event, ObservationBaseEvent): | ||
| observations_by_tool_call[event.tool_call_id].append(event) | ||
|
|
||
| transforms: dict[EventID, LLMConvertibleEvent] = {} | ||
|
|
||
| for observations in observations_by_tool_call.values(): | ||
| if len(observations) <= 1: | ||
| continue | ||
|
|
||
| # Find ObservationEvents and AgentErrorEvents | ||
| obs_events = [o for o in observations if isinstance(o, ObservationEvent)] | ||
| error_events = [o for o in observations if isinstance(o, AgentErrorEvent)] | ||
|
|
||
| # Only merge if we have both an ObservationEvent and AgentErrorEvent(s) | ||
| if obs_events and error_events: | ||
| # Use the last ObservationEvent as the base | ||
| base_obs = obs_events[-1] | ||
| # Create merged observation with error context | ||
| merged_event = _create_merged_observation(base_obs, error_events) | ||
| transforms[base_obs.id] = merged_event | ||
|
|
||
| return transforms | ||
|
|
||
| def enforce( | ||
| self, | ||
| current_view_events: list[LLMConvertibleEvent], | ||
| all_events: Sequence[Event], # noqa: ARG002 | ||
| ) -> set[EventID]: | ||
| """Remove duplicate tool results for the same tool_call_id. | ||
|
|
||
| After transform() has merged error context into observations, this method | ||
| removes the remaining duplicate events (the original AgentErrorEvents and | ||
| any other duplicates). | ||
| """ | ||
| # Group observations by tool_call_id | ||
| observations_by_tool_call: dict[ToolCallID, list[ObservationBaseEvent]] = ( | ||
| defaultdict(list) | ||
| ) | ||
|
|
||
| for event in current_view_events: | ||
| if isinstance(event, ObservationBaseEvent): | ||
| observations_by_tool_call[event.tool_call_id].append(event) | ||
|
xingyaoww marked this conversation as resolved.
Outdated
|
||
|
|
||
| events_to_remove: set[EventID] = set() | ||
|
|
||
| for observations in observations_by_tool_call.values(): | ||
| if len(observations) <= 1: | ||
| continue | ||
|
|
||
| # Multiple observations for same tool_call_id - need to pick one | ||
| # Priority: ObservationEvent > other observations > AgentErrorEvent | ||
| # If same priority, keep the later one (more recent) | ||
| obs_events = [o for o in observations if isinstance(o, ObservationEvent)] | ||
| error_events = [o for o in observations if isinstance(o, AgentErrorEvent)] | ||
| other_events = [ | ||
| o | ||
| for o in observations | ||
| if not isinstance(o, (ObservationEvent, AgentErrorEvent)) | ||
| ] | ||
|
|
||
| # Determine which one to keep | ||
| if obs_events: | ||
| # Keep the last ObservationEvent, remove all others | ||
| to_keep = obs_events[-1] | ||
| elif other_events: | ||
| # Keep the last "other" event (UserRejectObservation, etc.) | ||
| to_keep = other_events[-1] | ||
| else: | ||
| # Only AgentErrorEvents, keep the last one | ||
| to_keep = error_events[-1] | ||
|
|
||
| # Mark all others for removal | ||
| for obs in observations: | ||
| if obs.id != to_keep.id: | ||
| events_to_remove.add(obs.id) | ||
|
|
||
| return events_to_remove | ||
|
|
||
| def manipulation_indices( | ||
| self, | ||
| current_view_events: list[LLMConvertibleEvent], | ||
| ) -> ManipulationIndices: | ||
| """Calculate manipulation indices for tool result uniqueness. | ||
|
|
||
| This property doesn't restrict manipulation - it only enforces | ||
| uniqueness when violations are detected. | ||
| """ | ||
| return ManipulationIndices.complete(current_view_events) | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@OpenHands review this PR, is there a way to do this WITHOUT adding this new transform fucntion?
I remember there's a util function that transform the full list of events to LLM messages, could we do it there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm on it! xingyaoww can track my progress at all-hands.dev
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary
I addressed the review comment by simplifying the implementation as suggested. The changes have already been pushed (commit
b7b1e6b3).What Changed
Instead of adding a new
transform()method toViewPropertyBaseand a newToolResultUniquenessPropertyclass, I moved the duplicate observation deduplication directly into the existingevents_to_messages()function.Files Changed
openhands-sdk/openhands/sdk/event/base.py- Added logic toevents_to_messages()to deduplicate consecutive observations with the sametool_call_id, plus a helper function_select_best_observation()tests/sdk/event/test_events_to_messages.py- AddedTestDuplicateObservationDeduplicationtest class with 7 tests covering the deduplication scenariosReverted/Removed:
openhands-sdk/openhands/sdk/context/view/properties/base.py- reverted (notransform()method)openhands-sdk/openhands/sdk/context/view/properties/__init__.py- revertedopenhands-sdk/openhands/sdk/context/view/view.py- revertedopenhands-sdk/openhands/sdk/context/view/properties/tool_result_uniqueness.py- deletedtests/sdk/context/view/properties/test_tool_result_uniqueness.py- deletedWhy This Approach Is Better
events_to_messages()already batchesActionEvents with the samellm_response_id- same pattern now handles observation deduplicationtransform()method added toViewPropertyBase