Description
What happened?
[Bug] Agent executes redundant/duplicate tool calls in a single step
Description
The Trae Agent currently executes all tool calls returned by the LLM without checking for duplicates. In some scenarios, particularly when the model is struggling or hallucinating, it may generate multiple identical tool calls within a single step (e.g., running the same grep command 5 times).
This behavior leads to significant inefficiencies and system instability.
Observed Behavior
During an evaluation run on SWE-bench_Verified (instance astropy__astropy-13453), the agent exhibited the following behavior in a single step (Step 22):
- Redundant grep commands: The agent issued grep -n "formats" /testbed/astropy/table/table.py multiple times.
- Redundant view commands: The agent requested to view the same file ranges repeatedly.
- Excessive Tool Calls: The step contained over 200 tool calls, many of which were duplicates.
Example Log Snippet:
{
  "tool_calls": [
    {
      "name": "bash",
      "arguments": { "command": "grep -n \"formats\" /testbed/astropy/table/table.py" }
    },
    {
      "name": "bash",
      "arguments": { "command": "grep -n \"formats\" /testbed/astropy/table/table.py" }
    },
    ...
  ]
}
Impact
- Token Wastage: The context window becomes bloated with redundant tool results. In the observed case, input tokens spiked to ~44,000.
- Performance Degradation: The massive context size degrades the model's reasoning capabilities for subsequent steps.
- System Instability: Executing hundreds of processes (e.g., bash commands) simultaneously caused the Docker container to hang and the commands to time out (Error running bash command: timed out).
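To quantify how much of a step is redundant, the duplicates can be counted directly from a log of the shape shown in the snippet above. The following is a minimal sketch, assuming each step is logged as a JSON object with a tool_calls list of name/arguments entries; the function name count_duplicate_calls and the sample data (including the ls command) are hypothetical and not part of trae_agent.

```python
import json
from collections import Counter


def count_duplicate_calls(step_log: str) -> Counter:
    """Count (name, arguments) signatures that occur more than once in a step."""
    step = json.loads(step_log)
    signatures = Counter(
        # sort_keys makes the signature independent of argument key order
        (call["name"], json.dumps(call["arguments"], sort_keys=True))
        for call in step.get("tool_calls", [])
    )
    # Keep only signatures that were repeated
    return Counter({sig: n for sig, n in signatures.items() if n > 1})


# Hypothetical sample step, modeled on the log snippet in this issue
sample_log = json.dumps({
    "tool_calls": [
        {"name": "bash", "arguments": {"command": 'grep -n "formats" /testbed/astropy/table/table.py'}},
        {"name": "bash", "arguments": {"command": 'grep -n "formats" /testbed/astropy/table/table.py'}},
        {"name": "bash", "arguments": {"command": "ls /testbed"}},
    ]
})

duplicates = count_duplicate_calls(sample_log)
for (name, args), n in duplicates.items():
    print(f"{n}x {name} {args}")
```

Running this over the trajectory of an affected instance would show which calls dominate a step before and after a fix.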
Proposed Solution
Implement a deduplication mechanism in the _tool_call_handler method within trae_agent/agent/base_agent.py.
Logic:
Before executing the list of tool_calls, iterate through them and filter out any call that has an identical name and arguments to a previously seen call in the current batch.
Implementation Draft:
# In trae_agent/agent/base_agent.py
import json

async def _tool_call_handler(self, tool_calls: list[ToolCall] | None, step: AgentStep) -> list[LLMMessage]:
    # Deduplicate tool calls within this step, keeping the first occurrence of each
    if tool_calls:
        unique_tool_calls = []
        seen_calls = set()
        for call in tool_calls:
            # Create a hashable signature from the tool name and its arguments
            try:
                call_signature = (call.name, json.dumps(call.arguments, sort_keys=True))
            except (TypeError, ValueError):
                # Fall back to str() when the arguments are not JSON-serializable
                call_signature = (call.name, str(call.arguments))
            if call_signature not in seen_calls:
                seen_calls.add(call_signature)
                unique_tool_calls.append(call)
        tool_calls = unique_tool_calls
    # Proceed with execution...
This ensures that even if the LLM requests the same action 10 times, it is executed only once, saving tokens and preventing system overload.
This issue is especially prominent with smaller models. I can implement the suggested change, test it, and raise a PR.
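For testing the idea outside the agent, the deduplication logic can be pulled into a standalone function. This is a sketch under stated assumptions, not the final patch: the ToolCall dataclass below is a hypothetical stand-in with only name, arguments, and call_id fields, and the deduplicate helper is not an existing trae_agent function. It also returns a mapping from each skipped call to the first identical call, since some LLM APIs expect a result for every tool call ID they issued and the handler may need to answer the skipped ones too.

```python
import json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToolCall:
    """Hypothetical stand-in for the agent's ToolCall type."""
    name: str
    arguments: dict[str, Any] = field(default_factory=dict)
    call_id: str = ""


def call_signature(call: ToolCall) -> tuple[str, str]:
    """Build a hashable signature from the tool name and its arguments."""
    try:
        return call.name, json.dumps(call.arguments, sort_keys=True)
    except (TypeError, ValueError):
        # Arguments that are not JSON-serializable fall back to str()
        return call.name, str(call.arguments)


def deduplicate(tool_calls: list[ToolCall]) -> tuple[list[ToolCall], dict[str, str]]:
    """Return the unique calls in original order, plus skipped_id -> first_id."""
    first_seen: dict[tuple[str, str], ToolCall] = {}
    unique: list[ToolCall] = []
    skipped: dict[str, str] = {}
    for call in tool_calls:
        sig = call_signature(call)
        if sig in first_seen:
            skipped[call.call_id] = first_seen[sig].call_id
        else:
            first_seen[sig] = call
            unique.append(call)
    return unique, skipped


# Usage with hypothetical call IDs
grep_cmd = {"command": 'grep -n "formats" /testbed/astropy/table/table.py'}
calls = [
    ToolCall("bash", grep_cmd, "call_a"),
    ToolCall("bash", dict(grep_cmd), "call_b"),  # identical to call_a
    ToolCall("bash", {"command": "ls /testbed"}, "call_c"),
]
unique, skipped = deduplicate(calls)
print([c.call_id for c in unique], skipped)
```

Only the first occurrence of each signature is executed; whether skipped calls should reuse the first call's result or receive a short "duplicate call skipped" message is a design choice left to the PR.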
What did you expect to happen?
I expected the agent not to execute redundant tool calls, since their results bloat token usage in subsequent calls.
Traceback
No response
What is your system, Python, dependency version?
No response
Additional information that you believe is relevant to this bug
No response