fix(rewoo): prevent plan() and aplan() replay from mutating cached tool_calls by BEASTSHRIRAM · Pull Request #192 · mesa/mesa-llm

BEASTSHRIRAM · 2026-03-12T10:46:13Z

Summary

Fixes a silent tool replay bug in ReWOOReasoning.plan() and aplan() where multi-step tool call replays mutate self.current_plan.tool_calls in-place, causing the agent to silently replay incorrect tool steps on subsequent calls. The fix stores the original tool calls separately and returns a copy instead of mutating the cached plan.

Bug / Issue

Fixes #184 (related to async memory issue, but exposes a broader sync/async mutation bug)

In a complex reasoning task using ReWOOReasoning (e.g., the negotiation example
where an agent plans multiple steps ahead), when a plan has 3+ tool calls:

First replay call: remaining_tool_calls=3, len(current_plan.tool_calls)=3, index = 3-3 = 0. ✓ Correct tool returned.
First replay mutates: current_plan.tool_calls = [tool_0] (now length 1).
Second replay call: remaining_tool_calls=2, len(current_plan.tool_calls)=1, index = 1-2 = -1. ✗ Negative index, wrong tool.
Subsequent calls: the indices stay corrupted, so the agent keeps replaying the wrong tool step.

The mutation happens because both branches of the early-return path used a mutable alias:

# BEFORE (buggy): mutation in both plan() and aplan()
current_plan = self.current_plan
current_plan.tool_calls = tool_call  # ← mutates the cached original

No error is raised. The agent silently replays the wrong tool, leading to semantic failures
in tasks requiring multi-step plans (negotiation, complex reasoning, sequential tool use).
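
The trace above can be reproduced in isolation. The following is a minimal sketch, not the mesa-llm code: FakeLLMPlan and buggy_replay are hypothetical stand-ins, and it assumes the pre-fix path selects one tool call by index and then overwrites tool_calls on the cached object.

```python
from dataclasses import dataclass, field


@dataclass
class FakeLLMPlan:
    """Hypothetical stand-in for the cached LLM plan object."""
    tool_calls: list = field(default_factory=list)


def buggy_replay(cached_plan, remaining_tool_calls):
    """Mimic the pre-fix early-return path (assumed shape, for illustration)."""
    index = len(cached_plan.tool_calls) - remaining_tool_calls
    selected = [cached_plan.tool_calls[index]]
    current_plan = cached_plan          # alias, not a copy
    current_plan.tool_calls = selected  # overwrites the cached list
    return selected


cached = FakeLLMPlan(tool_calls=["tool_0", "tool_1", "tool_2"])
first = buggy_replay(cached, remaining_tool_calls=3)   # index 3-3 = 0
second = buggy_replay(cached, remaining_tool_calls=2)  # index 1-2 = -1
```

Because Python accepts negative indices, the second call raises nothing and returns tool_0 again instead of tool_1, which matches the silent behaviour described above.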

Implementation

Added self._all_tool_calls = [] to __init__() to store the original tool calls list separately.
When a new plan is generated, a copy is stored: self._all_tool_calls = list(rewoo_plan.llm_plan.tool_calls)
The index is now computed from the original list: index_of_tool = len(self._all_tool_calls) - self.remaining_tool_calls

Return a copy of the plan with only the required tool call, leaving the original untouched:

temp_plan = copy.copy(self.current_plan)
temp_plan.tool_calls = [self.current_plan.tool_calls[index_of_tool]]
return Plan(llm_plan=temp_plan, ...)

Applied to both plan() and aplan() methods (identical early-return logic in both)

# _all_tool_calls decouples indexing from mutations
# temp_plan decouples return value from original
# both prevent silent failures on subsequent replays
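
Put together, the approach might look like the sketch below. It is a simplified illustration under stated assumptions, not the actual ReWOOReasoning implementation: ReplaySketch, FakeLLMPlan, store_new_plan and replay_next are hypothetical names, and where remaining_tool_calls is decremented in the real class is assumed here.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class FakeLLMPlan:
    """Hypothetical stand-in for the cached LLM plan object."""
    tool_calls: list = field(default_factory=list)


class ReplaySketch:
    """Hypothetical, simplified model of the replay state."""

    def __init__(self):
        self.current_plan = None
        self.remaining_tool_calls = 0
        self._all_tool_calls = []  # original tool calls, never overwritten

    def store_new_plan(self, llm_plan):
        self.current_plan = llm_plan
        self._all_tool_calls = list(llm_plan.tool_calls)  # independent list
        self.remaining_tool_calls = len(self._all_tool_calls)

    def replay_next(self):
        # Index against the stable list, not the cached plan's attribute.
        index_of_tool = len(self._all_tool_calls) - self.remaining_tool_calls
        # Shallow copy: rebinding tool_calls affects only the copy.
        temp_plan = copy.copy(self.current_plan)
        temp_plan.tool_calls = [self._all_tool_calls[index_of_tool]]
        self.remaining_tool_calls -= 1  # assumed to happen here for the demo
        return temp_plan


sketch = ReplaySketch()
sketch.store_new_plan(FakeLLMPlan(tool_calls=["tool_0", "tool_1", "tool_2"]))
replayed = [sketch.replay_next().tool_calls[0] for _ in range(3)]
```

A shallow copy is enough in this sketch because only the tool_calls attribute is rebound on the copy; the list held by the cached plan is never modified.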

Testing

All 20 ReWOO reasoning tests pass, including:

test_plan_with_remaining_tool_calls(): verifies the correct tool is returned
test_aplan_with_remaining_tool_calls(): verifies the async path works correctly
test_remaining_tool_calls_decrement(): checks the index calculation across multiple scenarios

Tests confirm that each replay returns the correct tool from the full plan without
corrupting self.current_plan.tool_calls.
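
A regression check for this property could be sketched as follows. This is not one of the tests named above: FakePlan and replay_once are hypothetical helpers that mirror the copy-based replay, and the test name is illustrative only.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class FakePlan:
    """Hypothetical stand-in for the cached plan."""
    tool_calls: list = field(default_factory=list)


def replay_once(cached_plan, all_tool_calls, remaining_tool_calls):
    """Hypothetical helper: return a copy holding only the next tool call."""
    index_of_tool = len(all_tool_calls) - remaining_tool_calls
    temp_plan = copy.copy(cached_plan)
    temp_plan.tool_calls = [all_tool_calls[index_of_tool]]
    return temp_plan


def test_replay_leaves_cached_plan_untouched():
    cached = FakePlan(tool_calls=["tool_0", "tool_1", "tool_2"])
    snapshot = list(cached.tool_calls)

    returned = [
        replay_once(cached, snapshot, remaining).tool_calls[0]
        for remaining in (3, 2, 1)
    ]

    # Each replay yields the next step, and the cache is unchanged.
    assert returned == ["tool_0", "tool_1", "tool_2"]
    assert cached.tool_calls == snapshot
```

Asserting on the cached list after every replay sequence is what distinguishes the fixed behaviour from the aliasing bug, which would pass a single-replay check.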

Additional Notes

This is a silent correctness bug in ReWOOReasoning, the multi-step planning strategy.
Any agent using ReWOO with plans containing 2+ tool calls is affected. The failure is
invisible: no exception is raised; the agent simply replays the wrong tool steps, causing
task failures that are hard to debug.

No API changes. No new dependencies. The fix adds one import (copy) and decouples state
management in two methods. Fixes both sync and async paths with identical logic.


coderabbitai · 2026-03-12T10:46:24Z

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.



BEASTSHRIRAM and others added 2 commits March 12, 2026 16:09

fix(rewoo): prevent plan() and aplan() replay from mutating cached tool_calls

d23af88

[pre-commit.ci] auto fixes from pre-commit.com hooks

a824c90

for more information, see https://pre-commit.ci

souro26 reviewed Mar 12, 2026

mesa_llm/reasoning/rewoo.py (outdated, resolved)

fix(reewo): Fallback to current_plan.tool_calls when _all_tool_calls is not yet populated

a748169

BEASTSHRIRAM wants to merge 3 commits into mesa:main from BEASTSHRIRAM:fix/rewoo-mutable-alias
