feat: Structured LLM reasoning added for auditable and constrained decision making #188
Conversation
@EwoutH @wang-boyu @khushiiagrawal Please review and give some feedback.
mesa_llm/reasoning/decision.py (Outdated)

```python
        response_format=DecisionOutput,
    )
    formatted_response = json.loads(rsp.choices[0].message.content)
```
response_format=DecisionOutput is already passed to the llm call. Parsing json again here ties this code to the raw response format. It might be cleaner to rely on the wrapper's structured output instead.
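A minimal sketch of what relying on structured output could look like, assuming the wrapper hands back the model's JSON content and the schema is a Pydantic model (the names and trimmed fields here are illustrative, not the repo's actual API):

```python
from pydantic import BaseModel


class DecisionOutput(BaseModel):
    # Trimmed to two fields for illustration; the real schema has more.
    chosen_option: str
    next_action: str


def parse_decision(raw_content: str) -> DecisionOutput:
    # Validate the JSON against the schema in one step, instead of
    # calling json.loads and trusting the shape of the raw dict.
    return DecisionOutput.model_validate_json(raw_content)
```

This keeps the call site decoupled from the raw response format: if the wrapper later returns an already-parsed object, only this helper changes.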
Thank you for the review!
I will follow up after making the necessary changes.
@souro26 I have tried to resolve your flagged issues.
Please review!
mesa_llm/reasoning/decision.py (Outdated)

```python
    """

    def get_decision_prompt(self, obs: Observation) -> list[str]:
        prompt_list = [self.agent.memory.get_prompt_ready()]
```
This assumes the memory backend always has get_prompt_ready and get_communication_history. Adding a small guard here would make this reasoning class safer with other memory implementations.
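One way to add that guard is to probe for the optional methods before calling them. This is a hypothetical helper, not the repo's code; the method names come from the review comment above:

```python
def build_prompt_list(memory) -> list[str]:
    """Collect prompt fragments only from memory backends that support them.

    The guard keeps the reasoning class usable with memory implementations
    that lack get_prompt_ready or get_communication_history.
    """
    prompt_list: list[str] = []
    for method_name in ("get_prompt_ready", "get_communication_history"):
        method = getattr(memory, method_name, None)
        if callable(method):
            prompt_list.append(method())
    return prompt_list
```

A backend that implements neither method then simply contributes nothing to the prompt, instead of raising `AttributeError`.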
Thank you for the review!
I will follow up after making the necessary changes.
@souro26, I have tried to resolve the issues!
Please review these changes!
Codecov Report

❌ Patch coverage — additional details and impacted files:

```
@@            Coverage Diff             @@
##             main     #188      +/-   ##
==========================================
+ Coverage   90.64%   90.85%   +0.20%
==========================================
  Files          19       20       +1
  Lines        1540     1673     +133
==========================================
+ Hits         1396     1520     +124
- Misses        144      153       +9
```

☔ View full report in Codecov by Sentry.
Summary
This PR adds a new structured reasoning strategy, `DecisionReasoning`, on the `cot` branch.

The change does not modify the existing `CoTReasoning` implementation. Instead, it introduces a separate reasoning module for cases where the model should make decisions through a strict, machine-readable schema rather than free-form chain-of-thought text. The new reasoning path validates the model's output and forwards only `next_action` to the executor.

What This PR Adds
New reasoning module

Added: `mesa_llm/reasoning/decision.py`

This module defines:

- `DecisionOption`
- `DecisionOutput`
- `DecisionReasoning`

New reasoning schema
The model is required to return a strict JSON object containing:

- `goal`
- `constraints`
- `known_facts`
- `unknowns`
- `assumptions`
- `options`
- `chosen_option`
- `rationale`
- `confidence`
- `risks`
- `next_action`

Each option also contains:

- `name`
- `description`
- `tradeoffs`
- `score`

Validation
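Assuming the schema mirrors the field list above and the documented `confidence` range, the two Pydantic models could be sketched like this (field types are guesses, not the repo's exact definitions):

```python
from pydantic import BaseModel, Field


class DecisionOption(BaseModel):
    name: str
    description: str
    tradeoffs: list[str]
    score: float


class DecisionOutput(BaseModel):
    goal: str
    constraints: list[str]
    known_facts: list[str]
    unknowns: list[str]
    assumptions: list[str]
    options: list[DecisionOption]
    chosen_option: str
    rationale: str
    # Field(ge=..., le=...) enforces the documented 0.0-1.0 range at parse time.
    confidence: float = Field(ge=0.0, le=1.0)
    risks: list[str]
    next_action: str
```

With this shape, an out-of-range `confidence` raises a `ValidationError` before the decision ever reaches the executor.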
The structured response is validated through Pydantic before execution.

This guarantees, among other checks, that `confidence` lies in the range `0.0` to `1.0`.

Execution flow
The existing tool execution pipeline is preserved. The flow is now:

1. The model returns the structured decision object.
2. Only `next_action` is extracted from it.
3. `next_action` is executed through the existing tool executor.

This means the executor no longer depends on a long reasoning blob for this reasoning mode.
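That hand-off can be sketched as a thin dispatch step. The callable registry below is a hypothetical stand-in for the repo's tool executor:

```python
from typing import Callable


def execute_decision(decision: dict, tools: dict[str, Callable[[], str]]) -> str:
    """Forward only next_action to a tool; the rest of the decision
    object is kept for auditing, not execution."""
    action = decision["next_action"]
    if action not in tools:
        raise KeyError(f"next_action {action!r} has no registered tool")
    return tools[action]()
```

The executor never sees `rationale`, `options`, or any other reasoning field, which is exactly what decouples it from the reasoning prose.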
Memory integration

The decision artifact is stored with `type="decision"`.

This makes the reasoning result easier to inspect and supports future analysis of the recorded decisions.
Tests
Added: `tests/test_reasoning/test_decision.py`

The test coverage includes, among other cases, the `next_action` path.

What This PR Does Not Change
This PR does not modify:

- `mesa_llm/reasoning/cot.py`
- `CoTReasoning`
- `ReActReasoning`
- `ReWOOReasoning`

So while this branch is named `cot`, the latest change is not an update to the existing CoT implementation. It is a new, alternative reasoning mode focused on structured decision-making.

Why This Change
The current reasoning styles in the repo support free-form reasoning over observations. What was missing was a reasoning mode that explicitly separates the decision itself (goal, options, chosen action) from the prose that explains it. This is useful when decision quality and inspectability matter more than verbose reasoning prose.
Example Output

```json
{
  "goal": "Reach food",
  "constraints": ["One move this turn"],
  "known_facts": ["Food is visible to the east"],
  "unknowns": ["Whether another agent will block the path"],
  "assumptions": ["The east cell remains traversable this step"],
  "options": [
    {
      "name": "move_east",
      "description": "Move toward visible food",
      "tradeoffs": ["Fast progress", "Potential contention"],
      "score": 0.88
    }
  ],
  "chosen_option": "move_east",
  "rationale": "It best advances the goal with acceptable risk.",
  "confidence": 0.78,
  "risks": ["Another agent may reach the food first"],
  "next_action": "move_east"
}
```

Only `next_action` is forwarded to the executor.

Validation Performed
The following checks were run: pytest (reasoning tests) and Ruff.

Results:

- reasoning tests passed
- Ruff passed
- only an existing upstream Mesa deprecation warning was observed during pytest
Files Added

- `mesa_llm/reasoning/decision.py`
- `tests/test_reasoning/test_decision.py`

Notes For Reviewers
This PR should be reviewed as:
- a new reasoning capability
- a structured alternative to free-form CoT
- a non-breaking addition to the current reasoning system

It should not be interpreted as an enhancement to the existing `CoTReasoning` class itself.