You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
🤖 fix: Remove premature commitment from todo_write description (#471)
Fixes premature commitment issues in both `todo_write` and `status_set`
tool descriptions.
## Problem
Both tool descriptions contained prescriptive language that pressured
agents to commit to narratives before validating outcomes:
### `todo_write` issues:
- "Use this for ALL complex, multi-step plans" - forced usage before
validation
- "Before finishing your response, ensure all todos are marked as
completed" - pressured false completions
- "Update frequently as work progresses" - created distraction from
errors
- Complex structural guidance about old/recent/current/immediate/far
future work - added cognitive load
### `status_set` issue:
- "Set a final status before completing that reflects the outcome" -
ambiguous timing allowed premature success claims
## Changes
### `todo_write`:
**Removed:**
- ❌ "Use this for ALL complex, multi-step plans"
- ❌ "Before finishing your response, ensure all todos are marked as
completed"
- ❌ "Update frequently"
- ❌ Complex structural guidance
**Added:**
- ✅ "The TODO list is displayed to the user at all times" -
contextualizes importance
- ✅ "If you hit the 7-item limit, summarize older completed items into
one line" - brings back useful guidance without rigid structure
- ✅ "If work fails or approach changes, update the list to reflect
reality" - explicit permission to show failures
- ✅ "Only mark tasks complete when they actually succeed" - clear
expectation
**Kept:**
- ONE in_progress at a time
- Ordering (completed first, in_progress, pending last)
- Tense guidelines (past/present progressive/imperative)
### `status_set`:
**Changed:**
- ✅ "Set a final status after completion, only claim success when
certain (e.g., after confirming checks passed)"
This makes the sequencing explicit: do work → verify outcome → set final
status.
## Impact
Agents should:
- Validate approach before creating TODOs
- Update TODOs to reflect failures, not hide them
- Only claim success after verifying outcomes
- Focus on actual work rather than tool maintenance
## Testing
No code logic changes - these are prompt/description updates. Will
observe agent behavior in real use.
_Generated with `cmux`_
0 commit comments