Open
Description
Problem Summary:
The user asked the assistant to check the inbox and sort up to 20 tasks. The assistant called flexus_bot_kanban() to list the board, saw 1 item in the Inbox, and replied '1 tasks sorted', but it never called flexus_bot_kanban(op="inbox_to_todo") and never assigned the task to this chat. From the user's perspective, the assistant claimed to have sorted a task without actually moving anything.
Evidence:
- thread_id: ISb03AGX0E (from thread.json).
- messages.json shows a tool call: assistant (ftm_num 3) called flexus_bot_kanban() with no args and received the board contents (message_100_4.txt).
- message_100_4.txt (tool response) shows Inbox contained: [{"id": "6pknVzcO4B", "title": "Interview user on cloud AI DevOps product idea"}].
- The assistant's final reply (ftm_num 5) is '1 tasks sorted', but messages.json and the logs contain no follow-up tool calls such as inbox_to_todo or assign_to_this_chat.
- thread.json ft_title indicates this was a scheduled sort task ('SCHED_TASK_SORT EVERY:1m ...'), which suggests the assistant is expected to move items autonomously.
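The missing follow-up calls can be confirmed mechanically rather than by reading the transcript. The sketch below is only an illustration: the message shape (a `tool_calls` list with `name` and `arguments` keys) and the helper name `movement_calls` are assumptions, not the actual messages.json schema, and the sample transcript is a hand-written stand-in for the thread described above.

```python
# Hypothetical message shape: each message is a dict with an optional
# "tool_calls" list of {"name": ..., "arguments": {...}} entries.
# The real messages.json schema may differ.
MOVE_OPS = {"inbox_to_todo", "assign_to_this_chat"}


def movement_calls(messages):
    """Return the kanban tool calls in a transcript that move or assign tasks."""
    found = []
    for msg in messages:
        for call in msg.get("tool_calls", []):
            if call.get("name") != "flexus_bot_kanban":
                continue
            op = call.get("arguments", {}).get("op")
            if op in MOVE_OPS:
                found.append(call)
    return found


# Hand-written stand-in for the thread in this report: one listing call
# with no arguments, then the final reply.
transcript = [
    {"ftm_num": 3, "tool_calls": [{"name": "flexus_bot_kanban", "arguments": {}}]},
    {"ftm_num": 5, "content": "1 tasks sorted"},
]

if not movement_calls(transcript):
    print("No move/assign call found before the final reply")
```

An empty result for a thread whose final reply claims sorted tasks is the mismatch this issue describes.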
Root cause analysis:
The assistant listed the kanban board but never performed the follow-up action of moving the Inbox item(s) into Todo or assigning them to the current chat. Possible causes include:
- Missing decision logic to choose which item(s) to move and which operation to call after listing the inbox (no auto-selection implemented for single unambiguous items).
- The assistant incorrectly assumed that listing the inbox counts as sorting, and responded with the summary message without performing side-effectful calls.
- Lack of robust checks: no verification that a tool op that performs movement was called before reporting 'N tasks sorted'.
Recommended fix:
- After flexus_bot_kanban() returns an Inbox list, if there is exactly 1 item, the assistant should call flexus_bot_kanban(op="inbox_to_todo", args={"join": [<task_id>]}) or flexus_bot_kanban(op="assign_to_this_chat", args={"batch": [<task_id>]}) depending on policy, then reply '1 tasks sorted'.
- Add a verification step before replying: only say 'N tasks sorted' if the move/assign tool call succeeded and is present in the conversation history.
- Add unit and e2e tests that simulate scheduled runs and check that the assistant performs side-effectful operations when expected.
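The recommendations above can be sketched as a small routine plus a fake tool for testing. This is a rough illustration under stated assumptions, not the bot's actual implementation: the response shapes (an `inbox` key on the board listing, an `ok` flag on the move result), the `sort_inbox` helper, and the `FakeKanban` class are all hypothetical. Only the `op="inbox_to_todo"` call with `args={"join": [...]}` is taken from this report. The sketch also moves every listed item up to the limit, which is one possible policy.

```python
def sort_inbox(kanban, limit=20):
    """List the inbox, move up to `limit` tasks, and report only confirmed moves."""
    board = kanban()  # listing call, no side effects
    task_ids = [task["id"] for task in board.get("inbox", [])][:limit]
    if not task_ids:
        return "0 tasks sorted"
    # Side-effectful call; the "ok" flag in the result is an assumed shape.
    result = kanban(op="inbox_to_todo", args={"join": task_ids})
    if not result.get("ok"):
        return "Sorting failed: inbox_to_todo did not succeed"
    return f"{len(task_ids)} tasks sorted"


class FakeKanban:
    """In-memory stand-in for flexus_bot_kanban, for tests only (hypothetical)."""

    def __init__(self, inbox):
        self.inbox = list(inbox)
        self.todo = []
        self.calls = []  # records the op of every call, None for a listing

    def __call__(self, op=None, args=None):
        self.calls.append(op)
        if op is None:
            return {"inbox": list(self.inbox), "todo": list(self.todo)}
        if op == "inbox_to_todo":
            ids = set(args["join"])
            self.todo.extend(t for t in self.inbox if t["id"] in ids)
            self.inbox = [t for t in self.inbox if t["id"] not in ids]
            return {"ok": True}
        return {"ok": False}


fake = FakeKanban([{"id": "6pknVzcO4B", "title": "Interview user on cloud AI DevOps product idea"}])
reply = sort_inbox(fake)
# The reply is only trusted if the move call was actually recorded.
assert "inbox_to_todo" in fake.calls
```

Recording calls on the fake lets a scheduled-run test assert on side effects rather than on the wording of the final reply.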
If this is a duplicate of an existing issue, link it here.