subtasks alpha version (still in development) #1123

shaybc · 2025-02-22T22:05:00Z

Description

Added a feature that lets a task create sub-tasks that run separately, without depending on the main task or on the history of previous tasks. This reduces the context size sent to the AI and helps avoid forgetfulness, confusion, and hallucinations by the LLM.

This is still a WIP. I haven't yet gotten the LLM to understand that the task has already been performed by the sub-task; for some reason it insists on re-executing the task in the main/parent task, even when it is given the exact task description marked as successfully done.

Type of change

new Feature

[ ] Bug fix (non-breaking change which fixes an issue)
[x] New feature
[ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
[ ] This change requires a documentation update

How Has This Been Tested?

Locally only; it is not really tested yet, since the feature is not fully working.
I created a file called hello_world.py and used this prompt:

start new_task and change text "hello all" to "hello world"

then start new_task and add another print that will print the words "thank you"

Tried OpenAI, Gemini, and Ollama models.

Checklist:

My code follows the patterns of this project
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation

Additional context

Related Issues

Reviewers

@mrubens

Important

Adds subtask management to Cline class, allowing independent task execution and updates related tests and API handling.

Behavior:
- Introduces subtasks in Cline.ts with isSubTask and isPaused flags.
- Adds setSubTask(), resumePausedTask(), and waitForResume() methods in Cline.ts.
- Updates ClineProvider.ts to manage a stack of Cline instances for task/subtask management.
- Modifies registerCommands.ts to use removeClineFromStack() instead of clearTask().
Testing:
- Updates ClineProvider.test.ts to test new subtask functionality.
API:
- Updates createClineAPI() in index.ts to use removeClineFromStack() for starting new tasks.
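
A minimal sketch of the stack idea summarized above, not the PR's actual code. `removeClineFromStack()` and `getClineStackSize()` are names that appear in this thread; `TaskLike`, `TaskStack`, `addClineToStack()`, and `getCurrentCline()` are hypothetical names used only for illustration, and the pause/un-pause behavior is an assumption.

```typescript
// Hypothetical stand-in for a Cline instance; only the fields needed here.
interface TaskLike {
  taskId: string;
  isSubTask: boolean;
  isPaused: boolean;
}

// Sketch of a provider-side stack: the top entry is the task currently running.
class TaskStack {
  private stack: TaskLike[] = [];

  // Pause the current top (if any) and push the new task as its sub-task.
  addClineToStack(task: TaskLike): void {
    const parentTask = this.stack[this.stack.length - 1];
    if (parentTask) {
      parentTask.isPaused = true;
      task.isSubTask = true;
    }
    this.stack.push(task);
  }

  // Pop the finished task and un-pause its parent so it can continue.
  removeClineFromStack(): TaskLike | undefined {
    const finished = this.stack.pop();
    const parentTask = this.stack[this.stack.length - 1];
    if (parentTask) {
      parentTask.isPaused = false;
    }
    return finished;
  }

  getCurrentCline(): TaskLike | undefined {
    return this.stack[this.stack.length - 1];
  }

  getClineStackSize(): number {
    return this.stack.length;
  }
}

const tasks = new TaskStack();
const t1: TaskLike = { taskId: "T1", isSubTask: false, isPaused: false };
const t2: TaskLike = { taskId: "T2", isSubTask: false, isPaused: false };
tasks.addClineToStack(t1); // T1 runs
tasks.addClineToStack(t2); // T1 is paused, T2 runs as its sub-task
```

Calling `tasks.removeClineFromStack()` at this point would return `t2` and un-pause `t1`.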

This description was generated automatically for c3c3874 and will update as commits are pushed.

changeset-bot · 2025-02-22T22:05:03Z

⚠️ No Changeset found

Latest commit: b46da7b

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


shaybc · 2025-02-23T19:36:21Z

I found the failing test: it uses the private provider.cline member, which no longer exists. The test was accessing a private member of ClineProvider directly, which is bad practice, even in a test.

I temporarily recreated the cline member as public, just to run Find All References, and the only reference is in ClineProvider.test. I will fix it.


shaybc · 2025-02-24T17:39:31Z

I am currently working on two last tweaks:

Restoring the mode that each task or sub-task was originally launched with. For example, if Task1 launches Task2, which in turn launches Task3, the stack grows like this:
T1(Architect)
T1(Architect) -> T2(Code)
T1(Architect) -> T2(Code) -> T3(QA)
When they finish, the current code behaves like this:
T1(Architect) -> T2(Code) -> T3(QA)
T1(Architect) -> T2(QA)
T1(QA)

and it should be:
T1(Architect) -> T2(Code) -> T3(QA)
T1(Architect) -> T2(Code)
T1(Architect)
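
One way to get the desired behavior is to store the launch mode with each stack entry and restore the parent's mode whenever a child is popped. This is a sketch under that assumption; `Mode`, `StackEntry`, `ModeAwareStack`, and all of its members are hypothetical names, not code from this PR.

```typescript
// Hypothetical names throughout; this is not the PR's code.
type Mode = "Architect" | "Code" | "QA";

interface StackEntry {
  taskNumber: number;
  mode: Mode; // the mode this task was launched with
}

class ModeAwareStack {
  private entries: StackEntry[] = [];
  activeMode: Mode | undefined = undefined;

  // Launching a task switches the active mode to the new task's mode.
  push(mode: Mode): void {
    this.entries.push({ taskNumber: this.entries.length + 1, mode });
    this.activeMode = mode;
  }

  // When a task finishes, restore the mode its parent was launched with
  // instead of leaving the finished child's mode active.
  pop(): void {
    this.entries.pop();
    const parentEntry = this.entries[this.entries.length - 1];
    this.activeMode = parentEntry ? parentEntry.mode : undefined;
  }

  describe(): string {
    return this.entries.map((e) => `T${e.taskNumber}(${e.mode})`).join(" -> ");
  }
}

const modeStack = new ModeAwareStack();
modeStack.push("Architect");
modeStack.push("Code");
modeStack.push("QA");
modeStack.pop(); // T3 finished; the active mode goes back to T2's "Code"
const afterT3 = modeStack.describe(); // "T1(Architect) -> T2(Code)"
```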

Adding a task number to each task and sub-task, so a user can at least identify each execution's sequence number and order. For this I am adding a task_metadata.json file for each task.

…_add_subtasks

…setTaskNumber mock

…tTaskNumber, getTaskNumber)

KJ7LNW · 2025-03-05T04:30:28Z

Upon reviewing the code, it appears that there are no new system instructions. Did I read that correctly?

It looks like your code is simple in design: a stack of tasks is generated, and the new code passes the completion message from the child task context into the parent task context (?) to let the parent proceed?

I like that there is no involvement with system instructions; the behavior appears automatic. This also means that if there is a bug, it can probably only affect users who use the new_task feature.

(FYI: I was able to get it to merge and tweaked the conflicting line I showed above; hopefully it will not affect anything. Still, I think you will need a rebase or some merge magic to clean it up.)

KJ7LNW · 2025-03-05T05:01:31Z

This is pretty cool, I have tried to crash it with some recursive top level instructions, but I couldn't...it performed great.

These worked in sonnet-3.5, just paste them and create a task. It is fun if you enable "auto Modes", so you do not have to click approve...and then just watch it fly by:

Fibonacci sequence:

(prev=1, curr=1, depth=4): output curr and if depth-- > 0 create a `new_task` with prev=curr and curr=prev+curr to generate fibonacci sequence
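
For reference, the same recursion as plain code, useful for checking what the chain of tasks should output. This is a sketch in which each recursive call stands in for one `new_task`; `fibTasks` is a hypothetical name, and it assumes both updates in the prompt use the old values of `prev` and `curr`.

```typescript
// Each recursive call stands in for one `new_task`; `fibTasks` is a hypothetical name.
function fibTasks(prev: number, curr: number, depth: number, out: number[] = []): number[] {
  out.push(curr); // "output curr"
  if (depth > 0) {
    // "depth-- > 0": compare first, then hand the decremented depth to the child
    fibTasks(curr, prev + curr, depth - 1, out);
  }
  return out;
}

const fibOutputs = fibTasks(1, 1, 4);
console.log(fibOutputs.join(", ")); // 1, 2, 3, 5, 8
```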

Pascal's triangle:

(row=[1], depth=4): output current row and if depth-- > 0 create a `new_task` with next row calculated by adding 1 at edges and summing adjacent numbers
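
The expected rows can be checked the same way. Again a sketch where each recursive call stands in for one `new_task`; `nextRow` and `pascalTasks` are hypothetical names.

```typescript
// `nextRow` and `pascalTasks` are hypothetical names.
function nextRow(row: number[]): number[] {
  const inner: number[] = [];
  for (let i = 0; i + 1 < row.length; i++) {
    inner.push(row[i] + row[i + 1]); // sum adjacent numbers
  }
  return [1].concat(inner, [1]); // add 1 at both edges
}

// Each recursive call stands in for one `new_task`.
function pascalTasks(row: number[], depth: number, out: number[][] = []): number[][] {
  out.push(row); // "output current row"
  if (depth > 0) {
    pascalTasks(nextRow(row), depth - 1, out);
  }
  return out;
}

const pascalRows = pascalTasks([1], 4);
for (const r of pascalRows) {
  console.log(r.join(" "));
}
```

With `depth=4` this yields five rows, from `1` up to `1 4 6 4 1`.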

Recursive binary search:

(N=, depth=2) Output the node value N. Then, if depth > 0: 1. Create a left child `new_task` with a randomly chosen value that is LESS than the current node value N. Create a right child `new_task` with a randomly chosen value that is GREATER than the current node value. Instruct each child task, to do the same and decrement depth for each call. each child must show their knowledge of the tree and in mermaid diagram with smaller values on the left and larger values on the right at each level, and then `attempt_completion` with the tree they are aware of.

I think my instructions above are working, and I wanted to go inspect each step of the binary tree invocations, but there are just too many tasks in my history to inspect in a way that I understand the invocation order. These binary search instructions are probably working, but who knows. At least the final result looks right...

So, is new_task Turing complete? Time will tell...

Suggestions, issues:

- If the AI attempts to create a task and, instead of clicking "Accept", I respond with a comment, the task locks up and I have to close the task and reopen it.
- It would be really nice to be able to go to any parent task and click on the invocation of a child task to jump there easily, especially in tree fan-outs or other complex task hierarchies.
- Switching back to the parent task should require approval unless "Auto-approve: Switch modes & create tasks" is checked.
- Clicking X to close the child task took me back to the parent task, which immediately started executing, but I really was not ready for that. I closed the child task because I needed to check something in another task. I am not sure that closing a child task should take you to the parent task; this could be annoying for deeply nested tasks. To keep the existing workflow simple, I think closing any task should take you to the task history as usual, and then you can navigate to whatever you want.
- For future work, it could be nice to see a tree of tasks in the task history, grouping related tasks by call stack instead of by "most recently modified task" as it sorts by default now.
  - Another analogy would be a history sort bubble that sorts by "thread", like some email programs.

shaybc · 2025-03-05T09:23:47Z

Great work on this - it's really exciting to see it work!

I do feel a little nervous though about the implications of using a stack, keeping the parent waiting in a loop, etc, and want to test it out a little more before we ship this.

The other thing I'm wondering about is whether it would be worth spending a little more time on the approach where we only have one Cline active at a time and we reload them from history instead of keeping them in memory.

shaybc · 2025-03-05T10:01:35Z

Great work on this - it's really exciting to see it work!

I do feel a little nervous though about the implications of using a stack, keeping the parent waiting in a loop, etc, and want to test it out a little more before we ship this.

The other thing I'm wondering about is whether it would be worth spending a little more time on the approach where we only have one Cline active at a time and we reload them from history instead of keeping them in memory.

The way I see it, sub-tasks are a gateway feature to some advanced future features that require in-memory interactions. The reasons I did not choose to reload from history are:

- The number one reason: the future plan to allow sub-tasks to run in parallel where applicable.
- Another future feature could let the user state a maximum token count or price for the whole task. If the parent is alive, it can monitor the sub-task's token usage or price, decide whether it is within its allocated quota, pause the sub-task, get the user's approval to allocate extra budget, and resume the sub-task or sub-tasks.
- The ability for a sub-task to communicate with its parent task: to ask it to do something else, to give it more data (such as previous sub-tasks' results), or to relaunch it with extra text in the prompt or simply as a retry.
- Another future feature is the ability to declare a timeout on the parent task if required (in case sub-tasks take too long to process).
- A stack would be required in that approach anyway, since the order in which tasks are saved does not reflect the launch hierarchy: if the flow is T1 -> T2, return to T1 -> T3, the history order would be T1 -> T2 -> T3, which does not show that T3 was launched by T1.
- Side note: a sub-task can be launched with a certain mode; we also need to add the ability to declare the model we wish to use in that specific task, and

I do not see a downside to keeping a task running while it waits on a promise for the sub-task to end: the AI is not impacted, the memory consumption is negligible, the CPU is not loaded by a few pending promises, and the disk is continuously updated with the task's changes the second they occur (a task does not wait until the end to persist).
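
A minimal sketch of what "waiting on a promise" can look like. The method names `waitForResume()` and `resumePausedTask()` and the `isPaused` flag come from the PR summary; the bodies, the `PausableTask` class, and the `lastMessage` parameter are assumptions for illustration, and the PR's real implementation may differ (for example, by polling a flag).

```typescript
// Sketch only: method names come from the PR summary, bodies are assumptions.
class PausableTask {
  isPaused = false;
  private resumeWaiter: ((lastMessage: string) => void) | undefined;

  // Called by the parent after it launches a sub-task. The returned promise
  // stays pending (no polling, no CPU use) until resumePausedTask() is called.
  waitForResume(): Promise<string> {
    this.isPaused = true;
    return new Promise<string>((resolve) => {
      this.resumeWaiter = resolve;
    });
  }

  // Called when the sub-task finishes; hands its last message to the parent.
  resumePausedTask(lastMessage: string): void {
    this.isPaused = false;
    const waiter = this.resumeWaiter;
    this.resumeWaiter = undefined;
    if (waiter) {
      waiter(lastMessage);
    }
  }
}

const parentTask = new PausableTask();
const pending = parentTask.waitForResume(); // parent is now parked
const pausedWhileWaiting = parentTask.isPaused;
parentTask.resumePausedTask("sub-task done"); // sub-task finished
pending.then((msg) => console.log(`parent resumed with: ${msg}`));
```

While the parent is parked it holds only the pending promise and its resolver, which is consistent with the point above that a waiting task costs little memory and no CPU.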

shaybc · 2025-03-05T11:21:13Z

I am trying to test this by merging into my Roo origin/main build with several other branches that I use---but at the moment I cannot even get it to merge into origin/main:

Can you rebase this on main, so I can try it out without hand-patching?
...

I have rebased and merged from the latest main to resolve the conflicts (I hate manual merges; they are painful because Cline and ClineProvider are stupidly huge).

shaybc · 2025-03-05T11:28:29Z

Upon reviewing the code, it appears that there are no new system instructions. Did I read that correctly?

It looks like your code is simple in design: a stack of tasks is generated, and the new code passes the completion message from the child task context into the parent task context (?) to let the parent proceed?

I like that there is no involvement with system instructions; the behavior appears automatic. This also means that if there is a bug, it can probably only affect users who use the new_task feature.

(FYI: I was able to get it to merge and tweaked the conflicting line I showed above; hopefully it will not affect anything. Still, I think you will need a rebase or some merge magic to clean it up.)

Yes, there are no system_prompt changes, but there might be a need to add some. Currently it is up to the user or the model to call new_task; once the sub-task is done, control returns to the main task. If the user wants the model to leverage the feature, they can prompt something like this:

perform each of the following subtasks only using the new_task tool:

subtasks:
1. ...
2. ...

shaybc · 2025-03-05T11:43:50Z

for future work it could be nice to see a tree of tasks in the task history to maintain grouping of related tasks by call stack instead of by "most recently modified task" as it sorts by default now.

I agree with all the suggested improvements, but I do think a first release should happen first, so users can try it and submit bug reports and requirements (before we go dreaming up what is required, when the users might point us in a completely different direction).

The visualization and the ability to see the task execution order, a unique name for each sub-task (not just a number), and the ability to break execution and resume from the appropriate sub-task are all important parts that belong in follow-up PRs rather than in one huge change (IMO).

Regarding canceling or deleting a sub-task returning execution to the parent task: that is a planned and deliberately developed feature. The first behavior was exactly what you described (interrupting a sub-task caused it to stop and that's it; it did not return control to the parent, and there was no way to resume the task).

I think we should ask the user what they want to happen (resume the parent task and tell it that the sub-task failed, or exit the task execution and return to the history view), since different users intend different things when canceling a sub-task.

shaybc · 2025-03-05T11:47:10Z

This is pretty cool, I have tried to crash it with some recursive top level instructions, but I couldn't...it performed great.

These worked in sonnet-3.5, just paste them and create a task. It is fun if you enable "auto Modes", so you do not have to click approve...and then just watch it fly by:

Another benefit of this architecture is that it reduces the context window size for each task (by not handling everything in a single context), which can help even small models (local Ollama models) perform more like the big ones.


KJ7LNW · 2025-03-05T18:46:43Z

@shaybc please cc me or reference this PR# in your next PR, discussion, or issue on this subject. This is a really great feature and I want to keep up with the neat things you are doing with it.

subtasks alpha version (still in development)

c3c3874

shaybc requested review from ColemanRoo, cte, mrubens and stea9499 as code owners February 22, 2025 22:05

dosubot bot added the size:L, enhancement, and Issue - In Progress labels Feb 22, 2025

ShayBC and others added 2 commits February 23, 2025 05:15

pass last message of a subtask to parent task

87f6ac4

Change response to be a user message

20f9073

added getClineStackSize() to ClineProvider and fixed its tests

655930c