Description
What specific problem does this solve?
User Story
As an engineer using Roo Code, I want the tool to learn from my decisions in response to its questions, so that it creates a personalized experience that resolves ambiguities based on my established preferences and significantly accelerates problem-solving, and so that over time this builds the trust required for me to confidently enable auto-approval, knowing the tool's actions will align with my own.
User Impact
This feature directly addresses a primary barrier to full automation: lack of trust. Many users who would benefit from auto-approval hesitate to enable it because they can't predict the tool's decisions.
By learning from user feedback, Roo Code will evolve from a generic tool into a personalized assistant. This builds the necessary confidence for users to enable auto-approval for a wider range of actions. The result is a more autonomous and reliable workflow, where Roo Code can handle more complex tasks with less supervision, ensuring the final output more closely matches the user's intent from the start.
Additional context (optional)
No response
Roo Code Task Links (Optional)
https://app.roocode.com/share/12c52b25-106a-4ceb-a868-43b715e2aa8f
Request checklist
- I've searched existing Issues and Discussions for duplicates
- This describes a specific problem with clear impact and context
Interested in implementing this?
- Yes, I'd like to help implement this feature
Implementation requirements
- I understand this needs approval before implementation begins
How should this be solved? (REQUIRED if contributing, optional otherwise)
Given we already have an integration with a vector database (Qdrant), a lot of the heavy lifting has been done.
High Level Plan
To keep this focused, I have detailed a high-level plan that I intend to execute for implementing this feature.
Each step will be a single pull request where applicable.
- Update the Qdrant client to dynamically accept different collection types (as an enum or equivalent). Dependencies will be updated too. My plan is to scope this to repo-based collections only, keeping the constructor the same while requiring a mandatory `collectionType` param for the other client functions.
- Store the memories via `askFollowupQuestionTool.ts`, including a setting (disabled by default) that determines whether the code path to store a memory is taken. Simple approach: add a checkbox under experimental features. Medium complexity: reuse the codebase indexing UI element with tabs for the settings. Most complexity: add a new settings section for vector database integration to set up the DB, and hold the settings for both codebase indexing and memory storage there; the existing codebase index icon can stay for starting/clearing indexes, etc.
- Create two new tools, `create-followup-question` and `generate-followup-question-suggestions`, and update everything required; this should work as before. `generate-followup-question-suggestions` will still handle structuring the output for the React component and storing the decision in the Qdrant store. These tools will only be registered if the setting is enabled.
- Update `create-followup-question` to retrieve similar questions and their related answers from the vector database, and feed them to the `generate-followup-question-suggestions` tool so that the LLM can know what the user might prefer as potential solutions when generating suggestions.
- Beta release
- General release (open to moving this if we want to address memory degradation issues first)
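The mandatory `collectionType` parameter in the first step could look roughly like this sketch; `CollectionType`, `collectionNameFor`, `QdrantClientSketch`, and the `search` signature are all hypothetical names for illustration, not the existing Roo Code Qdrant client API:

```typescript
// Hypothetical sketch: a collection-type discriminator for the Qdrant client.
enum CollectionType {
  CodebaseIndex = "codebase-index",
  DecisionMemory = "decision-memory",
}

// Derive a per-repo collection name from the repo hash and the type, so both
// features can share one Qdrant instance without collisions.
function collectionNameFor(repoHash: string, type: CollectionType): string {
  return `${type}-${repoHash}`;
}

interface SearchOptions {
  collectionType: CollectionType; // mandatory on every call except the constructor
  vector: number[];
  limit?: number;
}

class QdrantClientSketch {
  // The constructor stays as-is (repo-scoped), per the plan above.
  constructor(private readonly repoHash: string) {}

  // Every non-constructor method takes the mandatory collectionType param.
  // Here it just resolves the target collection name for demonstration.
  search(opts: SearchOptions): string {
    return collectionNameFor(this.repoHash, opts.collectionType);
  }
}
```

Keeping the constructor unchanged while adding the parameter to each method means existing codebase-indexing call sites only need a one-token change.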
Future Plan
I would like to keep developing this as concerns of memory degradation have been brought up in a similar feature request. Also I would like to expand the functionalities of this feature so that it can work better over time.
Non-Prioritized List for Future Implementation
- implement metrics on usefulness of the feature to understand how to fine tune and modify the retrieval, prompt, and other mechanisms
- enhanced memory retrieval & storage: utility scoring, preventing memory degradation, etc.
- global decision memory; this can be nuanced, but if we do this, we'd need to store additional metadata about the repo (language, framework, etc.) in the embedded data, so only contextual data
- "learning" mode which asks the users much more frequently (maybe all the time?) to learn their behaviour. This can also be used to reset the stored memory
- modify the suggestions prompt so it gives many more options when in learning mode
- when executing a task that has a similar question stored, look at what the user's preferred behaviour is
- store memories based on tool output for more efficient tool usage
- support different DISTANCE_METRICs for retrieval
- support a shared db across developers
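To illustrate the global-decision-memory bullet above, here is a hypothetical payload shape for an embedded decision; every field name is an assumption for illustration, not an actual schema:

```typescript
// Hypothetical payload stored alongside each embedded decision.
interface DecisionMemoryPayload {
  question: string;      // the follow-up question that was asked
  chosenAnswer: string;  // the suggestion the user picked (or typed)
  repoLanguage?: string; // contextual metadata needed for global memory
  repoFramework?: string;
  createdAt: number;     // epoch ms, enables freshness scoring later
}

// For global (cross-repo) retrieval, only reuse a memory when its repo
// context matches the current one; framework matching is optional.
function matchesContext(
  p: DecisionMemoryPayload,
  language: string,
  framework?: string,
): boolean {
  return (
    p.repoLanguage === language &&
    (framework === undefined || p.repoFramework === framework)
  );
}
```

In practice this filtering could also be pushed down into a Qdrant payload filter rather than done client-side.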
Below is a plan with more detail and specifics, as generated by Roo:
decision_memory_implementation_plan.md
How will we know it works? (Acceptance Criteria - REQUIRED if contributing, optional otherwise)
Given a question needs to be asked to the user
And this feature is enabled
When a question is asked to the user
Then the proposed answers are closer to what the user would have answered, based on their stored decisions
Given I want to enable question suggestion memory
When I go to the Roo Code UI
Then I can enable this feature
Given a question needs to be asked to the user
And this feature is disabled
When a question is asked to the user
Then suggested answers are proposed in the same way as before
Technical considerations (REQUIRED if contributing, optional otherwise)
see document
decision_memory_technical_considerations.md
Trade-offs and risks (REQUIRED if contributing, optional otherwise)
Risk 1: Question tool regression
I believe there is a risk that the question tool won't work as well, since my proposal makes it a two-step process whereas before it was a one-step process. This can be mitigated with careful prompts and testing.
Alternative Approach 1: Tool calls LLM
As an alternative, we could have the ask_follow_up_question_tool call an LLM to generate the suggestions, with the related prompt changed to only generate a question. This would reduce the cost of generating suggestions by, for example, only sending the last 10 messages.
I am unsure about this alternative because of how tools should behave (I imagine they shouldn't call LLMs), but if this is acceptable, it would be my preferred approach.
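The cost reduction in Alternative Approach 1 is essentially a sliding window over the conversation history; a minimal sketch, where the `ChatMessage` shape and `recentWindow` name are assumptions:

```typescript
// Hypothetical message shape; the real conversation type will differ.
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Keep only the most recent `limit` messages when building the
// suggestion-generation prompt, to cap token cost (limit = 10 per the
// proposal above).
function recentWindow(history: ChatMessage[], limit = 10): ChatMessage[] {
  return history.slice(-limit);
}
```

A fixed window is the simplest option; a token-budget-based cutoff would be a natural refinement if message lengths vary widely.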
Alternative Approach 2: Context-Aware Prompting Pattern
This solution would also have two tool calls: one to use the search tool, and the other to call an ask_and_suggest tool that generates the question and suggestions based on the prompt. The downside is that, since we don't know the question yet, it will be harder to find related answers from the user.
Trade-Off: Questions now require two tools being used
Since a question will now involve two tool calls, it will take longer for a response to be returned.
Mitigation Strategy
We need to do the following:
- continue developing the feature to make it more reliable
- create a setting to disable this feature and enable backwards compatibility; I've added an md file on the alternatives for this that Roo came up with:
feature_toggle_architectures.md
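The backwards-compatibility setting could gate tool registration along these lines; the tool names and `registeredTools` function are illustrative, not the actual registration code:

```typescript
// Hypothetical sketch of setting-gated tool registration.
type ToolName =
  | "ask_followup_question"
  | "create-followup-question"
  | "generate-followup-question-suggestions";

// When the setting is off, behave exactly as before: the single one-step
// tool. When it is on, register the two-step pair instead.
function registeredTools(questionMemoryEnabled: boolean): ToolName[] {
  return questionMemoryEnabled
    ? ["create-followup-question", "generate-followup-question-suggestions"]
    : ["ask_followup_question"];
}
```

Gating at registration time (rather than branching inside one tool) keeps the disabled path byte-for-byte identical to current behavior.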
Risk 2: Qdrant Client Refactor
There's a risk of regression in code base indexing.
Mitigation Strategy
To mitigate this, we can ensure that all tests run successfully, and ensure live testing works as well.
Risk 3: Long-term memory systems risk accumulating outdated or incorrect assumptions, leading to degraded accuracy over time
Given that this risk only materializes over time, I think we can take a phased approach: enable memory first, then solve this problem. I am open to this being a blocker for general release, though.
Options
- implement a time decay by storing a "freshness score"
- periodic revalidation when a memory is older than some time period, so the user can confirm the assumption is still correct; this could be a setting
- create a memory management UI interface so users can modify, or delete memories
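The freshness-score option could blend vector similarity with exponential time decay; a minimal sketch, where the 30-day half-life and the multiplicative blending are assumptions to be tuned:

```typescript
// Hypothetical time-decay scoring: blend Qdrant similarity with an
// exponential freshness factor. A memory's influence halves every
// HALF_LIFE_MS (30 days here, purely an assumed starting point).
const HALF_LIFE_MS = 30 * 24 * 60 * 60 * 1000;

function freshness(createdAtMs: number, nowMs: number): number {
  const age = Math.max(0, nowMs - createdAtMs);
  return Math.pow(0.5, age / HALF_LIFE_MS); // 1.0 when new, 0.5 after one half-life
}

function decayedScore(
  similarity: number, // cosine/dot similarity from the vector search
  createdAtMs: number,
  nowMs: number,
): number {
  return similarity * freshness(createdAtMs, nowMs);
}
```

Re-ranking retrieved memories by `decayedScore` means stale assumptions fade out gradually instead of being hard-deleted, which composes well with the periodic-revalidation option.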
Risk 4: Performance Issues and Storage Bloat
I am unfamiliar with this problem space, but if this turns out to be an issue, I can look into the options in more detail.
Options
- Implement Asynchronous and Batch Indexing
- Tune Approximate Nearest Neighbor (ANN) Parameters
- Use Dimensionality Reduction and Vector Quantization
Problem 1: UI Design
I am unsure how to implement this, so I see two options:
- add tabs to the codebase index window
- create a settings tab in the settings section for configuring codebase indexing, question memory, and vector DB setup, while keeping the existing codebase index icon/window for index status and starting/clearing the index
Feedback on this would be appreciated