Feature Proposal: Integrate Optional Autonomous Agent Mode using AutoGen #1999

aeehliver · 2025-03-26T08:04:46Z

Hi Roo Code Team,

First off, thank you for creating Roo Code! It's an incredibly powerful and versatile VSCode extension for AI-assisted development, and the extensive LLM provider support and integrated tools are fantastic.

I'm opening this issue to propose a significant new feature: adding an optional "Agent Mode" powered by a multi-agent framework like Microsoft AutoGen. This aims to address some inherent limitations of the current single-assistant interaction model, especially for complex, multi-step tasks requiring higher autonomy, planning, and self-correction capabilities.

Motivation & Current Limitations

While Roo Code's current "Classic Mode" (using a single LLM with tools defined via XML, selectable via modes like edit-code, generate-tests, etc.) is excellent for many tasks, users encounter challenges with more complex workflows:

Context Limitations: Even with large context windows and techniques like using .roo/context files, the single LLM can lose track of the overall goal, specific constraints, or previous steps in long-running tasks or large projects.
Limited Autonomy & Planning: The current model is primarily reactive. Although modes like planning exist, execution often requires manual stepping, and the assistant struggles with tasks that need proactive planning, dynamic task decomposition based on intermediate results, and autonomous decision-making (e.g., based on test outcomes).
Brittle Testing & Debugging: While Roo Code can execute commands (like curl or test runners via the <tool_code>execute-command</tool_code> tool), the AI often struggles to interpret complex test failures on its own, find the root cause, and implement a correct fix without significant human guidance. It might generate superficial tests, or simply report the failure and ask the user for help instead of proceeding independently. The feedback loop is often manual, or relies on the LLM working out the entire multi-step correction process in one go.
User Intervention: For complex features, the user often needs to constantly guide the AI, break down tasks manually, and verify intermediate steps, reducing the "autopilot" potential.

Proposed Solution: Optional AutoGen-Powered "Agent Mode"

I propose integrating an optional mode that leverages a multi-agent system built with AutoGen (a popular Python framework). This would coexist with the current "Classic Mode".

High-Level Architecture:

Mode Selection: The user could choose between "Classic Mode" and the new "Agent Mode" via the existing UI mechanisms (e.g., the mode dropdown).
VSCode Extension (TypeScript - Roo Code):
- Continues to provide the excellent React-based UI.
- Manages mode selection.
- Acts as a bridge when "Agent Mode" is active:
  - Sends the user's high-level goal to a local Python backend.
  - Receives a real-time stream of updates (agent messages, logs) from the backend via WebSockets/SSE to display in the chat.
  - Crucially: Hosts a simple local HTTP server to receive tool execution requests from the Python backend. This allows AutoGen agents (in Python) to leverage Roo Code's existing, robust TypeScript services (src/services/terminal, src/utils/fs.ts, src/services/ripgrep, src/services/tree-sitter, etc.).
Python Backend (New Component):
- A separate, locally running process (likely managed by the user initially, installed via pip).
- Built with FastAPI (for REST/WebSocket endpoints) and AutoGen.
- Receives tasks from the TS extension.
- Instantiates and orchestrates multiple AutoGen agents (e.g., PlannerAgent, CoderAgent, TestGeneratorAgent, TestExecutorAgent, DebuggerAgent).
- Manages the agent conversation and task state across multiple steps.
- Streams updates back to the TS extension.
- Calls the TS extension's tool server endpoint whenever an agent needs to execute a tool (terminal command, read/write file, search, etc.).
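
To make the Python -> TS callback direction concrete, here is a minimal sketch of how the backend might package and send a tool request. Everything specific in it is an assumption for illustration: the port, the /tool path, the JSON shape ({"tool": ..., "args": ...}), and the function names are hypothetical, and none of this is an existing Roo Code or AutoGen API. It uses only the Python standard library.

```python
import json
import urllib.request
from typing import Any

# Hypothetical address of the tool server hosted by the extension (assumption).
TOOL_SERVER_URL = "http://127.0.0.1:8765/tool"


def build_tool_request(
    tool: str, args: dict[str, Any], url: str = TOOL_SERVER_URL
) -> urllib.request.Request:
    """Package one tool call as a JSON POST request (payload shape is assumed)."""
    body = json.dumps({"tool": tool, "args": args}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_tool(
    tool: str,
    args: dict[str, Any],
    url: str = TOOL_SERVER_URL,
    timeout: float = 30.0,
) -> dict[str, Any]:
    """Send the tool call to the extension and return its decoded JSON reply."""
    request = build_tool_request(tool, args, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

An agent wanting a file's contents would then call something like call_tool("read-file", {"path": "src/app.ts"}), and the extension would route that to its existing file service. A real implementation would also need authentication on the local server and error handling for tool failures, both omitted here.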

Integration with Existing Mode System:

From a user experience perspective, this "Agent Mode" could be seamlessly integrated as an additional option within Roo Code's existing mode selection framework (defined in src/shared/modes.ts). Users could simply choose "Agent Mode" for tasks requiring higher autonomy.

However, the backend implementation for this mode would differ significantly. While existing modes primarily function by tailoring the system prompt and available XML tools for the single LLM assistant, activating "Agent Mode" would engage the external Python-based AutoGen orchestration engine via the proposed communication bridge. This allows introducing true multi-agent capabilities while maintaining a consistent user experience for mode selection.

Anticipating the Question: How is this different from current tool use and modes?

I recognize that Roo Code already has sophisticated XML-based tool integration and several operational modes. These are powerful for enabling the LLM to interact with the environment.

However, the current mechanism relies on the LLM itself to correctly sequence, invoke, and interpret the results of these tools within the flow of a single conversational turn or a manually guided sequence. The proposed Agent Mode using AutoGen externalizes the orchestration logic.

In Agent Mode, specialized Python agents manage the workflow, maintain persistent state across multiple steps (potentially involving dozens of tool calls and LLM inferences), and make structured, autonomous decisions based on tool outputs (like detailed test results). This provides a more robust framework for true autonomy in complex tasks like end-to-end feature implementation with self-correction, which is qualitatively different from instructing a single LLM to use tools sequentially, even within specific modes like planning.
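
As a rough illustration of what "persistent state across multiple steps" could mean in practice, the sketch below keeps a task record outside any single LLM turn and serializes it to JSON so it can survive between steps. The class and field names (TaskState, StepRecord, status) are hypothetical and are not part of AutoGen; a real implementation would likely store more detail.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class StepRecord:
    agent: str   # which agent acted, e.g. "CoderAgent"
    action: str  # tool or decision name, e.g. "write-to-file"
    result: str  # short summary of the outcome


@dataclass
class TaskState:
    goal: str
    steps: list[StepRecord] = field(default_factory=list)
    status: str = "in_progress"

    def record(self, agent: str, action: str, result: str) -> None:
        """Append one completed step to the task history."""
        self.steps.append(StepRecord(agent, action, result))

    def to_json(self) -> str:
        """Serialize the whole task so it can be stored between steps."""
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, raw: str) -> "TaskState":
        """Rebuild a task, including its step history, from stored JSON."""
        data = json.loads(raw)
        steps = [StepRecord(**step) for step in data["steps"]]
        return cls(goal=data["goal"], steps=steps, status=data["status"])
```

Because the history lives in the orchestrator rather than in the prompt, an agent can be handed only the steps relevant to its current decision instead of the full conversation.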

Example Improvement: Autonomous Autotesting Loop

Current Roo Code: Can run tests via execute-command. If tests fail, the LLM might try a simple fix or ask the user for guidance. The loop often requires manual intervention.
Proposed AutoGen Mode: A DebuggerAgent receives structured test failure reports from a TestExecutorAgent. It can then autonomously decide to use tools (via the TS bridge) to read files (read-file), analyze code (tree-sitter?), and determine a fix. It then either instructs a CoderAgent or applies the fix directly (write-to-file), and triggers a re-test via the TestExecutorAgent. This autonomous test-debug-fix cycle is a key advantage.
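
The control flow of that cycle can be sketched in a few lines. This is a simplified, framework-free illustration rather than AutoGen code: the three callables stand in for the TestExecutorAgent, the DebuggerAgent, and the write step, and the names RunReport, run_fix_cycle, and max_rounds are hypothetical. The round limit is an added assumption so the loop cannot run forever on a failure it cannot fix.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class RunReport:
    passed: bool
    failures: list[str] = field(default_factory=list)  # failing test names or messages


def run_fix_cycle(
    run_tests: Callable[[], RunReport],
    propose_fix: Callable[[RunReport], str],
    apply_fix: Callable[[str], None],
    max_rounds: int = 3,
) -> tuple[bool, int]:
    """Run tests, and while they fail, propose and apply a fix, then re-test.

    Returns (tests_passed, number_of_fix_rounds_applied).
    """
    report = run_tests()
    rounds = 0
    while not report.passed and rounds < max_rounds:
        fix = propose_fix(report)  # debugger role: turn failures into a fix
        apply_fix(fix)             # coder role: write the change
        rounds += 1
        report = run_tests()       # executor role: re-run the suite
    return report.passed, rounds
```

When the round limit is hit with tests still failing, the function returns False, which is the natural point to hand control back to the user instead of looping indefinitely.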

Benefits for Roo Code

Enhanced Capabilities: Addresses complex tasks and workflows currently difficult to manage.
Increased Autonomy: Moves closer to a true "autopilot" experience for advanced use cases.
Leverages Existing Strengths: Reuses the excellent UI, LLM provider support, and crucially, the powerful TypeScript-based tools (src/services/) by making them accessible to the AutoGen agents via the callback bridge.
Flexibility: Users can choose the best mode (Classic vs. Agent) for their specific needs.
Future-Proofing: Aligns Roo Code with the growing trend of agent-based AI development tools.

Implementation Considerations

The main technical challenge is building the robust, bidirectional communication bridge (TS <-> Python), especially the Python -> TS tool execution callback.
Initially, users would need Python installed, would have to install the backend's dependencies (pip install pyautogen fastapi uvicorn websockets requests python-dotenv), and would have to manually start the local Python backend process. Documentation would be key.
This adds a Python dependency and runtime component, increasing complexity compared to the pure TS extension, but offers significant power for users who need it.

Call to Action

I believe this optional "Agent Mode" integration would be a significant enhancement for Roo Code, unlocking new levels of capability and autonomy for complex development tasks.

What are your initial thoughts on this proposal?
Would you be open to considering/reviewing a Pull Request that implements this feature (acknowledging it would be substantial)?
Do you have any concerns or alternative suggestions for achieving similar goals within Roo Code's architecture?

Thank you for considering this proposal and for your amazing work on Roo Code!
