RFC: Native Tool Use for Top-Tier AI Models #4047

@moqimoqidea

Description

What problem does this proposed feature solve?

Currently, Roo Code uses XML tags for tool calling across all AI models, including top-tier models like Claude that have native tool calling capabilities. This approach results in:

  • High Failure Rate: Approximately 10% of tool calls fail when using XML tag-based tool calling with top-tier models
  • Higher Failure Rates for Complex Tools: Complex tools like apply_diff fail even more often (>15%)
  • Multi-Turn Degradation: In agent mode with consecutive tool calls across multi-turn conversations, reliability decreases progressively
  • Suboptimal Performance: Having models read and emit an XML tool format they are not optimized for introduces latency and accuracy issues
  • Inconsistent User Experience: Users experience unpredictable tool behavior, particularly when editing large files

These failure rates indicate that XML tag-based tool calling is less reliable than the native tool calling implementations that top-tier models have been specifically optimized for.
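
The multi-turn degradation point can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that every tool call fails independently at the same rate; real failures are unlikely to be fully independent. The function name is hypothetical, and the 10% figure is the approximate rate quoted above.

```javascript
// Probability that every call in a chain of `calls` tool calls succeeds,
// assuming (for illustration only) independent failures at a fixed rate.
function chainSuccessProbability(perCallFailureRate, calls) {
  return Math.pow(1 - perCallFailureRate, calls);
}

for (const calls of [1, 5, 10, 20]) {
  const p = chainSuccessProbability(0.1, calls);
  console.log(`${calls} calls: ${(p * 100).toFixed(1)}% chance of no failure`);
}
```

Under that assumption, a chain of 10 consecutive tool calls completes without a single failure only about 35% of the time (0.9^10 ≈ 0.349), which is one way to read the progressive degradation in agent mode.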

Describe the proposed solution in detail

Implement a tiered tool calling system that prioritizes native tool calling APIs for models that support them, while maintaining backward compatibility through XML tags for models that don't.

Key functionalities:

  1. Provider/Model Detection: Automatically identify the model provider and specific model ID during runtime

  2. Native Tool Routing: Route tool calls through native APIs when available:

    • Use Claude's native tool calling API for Claude 3.5/Sonnet/Opus models
    • Use OpenAI's function calling for GPT models
    • Use Gemini's function calling for Gemini models
  3. Transparent Translation Layer: Create a unified interface that handles the appropriate method selection:

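    // Example API (simplified): route a tool call through the provider's
    // native API when the model supports it, otherwise use the XML tag path.
    // The three helper functions are placeholders to be implemented.
    async function executeTool(toolName, params, modelProvider, modelId) {
      if (supportsNativeToolCalling(modelProvider, modelId)) {
        return executeNativeTool(toolName, params, modelProvider, modelId);
      }
      return executeXmlTagTool(toolName, params);
    }
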
  4. Tool Mapping System: Implement mappings between Roo Code tools and provider-specific tool formats:

    | Functionality     | Anthropic Claude                    | Roo Code (Current)                       |
    |-------------------|-------------------------------------|------------------------------------------|
    | Read File         | view: path, view_range              | read_file: path, start_line, end_line    |
    | Read Directory    | view: path                          | list_files: path, recursive              |
    | Code Replacement  | str_replace: path, old_str, new_str | apply_diff: path, diff                   |
    | New File Creation | create: path, file_text             | write_to_file: path, content, line_count |
    | Code Insertion    | insert: path, insert_line, new_str  | insert_content: path, line, content      |

  5. Progressive Rollout: Implement the feature in three phases:

    • Phase 1: Add native tool support for Claude models
    • Phase 2: Expand to other top-tier providers (Gemini, GPT)
    • Phase 3: Use XML tags only as a fallback for models that don't support tool use
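
The detection and mapping steps above could be sketched roughly as follows. This is a minimal illustration, not a proposed final design: the provider keys, the model-ID prefixes, and the function toClaudeEditorInput are assumptions made for the example, and only the parameter names come from the mapping table. Differences in semantics between the two tool sets (for example, how each side interprets an insertion line number, or the recursive flag of list_files) are deliberately not reconciled here.

```javascript
// Hypothetical capability table: provider name -> model-ID prefixes that
// are assumed to support native tool calling. Values are illustrative.
const NATIVE_TOOL_MODELS = new Map([
  ["anthropic", ["claude-"]],
  ["openai", ["gpt-"]],
  ["gemini", ["gemini-"]],
]);

function supportsNativeToolCalling(modelProvider, modelId) {
  const prefixes = NATIVE_TOOL_MODELS.get(modelProvider) ?? [];
  return prefixes.some((prefix) => modelId.startsWith(prefix));
}

// Translate a Roo Code tool call into the input of a Claude text editor
// command, following the mapping table. Returns null when there is no
// direct one-to-one mapping.
function toClaudeEditorInput(toolName, params) {
  switch (toolName) {
    case "read_file": {
      const input = { command: "view", path: params.path };
      if (params.start_line != null && params.end_line != null) {
        input.view_range = [params.start_line, params.end_line];
      }
      return input;
    }
    case "list_files":
      // `recursive` has no counterpart in the table, so it is dropped here.
      return { command: "view", path: params.path };
    case "write_to_file":
      return { command: "create", path: params.path, file_text: params.content };
    case "insert_content":
      // Line-number semantics may differ between the two tools; not adjusted.
      return {
        command: "insert",
        path: params.path,
        insert_line: params.line,
        new_str: params.content,
      };
    default:
      return null; // e.g. apply_diff needs dedicated translation logic
  }
}

console.log(supportsNativeToolCalling("anthropic", "claude-opus-4-20250514"));
console.log(toClaudeEditorInput("read_file", { path: "src/app.js", start_line: 10, end_line: 20 }));
```

Keeping the capability table as data rather than as branching logic would let each rollout phase be enabled by adding an entry instead of changing code.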

Technical considerations or implementation details (optional)

  1. Abstraction Layer Architecture:

    • Create a new ToolExecutionStrategy interface with model-specific implementations
    • Implement a ToolExecutionFactory that selects the appropriate strategy based on model provider and ID
    • Maintain the current XML tag processor as a fallback strategy
  2. Parameter Translation:

    • Build a bidirectional mapping system between Roo Code parameters and native tool parameters
    • For complex operations like apply_diff, we need specialized translation logic:
      // Example translation from apply_diff to Claude's str_replace.
      // parseDiff is a placeholder that extracts the old and new text from a Roo Code diff.
      function translateApplyDiffToStrReplace(path, diff) {
        const { oldStr, newStr } = parseDiff(diff);
        return { tool: "str_replace", params: { path, old_str: oldStr, new_str: newStr } };
      }

  3. Error Handling and Retries:

    • Implement intelligent fallback: if a native tool call fails, attempt XML format as backup
    • Add telemetry to track success rates of different approaches (with user permission)
    • Create specialized error types for better debugging
  4. Required Dependencies:

    • Updated client libraries for each provider's API
    • Structured response parsers for each tool call format
  5. Implementation Phases:

    • Phase 1: Claude integration
    • Phase 2: GPT and Gemini integration
    • Phase 3: Optimization and fallback mechanism refinement
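
A rough sketch of how the strategy, factory, and fallback pieces described above might fit together. The class names NativeToolStrategy and XmlTagStrategy, the shape of the returned objects, and the per-provider caller functions are assumptions for illustration; the XML path is stubbed, and the factory selects by provider only, whereas a fuller version would also inspect the model ID.

```javascript
// Fallback strategy: stands in for the existing XML tag processor (stubbed).
class XmlTagStrategy {
  async execute(toolName, params) {
    return { via: "xml", toolName, params };
  }
}

// Native strategy: wraps a provider-specific call and falls back to the
// XML path if the native call throws.
class NativeToolStrategy {
  constructor(callNativeTool, fallback) {
    this.callNativeTool = callNativeTool; // hypothetical provider function
    this.fallback = fallback;
  }

  async execute(toolName, params) {
    try {
      const result = await this.callNativeTool(toolName, params);
      return { via: "native", ...result };
    } catch (error) {
      // Intelligent fallback: retry the same call through the XML path
      // and record why the native attempt failed.
      const result = await this.fallback.execute(toolName, params);
      return { ...result, fallbackReason: String(error.message) };
    }
  }
}

class ToolExecutionFactory {
  // nativeCallers: Map of provider name -> async (toolName, params) => result
  constructor(nativeCallers) {
    this.nativeCallers = nativeCallers;
    this.xml = new XmlTagStrategy();
  }

  forModel(modelProvider) {
    const caller = this.nativeCallers.get(modelProvider);
    return caller ? new NativeToolStrategy(caller, this.xml) : this.xml;
  }
}

// Usage: a simulated native failure is transparently retried via XML.
const demoFactory = new ToolExecutionFactory(
  new Map([["anthropic", async () => { throw new Error("simulated native failure"); }]])
);
demoFactory
  .forModel("anthropic")
  .execute("read_file", { path: "src/app.js" })
  .then((result) => console.log(result.via, result.fallbackReason));
```

Recording the fallback reason on the result is one place where the opt-in telemetry mentioned above could hook in, since it distinguishes calls that succeeded natively from calls that only succeeded through the XML path.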

Describe alternatives considered (if any)

  1. Enhanced XML Tag Processing:

    • Could improve the current XML tag approach with better formatting and context
    • Would still have fundamental limitations since models aren't optimized for XML parsing
    • Rejected because it wouldn't address the root cause of failures
  2. Custom Intermediary Format:

    • Could create a new intermediate format specifically designed for AI models
    • Would require significant research to optimize
    • Rejected due to high development cost and lack of clear advantage over native tools
  3. Model-Specific Prompting:

    • Could use tailored prompts for each model instead of changing the tool calling method
    • Tests showed only marginal improvements (2-3% reduction in failures)
    • Rejected because native tools provide much greater reliability improvements (>90%)
  4. Hybrid XML/JSON Approach:

    • Could use JSON for structured data within XML tags
    • Complexity outweighed benefits in testing
    • Rejected because it adds complexity without addressing fundamental model capabilities

Additional Context & Mockups

Industry Evidence

  1. Cursor Team Research (Lex Fridman Interview):
    From the "Apply" part of the YouTube interview:

    "You see shallow copies of apply elsewhere and it just breaks most of the time because you think you can try to do some deterministic matching and then it fails at least 40% of the time and that just results in a terrible product experience."

  2. GitHub Copilot's Implementation (May 2025):
    From the VS Code v1.100 update, "Faster agent mode edits":

    "We've implemented support for OpenAI's apply patch editing format (GPT 4.1 and o4-mini) and Anthropic's replace string tool (Claude Sonnet 3.7 and 3.5) in agent mode. This means that you benefit from significantly faster edits, especially in large files."

  3. VSCode's AI Strategy:
    From the May 2025 blog post:

    "We will open source the code in the GitHub Copilot Chat extension under the MIT license... This is the next and logical step for us in making VS Code an open source AI editor."
    This will affect other AI coding tools, open source or not. In short, it raises the baseline for open source products, so hopefully Roo Code will take note and address its weak points, especially those in agent mode that frustrate users.

Technical Documentation

Claude Text Editor Tool Curl Demo

curl https://api.anthropic.com/v1/messages \
  -H "content-type: application/json" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-opus-4-20250514",
    "max_tokens": 1024,
    "tools": [
      {
        "type": "text_editor_20250429",
        "name": "str_replace_based_edit_tool"
      }
    ],
    "messages": [
      {
        "role": "user",
        "content": "There'\''s a syntax error in my primes.py file. Can you help me fix it?"
      }
    ]
  }'
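
On the client side, a request like the one above is typically followed by a response whose content array contains tool_use blocks, each of which is answered with a tool_result block in the next user message. The sketch below shows that round trip against an in-memory file map. The helper names, the sample response, and the ID "toolu_demo_1" are made up for illustration, only two editor commands are handled, and the exactly-one-match rule for str_replace is a design choice of this sketch.

```javascript
// Run one text editor command against an in-memory file map
// (a stand-in for the real file system).
function runEditorCommand(files, input) {
  const text = files.get(input.path);
  if (text === undefined) throw new Error(`File not found: ${input.path}`);
  switch (input.command) {
    case "view":
      return text;
    case "str_replace": {
      const matches = text.split(input.old_str).length - 1;
      if (matches !== 1) {
        throw new Error(`Expected exactly one match, found ${matches}`);
      }
      // A replacer function avoids special handling of "$" in new_str.
      files.set(input.path, text.replace(input.old_str, () => input.new_str));
      return "Edit applied.";
    }
    default:
      throw new Error(`Unsupported command: ${input.command}`);
  }
}

// Build the follow-up user message that answers every tool_use block.
function buildToolResults(response, files) {
  const results = response.content
    .filter((block) => block.type === "tool_use")
    .map((block) => {
      try {
        const content = runEditorCommand(files, block.input);
        return { type: "tool_result", tool_use_id: block.id, content };
      } catch (error) {
        return { type: "tool_result", tool_use_id: block.id, content: error.message, is_error: true };
      }
    });
  return { role: "user", content: results };
}

// Usage with a hand-written sample response (not real API output).
const demoFiles = new Map([["primes.py", "for i in range(10)\n    print(i)\n"]]);
const demoResponse = {
  content: [
    { type: "text", text: "I'll fix the missing colon." },
    {
      type: "tool_use",
      id: "toolu_demo_1",
      name: "str_replace_based_edit_tool",
      input: { command: "str_replace", path: "primes.py", old_str: "range(10)", new_str: "range(10):" },
    },
  ],
};
console.log(JSON.stringify(buildToolResults(demoResponse, demoFiles), null, 2));
```

Reporting a failed edit back as a tool_result with is_error set, rather than throwing, gives the model a chance to retry with a corrected command.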

Proposal Checklist

  • I have searched existing Issues and Discussions to ensure this proposal is not a duplicate.
  • This proposal is for a specific, actionable change intended for implementation (not a general idea).
  • I understand that this proposal requires review and approval before any development work begins.

Are you interested in implementing this feature if approved?

  • Yes, I would like to contribute to implementing this feature.

Metadata

Labels

  • Issue - In Progress: Someone is actively working on this. Should link to a PR soon.
  • enhancement: New feature or request

Projects

Status

Issue [In Progress]

