NaccOll
@NaccOll NaccOll commented Aug 6, 2025

Issue

#4047 (comment)

What I Do

This PR converts tool calls based on XML-formatted text prompts into JSON-based tool calls, without a major refactor of the existing workflow.

Implementation Approach

  • Added a toggle in the Provider settings page that lets users enable or disable tool calls
  • Designed a compatible tool call interface and migrated tools from text+XML prompts to tool call format objects
  • When tool calls are enabled, converted tools omit their original descriptions and examples from the generated prompt
  • Designed unified post-call processing that transforms the returned standard JSON into XML, so it plugs into the existing tool-handling logic
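
The post-call step in the last bullet can be pictured roughly as follows. This is a minimal sketch and not the code in this PR: the `ToolCall` shape and the `toolCallToXml` and `valueToXml` names are hypothetical, and it assumes the downstream XML parser reads tag bodies as raw text, so values are not escaped.

```typescript
// Hypothetical shape of a completed tool call (not the PR's actual type).
interface ToolCall {
  name: string;
  arguments: string; // JSON-encoded arguments as returned by the model
}

type JsonValue =
  | string
  | number
  | boolean
  | null
  | JsonValue[]
  | { [key: string]: JsonValue };

// Render one argument as an XML element; arrays repeat the tag once per item.
function valueToXml(key: string, value: JsonValue): string {
  if (Array.isArray(value)) {
    return value.map((item) => valueToXml(key, item)).join("\n");
  }
  if (value !== null && typeof value === "object") {
    const inner = Object.entries(value)
      .map(([k, v]) => valueToXml(k, v))
      .join("\n");
    return `<${key}>\n${inner}\n</${key}>`;
  }
  // Assumption: the consumer treats the body as raw text, so no escaping here.
  return `<${key}>${String(value)}</${key}>`;
}

// Wrap all arguments in an element named after the tool.
function toolCallToXml(call: ToolCall): string {
  const args = JSON.parse(call.arguments) as { [key: string]: JsonValue };
  const body = Object.entries(args)
    .map(([k, v]) => valueToXml(k, v))
    .join("\n");
  return `<${call.name}>\n${body}\n</${call.name}>`;
}
```

Under these assumptions, a `list_files` call with a `path` argument becomes a `<list_files>` element containing a `<path>` child, which the existing XML-based handling can consume unchanged.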

Tools

  • apply_diff
  • insert_content
  • list_code_definition_names
  • list_files
  • read_file (the file read limit may fail)
  • search_and_replace
  • search_files
  • write_to_file
  • access_mcp_resource
  • ask_followup_question
  • attempt_completion
  • browser_action
  • codebase_search
  • execute_command
  • fetch_instructions
  • new_task
  • switch_mode
  • update_todo_list
  • use_mcp_tool
  • generate_image

Provider

Benefits

  1. When tool information is returned via tool_call, format accuracy improves significantly; even 4B models can generate precisely formatted apply_diff outputs.
  2. Prompt length is reduced.

Problem

Legacy Issues

  1. Accuracy depends on the model actually issuing a tool call, but models may fail to grasp the intent and still return text instead of tool calls.
  2. Not all tools have been migrated yet, so the model may imitate the XML it sees. As XML content accumulates in the chat, incorrect XML output may occur. (I dropped the plan to refine parameters and tighten control, for example by splitting the content of apply_diff into search_str and replace_str. Instead, the tool call parameters are fully aligned with the original XML description, which avoids parameter errors when the LLM sees both the tool parameters and large amounts of user XML input.) (This has been resolved through standard tool calls.)
  3. Format accuracy does not guarantee content accuracy. The model can invoke tools correctly, but for a tool such as apply_diff, whose replacements must match existing content, the generated original content may still be inaccurate.

New Problem

  1. Streaming processing is disrupted. Although results are still returned via streaming, the actual tool invocation requires converting JSON to XML first. For large file modifications, this causes delays during which users see no feedback. (Streaming conversion of JSON to XML is now supported.)
  2. Streaming state changes after JSON completion. Because of the current duplicate-call detection, the model's retry limit must be set to two or more attempts; otherwise errors occur. (See Errors and Repeated Use Limitations #6834)
  3. Each tool now needs both its XML prompt text and its tool call description object maintained. I originally wanted to derive the XML description from the tool call object, but apply_diff and read_file are more complicated, and I still need to work out how to do it. (For now, the same function returns both the tool's schema and its XML prompt.)
  4. When you pass the tools parameter, OpenRouter promises to route your request to providers that support tool calls. However, some providers' tool call implementations are flawed. For instance, DeepSeek-V3-0324 on Chutes handles tool calls normally, but Qwen3-Coder's tool calls may fail. If you select Alibaba as the provider for Qwen3-Coder, tool calls work fine, though at several times the cost. This complicates tool call support for models on OpenRouter and leaves users to work out the quirks themselves.
  5. I have no environment for testing browser_action. In addition, ask_followup_question sets the mode through an XML attribute, so its JSON stream cannot be mapped to XML. (feat: simplify attempt_completion tool description #6888)
  6. Context compression fails if the last assistant message is a tool use.
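
For context on items 1 and 2, streamed tool calls usually arrive as small fragments that must be stitched together before (or while) they are converted. The sketch below shows one way to collect those fragments per call. The delta shape is modeled on OpenAI-style streaming chunks and is an assumption here; `ToolCallAccumulator` is a hypothetical name and is not the `StreamingToolCallProcessor` from this PR.

```typescript
// Assumed delta shape, modeled on OpenAI-style streaming chunks.
interface ToolCallDelta {
  index: number; // position of the tool call within the assistant message
  id?: string; // typically present only on the first fragment of a call
  function?: { name?: string; arguments?: string };
}

interface PendingToolCall {
  id: string;
  name: string;
  arguments: string; // JSON text accumulated so far
}

// Hypothetical helper: groups fragments by index so that several tool calls
// interleaved in one stream do not overwrite each other.
class ToolCallAccumulator {
  private calls = new Map<number, PendingToolCall>();

  push(delta: ToolCallDelta): void {
    let call = this.calls.get(delta.index);
    if (!call) {
      call = { id: "", name: "", arguments: "" };
      this.calls.set(delta.index, call);
    }
    if (delta.id) call.id = delta.id;
    if (delta.function?.name) call.name += delta.function.name;
    if (delta.function?.arguments) call.arguments += delta.function.arguments;
  }

  // Return the collected calls ordered by their index in the message.
  finish(): PendingToolCall[] {
    return Array.from(this.calls.entries())
      .sort(([a], [b]) => a - b)
      .map(([, call]) => call);
  }
}
```

Keying on `index` rather than keeping a single "current call" is the design choice that matters: a stream that carries two tool calls then yields two complete argument strings instead of one corrupted string.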

Screenshots / Videos

image

Important

Introduces JSON-based tool calls with a toggle in settings, updating API handlers, tool schemas, and UI components to support this new functionality.

  • Behavior:
    • Added toolCallEnabled toggle in provider-settings.ts to enable/disable tool calls.
    • Converted tool calls from XML to JSON format in base-provider.ts, lm-studio.ts, openai.ts, and openrouter.ts.
    • Unified post-call processing to convert JSON back to XML.
  • Tools:
    • Added JSON-based tool call support for tools like apply_diff, insert_content, list_code_definition_names, list_files, read_file, search_and_replace, search_files, write_to_file, and others.
    • Implemented ToolRegistry in tool-registry.ts to manage tool schemas.
  • UI:
    • Added ToolCallSettingsControl.tsx to manage tool call settings in the UI.
    • Updated i18n files for new tool call settings descriptions.
  • Misc:
    • Added StreamingToolCallProcessor in tool-call-helper.ts for streaming JSON to XML conversion.
    • Updated Task.ts to handle tool call streaming and processing.

This description was created by Ellipsis for b1984b8. You can customize this summary. It will automatically update as commits are pushed.

@NaccOll NaccOll requested review from mrubens, cte and jr as code owners August 6, 2025 09:21
@NaccOll NaccOll marked this pull request as draft August 6, 2025 09:21
@dosubot dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. enhancement New feature or request labels Aug 6, 2025
@roomote roomote bot left a comment

Thank you for your contribution! I've reviewed the changes and found some issues that need attention. This PR implements a valuable feature for native tool calling, but there are some critical issues that should be addressed before merging.

@hannesrudolph hannesrudolph added the Issue/PR - Triage New issue. Needs quick review to confirm validity and assign labels. label Aug 6, 2025
@daniel-lxs daniel-lxs moved this from Triage to PR [Draft / In Progress] in Roo Code Roadmap Aug 7, 2025
@hannesrudolph hannesrudolph added PR - Draft / In Progress and removed Issue/PR - Triage New issue. Needs quick review to confirm validity and assign labels. labels Aug 7, 2025
@NaccOll NaccOll force-pushed the feature-tool-use branch 4 times, most recently from e711649 to a5df80c Compare August 10, 2025 06:51
@NaccOll
Author

NaccOll commented Aug 10, 2025

@roomote-agent
Review this PR again

roomote bot commented Aug 10, 2025

I'm starting my review of this PR now. I'll examine the Native Tool Call implementation and provide feedback shortly.

roomote bot commented Aug 10, 2025

Review Complete ✅

I've completed my review of the Native Tool Call implementation. Here's my assessment:

Strengths

  1. Well-structured implementation: The conversion from XML to JSON tool calls is cleanly implemented with a proper streaming processor
  2. Good test coverage: The StreamingToolCallProcessor has comprehensive tests covering various scenarios including edge cases
  3. Backward compatibility: The toggle feature allows users to enable/disable tool calls, maintaining compatibility with existing workflows
  4. Provider support: Successfully integrated with OpenRouter, OpenAI Compatible, and DeepSeek providers

📋 Key Components Reviewed

  • StreamingToolCallProcessor - Handles JSON to XML conversion with proper state management
  • ToolRegistry - Manages tool schemas and provides conversion functions
  • ✅ Tool schemas for supported tools (apply_diff, read_file, write_to_file, etc.)
  • ✅ UI integration with ToolCallSettingsControl component
  • ✅ Localization support across all languages

🔍 Observations

  1. Incomplete tool support: ask_followup_question and browser_action are not yet registered in the ToolRegistry (lines 35, 37 commented out)
  2. Special handling for apply_diff: The processor has custom logic for handling apply_diff content transformation (lines 409-430 in tool-call-helper.ts)
  3. Migration strategy: Proper migration added to ensure existing installations get the new setting defaulted to false

Testing Results

  • All tool-call-helper tests pass successfully (13/13)
  • The streaming processor correctly handles:
    • Simple function calls
    • Incremental streaming
    • Nested objects and arrays
    • Multiple tool calls
    • Edge cases and error scenarios

💡 Recommendations

  1. Consider completing support for ask_followup_question and browser_action tools
  2. Document the provider-specific quirks mentioned in the PR description for users
  3. Consider adding integration tests for the full tool call flow with actual API responses

Verdict

The implementation is solid and ready for use. The streaming JSON to XML converter is well-designed, and the feature toggle provides a safe rollout path. The code quality is good with proper error handling and test coverage.

Great work on this feature! 🎉

@NaccOll NaccOll force-pushed the feature-tool-use branch 6 times, most recently from 3dc988a to f7eab93 Compare August 15, 2025 03:13
@NaccOll
Author

NaccOll commented Aug 18, 2025

Native tool call support is complete. I've found a way to use standard tool results to interact with the LLM without major changes.

Overall, it is now less common for later responses in a multi-round conversation to imitate XML and stop returning tool_calls. I've tested the tools I use frequently in my daily work. A few tools that I use less often remain:

  1. browser_action: I haven't used or tested it at all.
  2. fetch_instructions: I've completed the tool call, but I haven't exercised the entire process.
  3. access_mcp_resource and use_mcp_tool: I use sequentialthinking regularly, but not as frequently as other tools.

I use the other tools frequently in my daily work, and most of the problems have been resolved.

@NaccOll NaccOll force-pushed the feature-tool-use branch 4 times, most recently from 75d42c1 to 7b7c6f0 Compare August 19, 2025 11:04
@daniel-lxs daniel-lxs marked this pull request as ready for review August 19, 2025 21:02
@daniel-lxs daniel-lxs moved this from PR [Draft / In Progress] to PR [Needs Prelim Review] in Roo Code Roadmap Aug 21, 2025
@hannesrudolph hannesrudolph moved this from PR [Needs Prelim Review] to PR [Draft / In Progress] in Roo Code Roadmap Aug 21, 2025
@daniel-lxs daniel-lxs marked this pull request as draft August 21, 2025 19:05
@hannesrudolph
Collaborator

Update on this (From @cte): there's a bug in the tool calling that causes a bunch of the evals to fail, so not quite ready to evaluate. Here's what the errors look like:

An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'. The following tool_call_ids did not have response messages: call_F8hwFSLeOwtYsiX31wdZr9lP

@NaccOll
Author

NaccOll commented Aug 22, 2025

Update on this (From @cte): there's a bug in the tool calling that causes a bunch of the evals to fail, so not quite ready to evaluate. Here's what the errors look like:

An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'. The following tool_call_ids did not have response messages: call_F8hwFSLeOwtYsiX31wdZr9lP

This occurs when a single stream returns data for multiple tool calls, a case I had not handled. The issue is now fixed.
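
To illustrate the constraint behind that error, here is a minimal sketch of a check that could run on the message history before a request is sent. It is not the fix applied in this PR. The `ChatMessage` shape is an assumption modeled on OpenAI-style chat messages, and `findUnansweredToolCalls` is a hypothetical helper.

```typescript
// Assumed message shape, modeled on OpenAI-style chat messages.
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string | null;
  tool_calls?: { id: string }[]; // set on assistant messages that call tools
  tool_call_id?: string; // set on tool messages answering a call
}

// Hypothetical helper: list every tool_call_id that has no tool message
// in the run of tool messages directly following its assistant message.
function findUnansweredToolCalls(messages: ChatMessage[]): string[] {
  const missing: string[] = [];
  messages.forEach((msg, i) => {
    const calls = msg.tool_calls ?? [];
    if (msg.role !== "assistant" || calls.length === 0) return;

    const answered = new Set<string>();
    for (const next of messages.slice(i + 1)) {
      if (next.role !== "tool") break; // the run of tool responses has ended
      if (next.tool_call_id) answered.add(next.tool_call_id);
    }
    for (const call of calls) {
      if (!answered.has(call.id)) missing.push(call.id);
    }
  });
  return missing;
}
```

When one assistant message carries two tool calls and only the first receives a tool response, the helper returns the second id, which is the situation the quoted error describes.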

Due to network issues, I have not yet run the evaluation locally.

I plan to run the evaluation locally, covering models such as GPT-5, GPT-4.1, Gemini-2.5-Pro, Gemini-2.5-Flash, and Deepseek-Chat. Once completed, I will refine the corresponding tests and request a review.

Regarding Claude, I have not yet found a solution to address the mixed text returned by OpenRouter, so this issue may be put on hold for now.

- Introduced a new ToolCallSettingsControl component for enabling/disabling tool calls in the settings UI.
- Updated API options to include tool call settings and integrated it into the existing settings view.
- Enhanced localization files to support new tool call settings in multiple languages.
- Added logic to handle tool call enabled state across different components and tools.
Labels
enhancement New feature or request PR - Draft / In Progress size:XXL This PR changes 1000+ lines, ignoring generated files.
Projects
Status: PR [Draft / In Progress]
2 participants