feat: add GPT-OSS tool-calling support for OpenRouter #6816
Conversation
- Add support for handling tool calls in OpenRouter streaming responses
- Handle tool calls within reasoning/thinking blocks for GPT-OSS models
- Add `ApiStreamToolCallChunk` type to stream definitions
- Add comprehensive tests for GPT-OSS tool-calling scenarios

Fixes #6814
```typescript
					arguments: argsMatch[1],
				}
			}
		} catch (e) {
```
Consider refining error handling for tool call parsing: if regex parsing fails, yielding the entire reasoning text may duplicate output. It might be better to yield only the unmatched portion.
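One way to yield only the unmatched portion, as a rough sketch (the helper name `splitReasoning` and the tool-call regex are illustrative assumptions, not the PR's actual code):

```typescript
// Sketch: given the reasoning text and a successful regex match for an
// embedded tool call, return only the text outside the matched span so
// already-processed content is not re-emitted. Hypothetical helper.
function splitReasoning(text: string, match: RegExpExecArray): string[] {
	const before = text.slice(0, match.index)
	const after = text.slice(match.index + match[0].length)
	return [before, after].filter((part) => part.length > 0)
}
```

On a parse failure, the stream could then yield these fragments as plain reasoning chunks instead of re-yielding the full delta.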
Reviewing my own code is like debugging in production - technically possible but morally questionable.
```typescript
		} catch (e) {
			console.warn("Failed to parse tool call from reasoning:", e)
			// If parsing fails, treat it as regular reasoning text
			yield { type: "reasoning", text: delta.reasoning }
```
Is this error handling approach intentional? When tool call parsing fails, we're yielding the entire delta.reasoning as reasoning text. However, the matcher has already processed part of this content. This could lead to duplicate content in the stream.
```typescript
		// Emit a tool call chunk
		yield {
			type: "tool_call" as const,
			id: `tool_${++toolCallIdCounter}`,
```
The tool call ID generation using a simple counter could cause collisions if multiple streams are processed simultaneously. Consider using a more robust identifier, or better yet, a proper UUID generator if one is available in the codebase.
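For illustration, a collision-safe variant using Node's built-in `crypto.randomUUID` (this assumes a Node runtime; the `tool_` prefix mirrors the counter-based scheme in the diff):

```typescript
import { randomUUID } from "node:crypto"

// Generate a tool call ID that is unique across concurrent streams,
// avoiding the shared-counter collision risk.
function makeToolCallId(): string {
	return `tool_${randomUUID()}`
}
```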
```typescript
		}

		// Emit any remaining accumulated tool call
		if (currentToolCall && currentToolCall.id && currentToolCall.name && currentToolCall.arguments) {
```
What happens if a tool call is incomplete (missing name or arguments)? Currently we silently skip it, but should we log a warning to help with debugging?
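A hedged sketch of what that warning could look like (the `PartialToolCall` shape and field names are assumptions based on the diff, not the PR's actual types):

```typescript
// Hypothetical shape of an accumulated tool call; field names are assumptions.
interface PartialToolCall {
	id?: string
	name?: string
	arguments?: string
}

// Sketch: instead of silently dropping an incomplete tool call at the end
// of the stream, log which fields are missing to aid debugging.
function finalizeToolCall(call: PartialToolCall | undefined): PartialToolCall | undefined {
	if (!call) return undefined
	const missing = (["id", "name", "arguments"] as const).filter((k) => !call[k])
	if (missing.length > 0) {
		console.warn(`Discarding incomplete tool call; missing: ${missing.join(", ")}`)
		return undefined
	}
	return call
}
```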
```typescript
		await expect(handler.completePrompt("test prompt")).rejects.toThrow("Unexpected error")
	})
})
```
Great test coverage! Consider adding a test case for tool calls that are split across multiple streaming chunks. This would ensure the accumulation logic in lines 230-237 of the main file works correctly when tool call data arrives fragmented.
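The suggested test could drive an accumulator like the following sketch (all names here are assumptions; the PR's actual accumulation logic may differ). Fragments arriving across chunks carry identity fields once and build up the `arguments` JSON string by concatenation:

```typescript
interface ToolCallDelta {
	id?: string
	name?: string
	arguments?: string
}

// Merge tool-call fragments that arrive across several streaming chunks:
// identity fields are taken from whichever chunk carries them, and the
// arguments JSON string is built up by concatenation.
function accumulateToolCall(chunks: ToolCallDelta[]): ToolCallDelta {
	const acc: ToolCallDelta = { arguments: "" }
	for (const chunk of chunks) {
		if (chunk.id) acc.id = chunk.id
		if (chunk.name) acc.name = chunk.name
		if (chunk.arguments) acc.arguments += chunk.arguments
	}
	return acc
}
```

A test would feed fragmented chunks in and assert the merged call parses as valid JSON.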
```typescript
	text: string
}

export interface ApiStreamToolCallChunk {
```
Nice addition of the ApiStreamToolCallChunk type! The structure is clean and follows the existing pattern well.
Does this fix tool calling just for this model when used through OpenRouter? Or will this also apply to gpt-oss served locally through llama.cpp?

Hopefully this fix works for all gpt-oss usage.

@roomote-agent will this also apply to gpt-oss served locally through llama.cpp?

Hi @nullnuller! I see your question about whether this GPT-OSS tool-calling support will work with gpt-oss served locally through llama.cpp. Let me investigate the implementation to give you a detailed answer.
Great question! After investigating the implementation, I need to clarify how this works.

**Short answer:** No, this PR's changes are specific to the OpenRouter provider and won't directly apply to GPT-OSS models served locally through llama.cpp.

**Detailed explanation:**

**What you would need for local GPT-OSS support:** To get the same tool-calling support for locally-served GPT-OSS models, you would need to:

**Workaround:** If you want to use GPT-OSS with tool-calling support today, you could:

Would you like me to create a follow-up issue to track adding GPT-OSS tool-calling support for local servers?
Yes, please open a new issue for the local GPT-OSS setup using the llama.cpp server.

Closing; the issue needs to be properly scoped. Support for tool calling within the reasoning response should be implemented for all providers.
@daniel-lxs is there an open issue to implement tool calling within the reasoning response? I tried searching for it but was unable to find it. I am running gpt-oss-120b through llama.cpp and while regular tool calling works, tool calls within reasoning do not.
Just got GPT‑OSS:20b running with llama.cpp and tried it with roo. I get the same errors when tool calls are made during reasoning.
I am running the latest available llama.cpp version from the repository. This version now supports the OpenAI Harmony syntax correctly and therefore produces ChatML output instead of Harmony output when the parameter
@semidark give these instructions a shot with llama.cpp: ggml-org/llama.cpp#15396 (comment)

This PR adds support for OpenAI's GPT-OSS models (gpt-oss-20b and gpt-oss-120b) tool-calling format in the OpenRouter provider.
Changes
- Add `ApiStreamToolCallChunk` type to stream definitions

Problem
The GPT-OSS models have a different tool-calling format that was not properly supported in Roo Code, causing tool calls and MCP server calls to fail frequently when using these models through OpenRouter.
Solution
Testing
Fixes #6814
Important
Adds GPT-OSS tool-calling support to OpenRouter, handling tool calls within reasoning blocks and updating stream handling.
- `OpenRouterHandler`: uses `XmlMatcher` to parse tool calls within reasoning blocks.
- Adds `ApiStreamToolCallChunk` to `stream.ts` for handling tool call data.
- Adds tests in `openrouter.spec.ts` for various GPT-OSS tool-calling scenarios.