fix: add GLM-4.6 thinking token support for OpenAI-compatible endpoints #8548
Conversation
- Add detection for GLM-4.6 model variants
- Include thinking parameter { type: "enabled" } in requests for GLM-4.6
- Parse thinking tokens using XmlMatcher for <think> tags
- Handle reasoning_content in streaming responses
- Add comprehensive tests for GLM-4.6 functionality
Fixes #8547
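
To make the first two bullets concrete, here is a minimal standalone sketch of how the thinking parameter could be attached when building an OpenAI-compatible request body. This is a simplified illustration, not the exact code in `base-openai-compatible-provider.ts`; the `buildRequestBody` helper and `ChatMessage` type are hypothetical.

```typescript
type ChatMessage = { role: "system" | "user" | "assistant"; content: string }

// Detect GLM-4.6 model variants (case-insensitive, dotted or hyphenated).
function isGLM46Model(modelId: string): boolean {
	const lowerModel = modelId.toLowerCase()
	return lowerModel.includes("glm-4.6") || lowerModel.includes("glm-4-6")
}

// Hypothetical helper: build the request body, adding the thinking flag
// only for GLM-4.6 so other models are unaffected.
function buildRequestBody(modelId: string, messages: ChatMessage[]): Record<string, unknown> {
	const body: Record<string, unknown> = { model: modelId, messages, stream: true }
	if (isGLM46Model(modelId)) {
		// GLM-4.6 only emits thinking tokens when this parameter is present.
		body.thinking = { type: "enabled" }
	}
	return body
}
```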
```typescript
protected isGLM46Model(modelId: string): boolean {
	// Check for various GLM-4.6 model naming patterns
	const lowerModel = modelId.toLowerCase()
	return lowerModel.includes("glm-4.6") || lowerModel.includes("glm-4-6") || lowerModel === "glm-4.6"
}
```
The `isGLM46Model` check has a redundant condition (`lowerModel === "glm-4.6"`) since `includes("glm-4.6")` already covers it. Consider removing the extra check.
```diff
- return lowerModel.includes("glm-4.6") || lowerModel.includes("glm-4-6") || lowerModel === "glm-4.6"
+ return lowerModel.includes("glm-4.6") || lowerModel.includes("glm-4-6")
```
Self-review mode: evaluating my own changes with all the empathy of a linter in a cold data center.
```typescript
}

// Handle reasoning_content if present (for models that support it directly)
if (delta && "reasoning_content" in delta && delta.reasoning_content) {
```
P2: Potential double-emission of reasoning tokens. For GLM-4.6 you already parse via XmlMatcher; if the provider also populates reasoning_content in the same stream, this emits the same reasoning twice. Gate this branch when GLM-4.6 is active or dedupe.
```diff
- if (delta && "reasoning_content" in delta && delta.reasoning_content) {
+ // Handle reasoning_content if present (avoid double-emitting when GLM '<think>' is parsed)
+ if (!isGLM46 && delta && "reasoning_content" in delta && delta.reasoning_content) {
+ 	yield {
+ 		type: "reasoning",
+ 		text: (delta.reasoning_content as string | undefined) || "",
+ 	}
+ }
```
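
To make the failure mode concrete: if a provider both wraps reasoning in `<think>` tags and mirrors it into `reasoning_content`, an ungated handler would yield each reasoning token twice. Below is a standalone sketch of the gating; the `Delta` shape and `emitReasoning` helper are simplified illustrations, not the PR's actual code (the real handler uses the OpenAI SDK's streaming chunk types).

```typescript
// Hypothetical, simplified delta shape for illustration only.
type Delta = { content?: string; reasoning_content?: string }

function* emitReasoning(delta: Delta, isGLM46: boolean): Generator<{ type: "reasoning"; text: string }> {
	// When the GLM-4.6 <think> path is active, the XML parsing owns reasoning
	// emission, so this branch is skipped to avoid emitting the same tokens twice.
	if (!isGLM46 && delta.reasoning_content) {
		yield { type: "reasoning", text: delta.reasoning_content }
	}
}

// A GLM-4.6 delta that mirrors reasoning into both channels:
const delta: Delta = { content: "<think>step 1</think>", reasoning_content: "step 1" }
console.log([...emitReasoning(delta, true)].length) // 0 — the <think> parser handles it
console.log([...emitReasoning(delta, false)].length) // 1 — reasoning_content path emits
```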
```typescript
protected isGLM46Model(modelId: string): boolean {
	// Check for various GLM-4.6 model naming patterns
	const lowerModel = modelId.toLowerCase()
	return lowerModel.includes("glm-4.6") || lowerModel.includes("glm-4-6") || lowerModel === "glm-4.6"
}
```
P3: Minor simplification. The final equality check is redundant because .includes("glm-4.6") already covers it. A concise version improves readability.
```diff
- return lowerModel.includes("glm-4.6") || lowerModel.includes("glm-4-6") || lowerModel === "glm-4.6"
+ protected isGLM46Model(modelId: string): boolean {
+ 	const lowerModel = modelId.toLowerCase()
+ 	return lowerModel.includes("glm-4.6") || lowerModel.includes("glm-4-6")
+ }
```
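
As a quick sanity check, here is how the simplified matcher behaves on a few illustrative model IDs (the IDs below are examples for demonstration, not an exhaustive list from the PR; the method is shown as a plain function for a runnable snippet):

```typescript
function isGLM46Model(modelId: string): boolean {
	const lowerModel = modelId.toLowerCase()
	return lowerModel.includes("glm-4.6") || lowerModel.includes("glm-4-6")
}

console.log(isGLM46Model("GLM-4.6")) // true — case-insensitive match
console.log(isGLM46Model("z-ai/glm-4-6")) // true — hyphenated variant inside a longer ID
console.log(isGLM46Model("glm-4.5")) // false — other GLM versions are not affected
```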
```typescript
import OpenAI from "openai"
import { Anthropic } from "@anthropic-ai/sdk"

import type { ModelInfo } from "@roo-code/types"
```
P3: Unused import `ModelInfo` can trigger `no-unused-vars` in stricter configs.
```diff
- import type { ModelInfo } from "@roo-code/types"
```
Hi, guys. I tried this PR with ik_llama.cpp running Ubergarm's GLM-4.6 model locally with Unsloth's chat template (--jinja --chat-template-file), but I still don't get reasoning in Roo Code. Reasoning works fine in Roo Code with DeepSeek v3.1 running locally, and I also get GLM-4.6 reasoning in the ik_llama.cpp built-in web UI and in the Continue VS Code extension. Do I need to do something else to make it work with this PR? I'm using the OpenAI-compatible endpoint in Roo Code with the "glm-4.6" alias in ik_llama.cpp.
This PR attempts to address Issue #8547 by enabling thinking tokens for GLM-4.6 when using OpenAI-compatible custom endpoints.
Problem
GLM-4.6 was not generating thinking tokens when used through OpenAI-compatible custom endpoints because the required `"thinking": {"type": "enabled"}` parameter was missing from requests.
Solution
- Detect GLM-4.6 model variants
- Add the `thinking: { type: "enabled" }` parameter to requests for GLM-4.6
- Parse thinking tokens using `XmlMatcher` for `<think>` tags
- Handle the `reasoning_content` field in streaming responses
Testing
- Added tests in `base-openai-compatible-provider.spec.ts` covering model detection, the thinking parameter, and token parsing
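
For readers unfamiliar with the `<think>` convention, the sketch below shows the kind of tag-splitting the PR delegates to `XmlMatcher`. It is an illustration only, operating on a fully buffered string; the real `XmlMatcher` works incrementally on streamed chunks and handles tags split across chunk boundaries.

```typescript
type StreamEvent = { type: "reasoning" | "text"; text: string }

// Illustrative only: split a buffered completion into reasoning vs. text events.
function splitThinkTags(content: string): StreamEvent[] {
	const events: StreamEvent[] = []
	const pattern = /<think>([\s\S]*?)<\/think>/g
	let lastIndex = 0
	for (const match of content.matchAll(pattern)) {
		const index = match.index ?? 0
		if (index > lastIndex) {
			events.push({ type: "text", text: content.slice(lastIndex, index) })
		}
		events.push({ type: "reasoning", text: match[1] })
		lastIndex = index + match[0].length
	}
	if (lastIndex < content.length) {
		events.push({ type: "text", text: content.slice(lastIndex) })
	}
	return events
}

console.log(splitThinkTags("<think>plan the fix</think>Here is the answer."))
// [ { type: "reasoning", text: "plan the fix" }, { type: "text", text: "Here is the answer." } ]
```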
Impact
This change only affects GLM-4.6 models when used through OpenAI-compatible endpoints. Other models and providers remain unaffected.
Fixes #8547
Feedback and guidance are welcome!
Important
Adds GLM-4.6 thinking token support for OpenAI-compatible endpoints by detecting models, adding parameters, and parsing tokens.
- Detects GLM-4.6 model variants (`glm-4.6`, `GLM-4.6`, `glm-4-6`, `GLM-4-6`) in `base-openai-compatible-provider.ts`.
- Adds the `thinking: { type: "enabled" }` parameter for GLM-4.6 models in `createStream()`.
- Parses thinking tokens using `XmlMatcher` for `<think>` tags in `createMessage()`.
- Handles the `reasoning_content` field in streaming responses in `createMessage()`.
- Adds tests in `base-openai-compatible-provider.spec.ts` for model detection, thinking parameter addition, and token parsing of both `<think>` tags and `reasoning_content` in responses.