Conversation

@roomote (roomote bot, Contributor) commented Aug 11, 2025

Summary

This PR addresses issue #6942 by improving GLM-4.5 model handling to prevent hallucination and enhance tool understanding.

Problem

GLM-4.5 models exhibited three problems:

  • Hallucinating files that do not exist, even after code indexing
  • Misunderstanding Roo Code's internal tool-calling protocol
  • Failing to condense content within context limits

Solution

Enhanced the ZAiHandler with GLM-specific improvements:

1. System Prompt Enhancements

  • Added clear instructions to prevent file hallucination
  • Included explicit tool usage protocol guidelines
  • Added content management instructions for better response quality
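The prompt augmentation described above might look roughly like the following sketch. The guardrail wording and the helper name `enhanceSystemPrompt` are assumptions for illustration, not the PR's actual code:

```typescript
// Hypothetical GLM-specific guardrails prepended to the system prompt.
const GLM_GUARDRAILS = [
	"Never reference files you have not seen in tool results; do not invent paths.",
	"Follow the XML tool-calling protocol exactly as specified.",
	"Condense long content so responses stay within the context limit.",
].join("\n")

function enhanceSystemPrompt(systemPrompt: string, isGLM45: boolean): string {
	// Non-GLM models keep their prompt unchanged.
	return isGLM45 ? `${GLM_GUARDRAILS}\n\n${systemPrompt}` : systemPrompt
}
```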

2. Message Preprocessing

  • Enhanced message formatting for better GLM understanding
  • Added clear markers for tool execution results
  • Improved XML tag formatting in assistant messages
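A minimal sketch of the tool-result marking idea, assuming marker strings chosen for clarity (the PR's literal marker text is not shown in this conversation):

```typescript
// Wrap tool execution results in explicit markers so the model can
// distinguish them from ordinary user text.
function markToolResult(text: string): string {
	return `[TOOL RESULT]\n${text}\n[END TOOL RESULT]`
}

function preprocessUserText(text: string, looksLikeToolResult: boolean): string {
	// Plain user text passes through untouched.
	return looksLikeToolResult ? markToolResult(text) : text
}
```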

3. Model-Specific Parameters

  • Adjusted max_tokens to 32768 for GLM models (prevents issues with very high limits)
  • Added top_p, frequency_penalty, and presence_penalty settings
  • Enhanced completePrompt method with instruction prefix
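The parameter adjustment might be sketched as below. The conversation does not show the exact `top_p` and penalty values the PR uses, so the numbers here are placeholders:

```typescript
// Illustrative GLM-specific sampling parameters; values are assumptions.
interface SamplingParams {
	top_p?: number
	frequency_penalty?: number
	presence_penalty?: number
}

function glmSamplingParams(isGLM45: boolean): SamplingParams {
	if (!isGLM45) return {} // other models keep provider defaults
	return { top_p: 0.95, frequency_penalty: 0.1, presence_penalty: 0.1 }
}
```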

4. Comprehensive Testing

  • Added tests for GLM-specific system prompt enhancements
  • Added tests for token limit adjustments
  • Added tests for GLM-4.5 and GLM-4.5-Air models
  • All existing tests pass without regression

Testing

  • ✅ All unit tests pass
  • ✅ Linting checks pass
  • ✅ Type checking passes

Related Issue

Fixes #6942


Important

Enhances ZAiHandler for GLM-4.5 models to prevent hallucinations and improve tool understanding with specific prompt and parameter adjustments.

  • Behavior:
    • Enhanced ZAiHandler to prevent hallucinations and improve tool understanding for GLM-4.5 models.
    • Added GLM-specific instructions to system prompts in createMessage().
    • Adjusted max_tokens to 32768 and added top_p, frequency_penalty, and presence_penalty for GLM models.
    • Enhanced completePrompt() with GLM-specific instruction prefix.
  • Message Preprocessing:
    • Improved XML tag formatting in preprocessMessages() for better GLM understanding.
    • Added markers for tool execution results in user messages.
  • Testing:
    • Added tests for GLM-specific enhancements in zai.spec.ts.
    • Verified system prompt enhancements, token adjustments, and model-specific parameters.
    • Ensured all existing tests pass without regression.

This description was created by Ellipsis for 667a79b.

…nce tool understanding

- Add GLM-specific system prompt enhancements to prevent file hallucination
- Include clear instructions for tool usage protocol and content management
- Implement message preprocessing for better GLM model understanding
- Add token limit adjustments and model-specific parameters for GLM-4.5
- Enhance completePrompt method with instruction prefix for GLM models
- Add comprehensive tests for GLM-specific functionality

Fixes #6942
@roomote (roomote bot) left a comment:

Reviewing my own code is like debugging in a mirror - everything looks backwards but the bugs are still mine.

type MainlandZAiModelId,
ZAI_DEFAULT_TEMPERATURE,
} from "@roo-code/types"
import { Anthropic } from "@anthropic-ai/sdk"

The Anthropic import is included but not directly used in the implementation. Is this intentional? The ApiStream return type annotation also seems to be missing from the override declaration. Consider cleaning up unused imports or adding the proper type annotation:

Suggested change:

- import { Anthropic } from "@anthropic-ai/sdk"
+ override async *createMessage(
+ 	systemPrompt: string,
+ 	messages: Anthropic.Messages.MessageParam[],
+ 	metadata?: ApiHandlerCreateMessageMetadata,
+ ): AsyncGenerator<ApiStream>


// Check if the model is GLM-4.5 or GLM-4.5-Air
const modelId = options.apiModelId || defaultModelId
this.isGLM45 = modelId.includes("glm-4.5")

Could this cause a runtime error if both options.apiModelId and defaultModelId are undefined? Consider adding a null check:

Suggested change:

- this.isGLM45 = modelId.includes("glm-4.5")
+ this.isGLM45 = modelId?.includes("glm-4.5") ?? false


// For GLM models, we may need to adjust the max_tokens to leave room for proper responses
// GLM models sometimes struggle with very high token limits
const adjustedMaxTokens = this.isGLM45 && max_tokens ? Math.min(max_tokens, 32768) : max_tokens

The 32768 token limit is hard-coded here and on line 100. Would it make sense to extract this as a constant like GLM_MAX_TOKENS = 32768 for better maintainability?
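Applying this suggestion could look like the sketch below; the constant name `GLM_MAX_TOKENS` and the helper `clampGlmMaxTokens` are hypothetical:

```typescript
// Lift the magic number into one named constant shared by both call sites.
const GLM_MAX_TOKENS = 32768

function clampGlmMaxTokens(maxTokens: number | undefined, isGLM45: boolean): number | undefined {
	// Only GLM models are capped; an unset limit is passed through as-is.
	return isGLM45 && maxTokens ? Math.min(maxTokens, GLM_MAX_TOKENS) : maxTokens
}
```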

const processedContent = msg.content.map((block: any) => {
if (block.type === "text") {
// Add clear markers for tool results to help GLM understand context
if (block.text.includes("[ERROR]") || block.text.includes("Error:")) {

This string matching logic for detecting errors/success might miss edge cases. What happens if a message contains both "Error:" and "successfully"? Consider using more robust detection or documenting the precedence rules.
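One possible precedence rule, sketched here as an assumption rather than the PR's actual behavior: check error markers before success markers, so a message containing both is flagged conservatively as an error.

```typescript
type Outcome = "error" | "success" | "neutral"

function classifyToolOutput(text: string): Outcome {
	// Error markers win over success markers when both are present.
	if (text.includes("[ERROR]") || text.includes("Error:")) return "error"
	if (text.includes("successfully")) return "success"
	return "neutral"
}
```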

)
})

it("should enhance system prompt for GLM-4.5 models", async () => {

Good test coverage for the GLM-specific enhancements! Consider adding an edge case test for when modelId is undefined to ensure the code handles it gracefully.
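The suggested edge case would exercise a guard like the one sketched below (the helper name `detectGLM45` is hypothetical; it assumes the optional-chaining form from the earlier suggestion):

```typescript
// An undefined model id must not throw and must not be treated as GLM-4.5.
function detectGLM45(modelId: string | undefined): boolean {
	return modelId?.includes("glm-4.5") ?? false
}
```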

@hannesrudolph added the "Issue/PR - Triage" label Aug 12, 2025
@daniel-lxs
Copy link
Member

Closing, problem with the model

@daniel-lxs daniel-lxs closed this Aug 13, 2025
@github-project-automation github-project-automation bot moved this from Triage to Done in Roo Code Roadmap Aug 13, 2025
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Aug 13, 2025

Labels

  • bug: Something isn't working
  • Issue/PR - Triage: New issue. Needs quick review to confirm validity and assign labels.
  • size:L: This PR changes 100-499 lines, ignoring generated files.

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

prevent GLM 4.5 from self-hallucinating, understand the internal tool calling protocol, and allow it to condense content

4 participants