fix: prevent duplicate BOS tokens with DeepSeek V3.1 in llama.cpp #7501
Conversation
…ns with DeepSeek V3.1

- Added openAiSkipSystemMessage configuration option for OpenAI Compatible providers
- When enabled for DeepSeek models, merges system prompt into first user message
- Prevents duplicate BOS tokens when using llama.cpp with --jinja flag
- Added comprehensive tests for the new functionality

Fixes #7500
Reviewing my own code is like debugging in a mirror - everything looks backward but the bugs are still mine.
```typescript
if (deepseekReasoner) {
	convertedMessages = convertToR1Format([{ role: "user", content: systemPrompt }, ...messages])
} else if (skipSystemMessage) {
```
I notice there's duplicate logic here between streaming (lines 108-129) and non-streaming (lines 248-268) modes. Could we extract this into a helper method like prepareMessagesWithSkipSystemMessage() to reduce duplication and improve maintainability?
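For reference, a minimal sketch of what such a helper could look like, assuming the Anthropic-style message shapes used elsewhere in openai.ts (the function name and exact merge rules are suggestions, not code from this PR):

```typescript
// Hypothetical helper — a sketch only, not code from this PR. The message
// shape is simplified from the SDK types used in openai.ts.
type MessageParam = {
	role: "user" | "assistant"
	content: string | Array<{ type: string; text?: string }>
}

function prepareMessagesWithSkipSystemMessage(
	systemPrompt: string,
	messages: MessageParam[],
): MessageParam[] {
	if (messages.length === 0 || messages[0].role !== "user") {
		// Nothing to merge into: send the system prompt as a plain user message.
		return [{ role: "user", content: systemPrompt }, ...messages]
	}
	const [first, ...rest] = messages
	// Fold the system prompt into the first user message so the backend's
	// chat template only renders one leading (BOS-bearing) message.
	const content =
		typeof first.content === "string"
			? `${systemPrompt}\n\n${first.content}`
			: [{ type: "text", text: systemPrompt }, ...first.content]
	return [{ ...first, content }, ...rest]
}
```

Both the streaming and non-streaming paths could then call this one function before building the request.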
```typescript
// Check if we should skip system message for DeepSeek V3 models with llama.cpp
const skipSystemMessage =
	this.options.openAiSkipSystemMessage &&
	(modelId.toLowerCase().includes("deepseek") || modelId.toLowerCase().includes("deepseek-v3"))
```
The model detection using includes("deepseek") might be too broad and could match unintended models. Would it be more robust to use a specific list of model IDs or a regex pattern?
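One way to tighten the check; the pattern below is illustrative only, and the example values stand in for the handler's options and model ID:

```typescript
// Illustrative only — anchor the match to DeepSeek V3.x identifiers instead
// of any model ID that merely contains "deepseek".
const DEEPSEEK_V3_PATTERN = /deepseek[-_.]?v3(\.\d+)?/i

// Example values standing in for the handler's fields:
const options = { openAiSkipSystemMessage: true }
const modelId = "deepseek-v3.1"

const skipSystemMessage =
	options.openAiSkipSystemMessage === true && DEEPSEEK_V3_PATTERN.test(modelId) // true here; false for e.g. "deepseek-r1"
```

Note also that the second `includes("deepseek-v3")` clause in the current code is redundant, since any ID containing "deepseek-v3" already contains "deepseek".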
```typescript
openAiStreamingEnabled: z.boolean().optional(),
openAiHostHeader: z.string().optional(), // Keep temporarily for backward compatibility during migration.
openAiHeaders: z.record(z.string(), z.string()).optional(),
openAiSkipSystemMessage: z.boolean().optional(), // Skip system message for models that auto-add BOS tokens (e.g., llama.cpp with --jinja)
```
This comment is helpful, but could we expand it to explain when users should enable this option? For example: 'Enable this if you see duplicate BOS token warnings with DeepSeek V3.1 and llama.cpp'
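For example, the expanded comment could read as follows (wording is a suggestion, not the comment merged in this PR):

```typescript
import { z } from "zod"

// Suggested wording only:
const openAiSchemaExcerpt = z.object({
	// Merge the system prompt into the first user message. Enable this if you
	// see duplicate BOS token warnings, e.g. with DeepSeek V3.1 served by
	// llama.cpp with --jinja, whose chat template adds its own BOS token.
	openAiSkipSystemMessage: z.boolean().optional(),
})
```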
```typescript
vi.mock("openai")

describe("OpenAI Handler - DeepSeek V3 BOS Token Handling", () => {
```
Great test coverage! Consider adding a few edge cases (a sketch follows this list):
- What happens when the system prompt is empty?
- Behavior with complex message content (arrays with multiple text/image parts)?
- Interaction with R1 format when both openAiR1FormatEnabled and openAiSkipSystemMessage are true?
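A sketch of how those cases might be added to the spec file (vitest; names are assumed, and it.todo keeps them as placeholders until implemented):

```typescript
import { describe, it } from "vitest"

// Sketch only — suggested additions to openai-deepseek-bos.spec.ts.
describe("OpenAI Handler - DeepSeek V3 BOS Token Handling (edge cases)", () => {
	it.todo("does not emit an empty leading user message when the system prompt is empty")
	it.todo("prepends the system prompt as a text part when the first user message has array content (text and image parts)")
	it.todo("lets convertToR1Format take precedence when both openAiR1FormatEnabled and openAiSkipSystemMessage are true")
})
```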
```typescript
let convertedMessages

// Check if we should skip system message for DeepSeek V3 models with llama.cpp
const skipSystemMessage =
```
Instead of hardcoding this for DeepSeek, could this feature be useful for other llama.cpp deployments? Consider renaming the option to something more generic like mergeSystemIntoFirstUser to indicate the behavior rather than the specific use case.
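Under that framing, the schema entry might look like this (the name reflects the reviewer's suggestion, not merged code):

```typescript
import { z } from "zod"

// Hypothetical, behavior-named option:
const providerSettingsExcerpt = z.object({
	// Fold the system prompt into the first user message, useful for any
	// backend whose chat template supplies its own BOS/system handling.
	mergeSystemIntoFirstUser: z.boolean().optional(),
})
```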
Closing this PR as the approach has fundamental limitations. The core issue is that we cannot reliably detect if llama.cpp is being used at runtime - we can only guess based on model names, which is not a sustainable solution. Merging system messages into user messages also changes the semantic structure of the conversation in ways that could affect model behavior.
This PR addresses Issue #7500 by adding a configuration option to prevent duplicate BOS tokens when using DeepSeek V3.1 with llama.cpp.
Problem
When using DeepSeek V3.1 through the OpenAI Compatible provider with llama.cpp (with the --jinja flag enabled), users were getting a warning about duplicate BOS tokens. This happens because llama.cpp automatically adds a BOS token, but Roo Code was also sending messages in a format that triggered another BOS token addition.

Solution
Added a new configuration option openAiSkipSystemMessage that, when enabled for DeepSeek models, merges the system prompt into the first user message instead of sending it as a separate system message, preventing the duplicate BOS token.
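To illustrate the transformation (message shapes assumed from the description, values hypothetical):

```typescript
// Without the option, llama.cpp's --jinja template adds a BOS token and the
// separately rendered system message triggers a second one:
const withoutOption = [
	{ role: "system", content: "You are a helpful assistant." },
	{ role: "user", content: "Hello" },
]

// With openAiSkipSystemMessage enabled for a DeepSeek model, the system
// prompt is merged into the first user message, so only one BOS is emitted:
const withOption = [{ role: "user", content: "You are a helpful assistant.\n\nHello" }]
```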
Changes

- Added openAiSkipSystemMessage boolean option to the OpenAI provider settings schema
- Updated OpenAiHandler to detect DeepSeek models and apply the skip logic when configured

Testing

- Added comprehensive tests for the new functionality (openai-deepseek-bos.spec.ts)
Usage
Users experiencing the duplicate BOS token issue with DeepSeek V3.1 and llama.cpp can enable the openAiSkipSystemMessage option in their OpenAI Compatible provider configuration.

Fixes #7500
Important

Adds openAiSkipSystemMessage option to prevent duplicate BOS tokens with DeepSeek V3.1 in openai.ts, with comprehensive tests.

- Adds openAiSkipSystemMessage option to prevent duplicate BOS tokens with DeepSeek V3.1 in openai.ts.
- Updates openAiSchema in provider-settings.ts to include openAiSkipSystemMessage.
- Adds openai-deepseek-bos.spec.ts with 10 test cases for various scenarios, ensuring correct behavior with and without the new option.
- Updates OpenAiHandler in openai.ts to detect DeepSeek models and apply skip logic when configured.