
Conversation

@Pleasurecruise
Collaborator

@Pleasurecruise Pleasurecruise commented Jan 20, 2026

  1. ollama-ai-provider-v2 itself only accepts boolean values for think, so the "thinking depth/length" setting we pass in is filtered out during validation: the zod schema in parseProviderOptions discards 'low' | 'medium' | 'high', and the configured thinking depth never takes effect. The latest version (3.0.1) changes the type to boolean | 'low' | 'medium' | 'high' and exports ollamaProviderOptions, but this project's dependency tree is too deep to upgrade to it, so the change has to be backported with a patch (a minimal sketch of the schema change follows this list).

  2. When ollama-ai-provider-v2 streams a response with reasoning enabled, some models emit text chunks before reasoning chunks. ollamaReasoningOrderMiddleware is therefore added and injected whenever Ollama reasoning is enabled: during streaming it buffers text chunks, and once reasoning chunks appear it places the reasoning first, guaranteeing that the thinking block is displayed before the main text (see the wiring sketch after this list).
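
For context, here is a minimal sketch of what the backported schema change amounts to; the shape is illustrative and not the exact patch contents:

    import { z } from 'zod'

    // Before the patch: parseProviderOptions only lets booleans through,
    // so 'low' | 'medium' | 'high' is silently dropped.
    const thinkBefore = z.boolean().optional()

    // After the patch: booleans and effort levels are both accepted.
    const thinkAfter = z.union([z.boolean(), z.enum(['low', 'medium', 'high'])]).optional()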


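A hedged sketch of how such a middleware gets attached with the AI SDK; the actual wiring lives in AiSdkMiddlewareBuilder.ts, and the model id here is illustrative:

    import { wrapLanguageModel } from 'ai'
    import { ollama } from 'ollama-ai-provider-v2'
    import { ollamaReasoningOrderMiddleware } from './ollamaReasoningOrderMiddleware'

    // Wrap the base model so every streamed chunk passes through the
    // reordering middleware before reaching the UI.
    const model = wrapLanguageModel({
      model: ollama('qwen3'), // illustrative model id
      middleware: ollamaReasoningOrderMiddleware
    })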

Copilot AI review requested due to automatic review settings January 20, 2026 01:01
Contributor

Copilot AI left a comment

Pull request overview

This pull request addresses Ollama performance issues related to reasoning/thinking functionality by:

  1. Patching ollama-ai-provider-v2 to support reasoning effort levels ('low', 'medium', 'high') in addition to boolean values for the think parameter, as the current dependency version only accepts boolean values
  2. Adding middleware to reorder reasoning and text chunks in streaming responses, ensuring reasoning content appears before response text for better UX
  3. Updating provider options to properly handle the new reasoning effort values

Changes:

  • Modified buildOllamaProviderOptions to use the new OllamaProviderOptions type and set the think parameter based on reasoning effort for gpt-oss models
  • Created new middleware ollamaReasoningOrderMiddleware to buffer and reorder stream chunks so reasoning appears before text
  • Updated patch for ollama-ai-provider-v2@1.5.5 to backport support for string reasoning effort values and export OllamaProviderOptions

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 3 comments.

Show a summary per file

  • src/renderer/src/aiCore/utils/options.ts: Changed the import from OllamaCompletionProviderOptions to OllamaProviderOptions and updated the logic that sets the think parameter based on reasoning effort
  • src/renderer/src/aiCore/middleware/ollamaReasoningOrderMiddleware.ts: New middleware that buffers text chunks and reorders them to appear after reasoning chunks in streaming responses
  • src/renderer/src/aiCore/middleware/AiSdkMiddlewareBuilder.ts: Integrated the new Ollama reasoning order middleware for cases when reasoning is enabled
  • patches/ollama-ai-provider-v2@1.5.5.patch: Patched the ollama-ai-provider-v2 library to support 'low', 'medium', 'high' values for the think parameter and export OllamaProviderOptions
  • pnpm-lock.yaml: Updated patch hash for ollama-ai-provider-v2 to reflect the new patch content
  • package.json: Updated patch file path for ollama-ai-provider-v2 to use the new naming convention
Files not reviewed (1)
  • pnpm-lock.yaml: Language not supported
Comments suppressed due to low confidence (1)

patches/ollama-ai-provider-v2@1.5.5.patch:102

  • The patch moves temperature and top_p into the options object without checking if they are undefined. If these values are undefined, they will be explicitly set as undefined in the options object, which could cause issues with the Ollama API. Consider using conditional spreading like the original code did, or add a check to only include these values when they are defined.
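
The conditional-spread pattern the comment refers to looks roughly like this (an illustrative sketch, not the patch's actual code):

    // Spread each key in only when the value is defined, so the request
    // body never carries explicit undefined fields.
    function buildOptions(temperature?: number, topP?: number) {
      return {
        ...(temperature !== undefined && { temperature }),
        ...(topP !== undefined && { top_p: topP })
      }
    }

    buildOptions(0.7) // => { temperature: 0.7 }, no top_p key at all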

      }
    } else {
-     providerOptions.think = !['none', undefined].includes(reasoningEffort)
+     providerOptions.think = reasoningEffort !== 'none'

Copilot AI Jan 20, 2026

The condition reasoningEffort !== 'none' will evaluate to true when reasoningEffort is undefined, causing providerOptions.think to be set to true even when reasoning effort is not explicitly configured. This should explicitly check for valid values to avoid unintended behavior. Consider using a condition like reasoningEffort && reasoningEffort !== 'none' or checking against specific valid values.

Suggested change
-     providerOptions.think = reasoningEffort !== 'none'
+     providerOptions.think = !!reasoningEffort && reasoningEffort !== 'none'

Comment on lines 44 to 48
    if (isReasoningPart(chunk)) {
      hasReasoning = true
      controller.enqueue(chunk)
      flushBufferedText(controller)
      return

Copilot AI Jan 20, 2026

The buffering logic has an issue: when text chunks are buffered and then reasoning chunks arrive, the buffered text is flushed immediately after the first reasoning chunk is enqueued. This causes the text to appear between reasoning chunks rather than after all reasoning chunks. For example, if the order is text-start, text-delta, reasoning-start, reasoning-delta, reasoning-end, the output will be reasoning-start, text-start, text-delta, reasoning-delta, reasoning-end. The flush should happen after all reasoning chunks, not after the first one. Consider moving the flushBufferedText call to only execute after reasoning-end or tracking when reasoning has fully completed.
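
One way to implement that suggestion, sketched under the assumption that the stream uses the 'reasoning-end' chunk type mentioned in the comment above:

    if (isReasoningPart(chunk)) {
      hasReasoning = true
      controller.enqueue(chunk)
      // Defer the flush until reasoning has fully completed, so buffered
      // text cannot land in between reasoning chunks.
      if (chunk.type === 'reasoning-end') {
        flushBufferedText(controller)
      }
      return
    }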

Comment on lines 42 to 65
    new TransformStream<LanguageModelV2StreamPart, LanguageModelV2StreamPart>({
      transform(chunk, controller) {
        // Once any reasoning chunk arrives, emit it and release the buffered text.
        if (isReasoningPart(chunk)) {
          hasReasoning = true
          controller.enqueue(chunk)
          flushBufferedText(controller)
          return
        }

        // Until reasoning has been seen, hold text chunks back.
        if (!hasReasoning && isTextPart(chunk)) {
          bufferedText.push(chunk)
          return
        }

        // If the stream ends without any reasoning, release the text as-is.
        if (!hasReasoning && (chunk.type === 'finish' || chunk.type === 'error')) {
          flushBufferedText(controller)
        }

        controller.enqueue(chunk)
      },
      flush(controller) {
        flushBufferedText(controller)
      }
    })

Copilot AI Jan 20, 2026

Superfluous argument passed to function TransformStream.

@Pleasurecruise
Collaborator Author

Maybe there are still issues that need further changes.

@DeJeune
Collaborator

DeJeune commented Jan 20, 2026

I don't think the middleware change here is right. For the second issue, the fix should be to change when reasoning-end is emitted. Please refer to: vercel/ai#11751



@Pleasurecruise
Collaborator Author

I tried making some adjustments; I still need help checking whether anything is missing from the patch changes. These two issues do exist.

@DeJeune
Collaborator

DeJeune commented Jan 20, 2026

Can I modify the patch directly?



@Pleasurecruise
Collaborator Author

Pleasurecruise commented Jan 20, 2026

I'm not sure. For one thing, the patch already touches a lot of places. For another, if I only change the spot below:

    processDelta(delta, controller) {
      this.processTextContent(delta, controller); // Process text first
      this.processThinking(delta, controller);    // Then process thinking/reasoning
      this.processToolCalls(delta, controller);
    }

the thinking block only displays a single character, apparently because the output gets processed more than once.



@kangfenmao
Collaborator

CI Failed

@Pleasurecruise
Collaborator Author

Pleasurecruise commented Jan 23, 2026

@kangfenmao The intermittent bug caused by race conditions now passes. The ollama-ai-provider-v2 dependency seems to need quite a few changes; I'm not sure whether it's worth upgrading it or maintaining a separate cherrystudio/ollama-ai-provider-v2.



@DeJeune DeJeune self-assigned this Jan 26, 2026
@DeJeune DeJeune added this to the v1.7.16 milestone Jan 26, 2026