fix: ollama performance #12526
Conversation
Pull request overview
This pull request addresses Ollama performance issues related to reasoning/thinking functionality by:
- Patching ollama-ai-provider-v2 to support reasoning effort levels ('low', 'medium', 'high') in addition to boolean values for the `think` parameter, since the current dependency version only accepts booleans
- Adding middleware to reorder reasoning and text chunks in streaming responses, ensuring reasoning content appears before response text for better UX
- Updating provider options to properly handle the new reasoning effort values

Changes:

- Modified `buildOllamaProviderOptions` to use the new `OllamaProviderOptions` type and set the `think` parameter based on reasoning effort for gpt-oss models (see the sketch after this list)
- Created new middleware `ollamaReasoningOrderMiddleware` to buffer and reorder stream chunks so reasoning appears before text
- Updated the patch for ollama-ai-provider-v2@1.5.5 to backport support for string reasoning effort values and export `OllamaProviderOptions`
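As a rough illustration of the mapping described above, here is a minimal sketch. It assumes the patched `OllamaProviderOptions` export and simplifies whatever `buildOllamaProviderOptions` actually does in `options.ts`; the signature and the gpt-oss check are illustrative only.

```ts
import type { OllamaProviderOptions } from 'ollama-ai-provider-v2'

type ReasoningEffort = 'none' | 'low' | 'medium' | 'high' | undefined

// Simplified sketch: gpt-oss models receive the effort level directly,
// while other models only get a boolean on/off flag.
function buildOllamaProviderOptions(modelId: string, reasoningEffort: ReasoningEffort): OllamaProviderOptions {
  const providerOptions: OllamaProviderOptions = {}

  if (modelId.includes('gpt-oss')) {
    // The backported patch lets `think` carry 'low' | 'medium' | 'high'.
    providerOptions.think = reasoningEffort && reasoningEffort !== 'none' ? reasoningEffort : false
  } else {
    // Other models: only a boolean toggle is meaningful.
    providerOptions.think = !!reasoningEffort && reasoningEffort !== 'none'
  }

  return providerOptions
}
```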
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/renderer/src/aiCore/utils/options.ts | Changed import from OllamaCompletionProviderOptions to OllamaProviderOptions and updated logic for setting the think parameter based on reasoning effort |
| src/renderer/src/aiCore/middleware/ollamaReasoningOrderMiddleware.ts | New middleware that buffers text chunks and reorders them to appear after reasoning chunks in streaming responses |
| src/renderer/src/aiCore/middleware/AiSdkMiddlewareBuilder.ts | Integrated the new Ollama reasoning order middleware for cases when reasoning is enabled |
| patches/ollama-ai-provider-v2@1.5.5.patch | Patched the ollama-ai-provider-v2 library to support 'low', 'medium', 'high' values for the think parameter and export OllamaProviderOptions |
| pnpm-lock.yaml | Updated patch hash for ollama-ai-provider-v2 to reflect the new patch content |
| package.json | Updated patch file path for ollama-ai-provider-v2 to use the new naming convention |
Files not reviewed (1)
- pnpm-lock.yaml: Language not supported
Comments suppressed due to low confidence (1)
patches/ollama-ai-provider-v2@1.5.5.patch:102
- The patch moves temperature and top_p into the options object without checking if they are undefined. If these values are undefined, they will be explicitly set as undefined in the options object, which could cause issues with the Ollama API. Consider using conditional spreading like the original code did, or add a check to only include these values when they are defined.
```diff
   }
 } else {
-  providerOptions.think = !['none', undefined].includes(reasoningEffort)
+  providerOptions.think = reasoningEffort !== 'none'
```
Copilot AI · Jan 20, 2026
The condition reasoningEffort !== 'none' will evaluate to true when reasoningEffort is undefined, causing providerOptions.think to be set to true even when reasoning effort is not explicitly configured. This should explicitly check for valid values to avoid unintended behavior. Consider using a condition like reasoningEffort && reasoningEffort !== 'none' or checking against specific valid values.
Suggested change:

```diff
-providerOptions.think = reasoningEffort !== 'none'
+providerOptions.think = !!reasoningEffort && reasoningEffort !== 'none'
```
```ts
if (isReasoningPart(chunk)) {
  hasReasoning = true
  controller.enqueue(chunk)
  flushBufferedText(controller)
  return
```
Copilot AI · Jan 20, 2026
The buffering logic has an issue: when text chunks are buffered and then reasoning chunks arrive, the buffered text is flushed immediately after the first reasoning chunk is enqueued. This causes the text to appear between reasoning chunks rather than after all reasoning chunks. For example, if the order is text-start, text-delta, reasoning-start, reasoning-delta, reasoning-end, the output will be reasoning-start, text-start, text-delta, reasoning-delta, reasoning-end. The flush should happen after all reasoning chunks, not after the first one. Consider moving the flushBufferedText call to only execute after reasoning-end or tracking when reasoning has fully completed.
```ts
new TransformStream<LanguageModelV2StreamPart, LanguageModelV2StreamPart>({
  transform(chunk, controller) {
    if (isReasoningPart(chunk)) {
      hasReasoning = true
      controller.enqueue(chunk)
      flushBufferedText(controller)
      return
    }

    if (!hasReasoning && isTextPart(chunk)) {
      bufferedText.push(chunk)
      return
    }

    if (!hasReasoning && (chunk.type === 'finish' || chunk.type === 'error')) {
      flushBufferedText(controller)
    }

    controller.enqueue(chunk)
  },
  flush(controller) {
    flushBufferedText(controller)
  }
})
```
Copilot AI · Jan 20, 2026
Superfluous argument passed to function TransformStream.
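One way the flush timing could be adjusted, as the review above suggests, is to release buffered text only once reasoning has fully ended. This is a minimal sketch, not the actual `ollamaReasoningOrderMiddleware` code; it keys off the `reasoning-end` part directly instead of helper predicates like `isReasoningPart`.

```ts
import type { LanguageModelV2StreamPart } from '@ai-sdk/provider'

// Sketch: hold text parts back until reasoning has completely finished, so the
// thinking block is always emitted before the response text.
function createReasoningFirstStream() {
  let reasoningDone = false
  const bufferedText: LanguageModelV2StreamPart[] = []

  const flushBufferedText = (controller: TransformStreamDefaultController<LanguageModelV2StreamPart>) => {
    for (const part of bufferedText) controller.enqueue(part)
    bufferedText.length = 0
  }

  return new TransformStream<LanguageModelV2StreamPart, LanguageModelV2StreamPart>({
    transform(chunk, controller) {
      if (chunk.type === 'reasoning-start' || chunk.type === 'reasoning-delta') {
        controller.enqueue(chunk)
        return
      }
      if (chunk.type === 'reasoning-end') {
        reasoningDone = true
        controller.enqueue(chunk)
        flushBufferedText(controller) // release text only after reasoning has ended
        return
      }
      // While reasoning may still arrive, hold back any text parts.
      if (!reasoningDone && (chunk.type === 'text-start' || chunk.type === 'text-delta' || chunk.type === 'text-end')) {
        bufferedText.push(chunk)
        return
      }
      // If the stream finishes without reasoning, make sure buffered text is not lost.
      if (!reasoningDone && (chunk.type === 'finish' || chunk.type === 'error')) {
        flushBufferedText(controller)
      }
      controller.enqueue(chunk)
    },
    flush(controller) {
      flushBufferedText(controller)
    }
  })
}
```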
Maybe there are still issues that need to be modified further.
I feel the middleware wasn't modified correctly here. The second issue should be fixed by changing the timing of reasoning-end; see vercel/ai#11751 for reference.
I tried making some adjustments; I still need help checking whether there are any omissions in the patch modifications. These two issues do indeed exist.
Can I modify the patch directly?
I'm not sure. First, the patch changes too many places. Second, if I only modify the place below, the thinking block displays only one character; it seems the output is being processed repeatedly.
CI Failed
@kangfenmao The occasional bug caused by race conditions has now passed.
ollama-ai-provider-v2 itself only accepts boolean values for `think`, so the "thinking depth/length" setting passed in is filtered out during validation: parseProviderOptions' zod check discards 'low' | 'medium' | 'high', and the thinking depth never takes effect. The latest version of the provider accepts boolean | 'low' | 'medium' | 'high' and exports OllamaProviderOptions, but this project's dependency tree is too deeply nested to upgrade (3.0.1), so a patch is used to backport the change.
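To make that concrete, here is a rough sketch of the kind of schema widening the patch backports; the schema and type names are illustrative, and the actual hunks in patches/ollama-ai-provider-v2@1.5.5.patch differ.

```ts
import { z } from 'zod'

// Illustrative only: `think` widened from a plain boolean so that
// parseProviderOptions' zod validation no longer discards 'low' | 'medium' | 'high'.
const ollamaProviderOptionsSchema = z.object({
  think: z.union([z.boolean(), z.enum(['low', 'medium', 'high'])]).optional()
  // ...other Ollama-specific options omitted
})

export type OllamaProviderOptions = z.infer<typeof ollamaProviderOptionsSchema>
```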
When ollama-ai-provider-v2 streams a response, some models emit text chunks before reasoning chunks even when reasoning is enabled. ollamaReasoningOrderMiddleware is therefore added and injected whenever Ollama reasoning is enabled; during streaming it buffers text first and, once reasoning chunks appear, places the reasoning in front so the thinking block is displayed before the main text.
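For reference, this is a hedged sketch of how such a middleware is commonly attached via the AI SDK's `wrapLanguageModel`; the project wires it through `AiSdkMiddlewareBuilder` instead, and the import names below (`createOllama`, the middleware export, the base URL) are assumptions, not taken from this PR.

```ts
import { wrapLanguageModel } from 'ai'
import { createOllama } from 'ollama-ai-provider-v2' // assumed export name
// Assumed export from src/renderer/src/aiCore/middleware/ollamaReasoningOrderMiddleware.ts
import { ollamaReasoningOrderMiddleware } from './middleware/ollamaReasoningOrderMiddleware'

const ollama = createOllama({ baseURL: 'http://localhost:11434/api' })

// Only wrap the model with the reordering middleware when reasoning is enabled,
// mirroring the conditional injection described above.
function getOllamaModel(modelId: string, reasoningEnabled: boolean) {
  const model = ollama(modelId)
  return reasoningEnabled ? wrapLanguageModel({ model, middleware: ollamaReasoningOrderMiddleware }) : model
}
```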