[REQUIRES UPDATE] Agent responses using Ollama (qwen3-30B-A3B) are entirely rendered in "thoughts" block #7201
Replies: 5 comments 15 replies
-
Hello, for me this is how it is shown in reasoning models.
-
Had a similar issue using qwen3 with LM Studio and was able to fix it by turning OFF “Reasoning Section Parsing” on the LM Studio side, letting LC decide when to apply those tags.
-
So the root of this issue is that Ollama (and maybe other local LLM options) streams the entire response in a single chunk for the streaming /chat/completions generation. The stream handling is not designed to parse reasoning tags out of a single chunk like that. I'm testing a fix and will merge today.
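To illustrate the failure mode the maintainer describes: when the whole response arrives as one chunk, the reasoning tags have to be split out of that single string rather than detected across incremental deltas. The sketch below is a hypothetical, minimal version of that splitting step, assuming qwen3's `<think>...</think>` tag format; it is not LibreChat's actual implementation, and the function name is made up for illustration.

```typescript
// Hypothetical sketch: split one streamed chunk that may contain a complete
// <think>...</think> block into reasoning text and visible content.
// Assumptions: qwen3-style <think> tags, and (per the discussion) the entire
// response arriving in a single chunk.
function splitReasoning(chunk: string): { reasoning: string; content: string } {
  const open = chunk.indexOf("<think>");
  const close = chunk.indexOf("</think>");
  // No complete think block found: treat the whole chunk as visible content.
  if (open === -1 || close === -1 || close < open) {
    return { reasoning: "", content: chunk };
  }
  return {
    reasoning: chunk.slice(open + "<think>".length, close).trim(),
    content: chunk.slice(close + "</think>".length).trim(),
  };
}
```

A per-delta handler that only toggles state when it sees an opening or closing tag at a chunk boundary would never fire here, which matches the reported symptom: everything (reasoning and answer) ends up in the "thoughts" block.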
-
Closed by #7275
-
What happened?
When using the Ollama provider with the qwen3_30ba3b_40k:latest model in an agent (think mode enabled), the entire model response is incorrectly rendered inside the "thoughts" block instead of only internal reasoning.
This doesn't happen when using the same model outside of agent mode in a regular chat.
Version Information
LibreChat v0.7.7
Model: qwen3_30ba3b
Provider: Ollama
Running as: Agent with think mode enabled
Steps to Reproduce
What browsers are you seeing the problem on?
Chrome
Relevant log output
No backend error logs observed; the issue appears to be related to how the frontend parses agent + think responses.
Screenshots
Code of Conduct