Replies: 1 comment
-
Hi mate, I found 2 ways to handle this.

1. Set the extract_reasoning param to True when creating an instance of ChatOllama, like this:
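A minimal sketch of what that looks like, assuming a langchain-ollama version that exposes the extract_reasoning flag on ChatOllama (newer releases rename it to reasoning); the model name is just an example of a thinking-capable model:

```python
# Sketch of option 1, assuming a langchain-ollama version that exposes
# the extract_reasoning flag on ChatOllama (newer releases rename it to
# `reasoning`). The model name is only an example.
from langchain_ollama import ChatOllama

llm = ChatOllama(
    model="qwen3:8b",
    extract_reasoning=True,  # split the <think>...</think> block out of the answer
)

response = llm.invoke("What is 17 * 24?")
print(response.content)  # the final answer, without the thinking block

# The extracted thinking text is typically surfaced in additional_kwargs
# (field name may vary by version), e.g.:
print(response.additional_kwargs.get("reasoning_content"))
```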
This will make the response from llm.invoke() keep the real answer and the thinking block as separate parts.
2. Set the think param to False when invoking the LLM, like this:
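A sketch of option 2 as described here; whether an invoke-time think keyword is forwarded to the Ollama API depends on your langchain-ollama version, so treat it as the approach being described rather than a guaranteed interface. In recent releases the constructor flag reasoning=False achieves the same thing for every call.

```python
# Sketch of option 2. Passing think=False at invoke time is the approach
# described above; whether the keyword is forwarded depends on your
# langchain-ollama version, so treat it as an assumption.
from langchain_ollama import ChatOllama

llm = ChatOllama(model="qwen3:8b")  # example model name

# Disable thinking for this call only.
response = llm.invoke("What is 17 * 24?", think=False)
print(response.content)

# Alternative in recent langchain-ollama releases: disable it for every call.
# llm = ChatOllama(model="qwen3:8b", reasoning=False)
```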
This will disable think mode on the model, so the response comes back faster and uses fewer tokens.
-
After version 0.9, Ollama supports turning off think mode for large language models. You can disable it by entering /set nothink in the interactive CLI or by passing the parameter think=False.
I'm not sure how to set this parameter (think=False) when using Ollama within a LangGraph Agent. Does anyone know how to configure this? Please help. Thanks.
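For context, the plain-Ollama call the question refers to looks roughly like this (a sketch assuming a recent ollama Python client whose chat() accepts a think argument; the model name is just an example):

```python
# Sketch of disabling thinking with the plain ollama Python client,
# as mentioned in the question. Assumes a recent ollama package whose
# chat() accepts a think argument.
import ollama

response = ollama.chat(
    model="qwen3:8b",
    messages=[{"role": "user", "content": "What is 17 * 24?"}],
    think=False,  # turn off the model's thinking block for this request
)
print(response.message.content)
```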