Replies: 1 comment
You can try adding something like "Provide a direct answer without explaining your process" or "Prioritize speed over depth" to the prompt. Based on recent research, I would put the instruction first and then repeat it at the end. Of course, setting this through the API would be better.
Thank you for the v1.6.0 update; I'm finding the addition of structured output a great step forward.
I'm using LLM Vision for driveway motion analysis. Ahead of the Gemini 2.5 Flash-Lite shutdown, I have tried Gemini 3 Flash but am seeing higher latency: responses take several seconds longer than with 2.5 Flash-Lite. I believe one driver is the model defaulting to dynamic/high reasoning, which is overkill for simple person/vehicle identification.
The Request:
Would it be possible to add support for the thinking_level parameter in the service call configuration?
According to the Google API docs, Gemini 3 Flash supports a MINIMAL level which is designed specifically to minimise latency for tasks like this. Adding a toggle in the UI (or simply allowing it as a parameter in the YAML) would let us turn off unnecessary reasoning and get back to the sub-2-second response times seen with the Lite models.
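To illustrate what the integration would need to send, here is a minimal sketch of the request body for Google's generateContent endpoint with a thinking level set. The `thinkingConfig`/`thinkingLevel` field names follow Google's Gemini 3 REST docs as I understand them, but treat them (and the helper function itself) as assumptions, not LLM Vision's actual implementation:

```python
import json

# Hypothetical helper: builds a generateContent request body with a
# thinkingLevel set. Field names are assumptions based on Google's
# Gemini 3 REST documentation.
def build_request(prompt: str, thinking_level: str = "minimal") -> str:
    body = {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            # The knob this feature request is about: cap the model's
            # internal reasoning to reduce latency.
            "thinkingConfig": {"thinkingLevel": thinking_level},
        },
    }
    return json.dumps(body)

print(build_request("Is a person or vehicle present in the driveway?"))
```

The point is that exposing the parameter should be a small change: it is one extra field in the generation config, passed through from the service call.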
Testing the models in AI Studio with my driveway motion prompts, Gemini 3 Flash is significantly faster when thinking is turned off or reduced.
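For concreteness, a hypothetical YAML service call showing how this could look. The `thinking_level` option does not exist in LLM Vision today, and the surrounding field names and model id are assumptions modelled on a typical `llmvision.image_analyzer` call:

```yaml
# Hypothetical: thinking_level is the requested new parameter, and the
# other fields are assumptions for illustration only.
action: llmvision.image_analyzer
data:
  provider: Google
  model: gemini-3-flash-preview   # model id is an assumption
  message: "Is a person or vehicle present in the driveway? Answer briefly."
  image_entity:
    - camera.driveway
  thinking_level: minimal         # the requested new parameter
```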