integrations/llms/gemini.mdx
Lines changed: 47 additions & 4 deletions
Original file line number
Diff line number
Diff line change
@@ -238,8 +238,19 @@ Grounding is invoked by passing the `google_search` tool (for newer models like
238
238
If you mix regular tools with grounding tools, Vertex AI might return an error saying only one tool can be used at a time.
239
239
</Warning>
240
240
241
-
## thinking models
242
241
242
+
## Extended Thinking (Reasoning Models) (Beta)
243
+
244
+
<Note>
245
+
The assistant's thinking response is returned in the `response_chunk.choices[0].delta.content_blocks` array, not the `response.choices[0].message.content` string.
246
+
</Note>
247
+
248
+
Models like `gemini-2.5-flash-preview-04-17` support [extended thinking](https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/use-claude#claude-3-7-sonnet).
249
+
This is similar to OpenAI thinking, but you also get the model's reasoning as it processes the request.
250
+
251
+
Note that you will have to set [`strict_open_ai_compliance=False`](/product/ai-gateway/strict-open-ai-compliance) in the headers to use this feature.
252
+
253
+
### Single turn conversation
243
254
<CodeGroup>
244
255
```py Python
245
256
from portkey_ai import Portkey
@@ -273,6 +284,16 @@ If you mix regular tools with grounding tools, vertex might throw an error sayin
273
284
]
274
285
)
275
286
print(response)
287
+
# For streaming responses, parse the response_chunk.choices[0].delta.content_blocks array instead.
The assistant's thinking response is returned in the `response_chunk.choices[0].delta.content_blocks` array, not the `response.choices[0].message.content` string.
267
267
268
-
Gemini models do no return their chain-of-thought-messages, so content_blocks are not required for Gemini models.
268
+
Gemini models do not support feeding the reasoning back into multi-turn conversations, so you don't need to send the thinking message back to the model.
269
269
</Note>
270
270
271
271
Models like `google.gemini-2.5-flash-preview-04-17` and `anthropic.claude-3-7-sonnet@20250219` support [extended thinking](https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/use-claude#claude-3-7-sonnet).
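The note added in this diff says the thinking output arrives in `response_chunk.choices[0].delta.content_blocks` rather than in `message.content`. A minimal sketch of how a client might separate the two while streaming is shown below. It works on plain dictionaries so it runs without any SDK. The inner block shape (`{"delta": {"thinking": ...}}` and `{"delta": {"text": ...}}`) and the helper name `collect_thinking` are assumptions made for illustration, not something this diff confirms, so check them against a real response before relying on them.

```python
from __future__ import annotations

from typing import Any, Iterable


def collect_thinking(chunks: Iterable[dict[str, Any]]) -> tuple[str, str]:
    """Split streamed chunks into (thinking_text, answer_text).

    Hypothetical helper: assumes each chunk looks like
    {"choices": [{"delta": {"content_blocks": [{"delta": {...}}]}}]}.
    """
    thinking_parts: list[str] = []
    text_parts: list[str] = []
    for chunk in chunks:
        choices = chunk.get("choices") or []
        if not choices:
            continue  # some chunks may carry no choices at all
        delta = choices[0].get("delta") or {}
        for block in delta.get("content_blocks") or []:
            inner = block.get("delta") or {}
            if "thinking" in inner:
                thinking_parts.append(inner["thinking"])
            elif "text" in inner:
                text_parts.append(inner["text"])
    return "".join(thinking_parts), "".join(text_parts)


# Hand-written sample chunks in the assumed shape (not real API output).
sample_chunks = [
    {"choices": [{"delta": {"content_blocks": [{"delta": {"thinking": "Let me add. "}}]}}]},
    {"choices": [{"delta": {"content_blocks": [{"delta": {"thinking": "2+2=4."}}]}}]},
    {"choices": [{"delta": {"content_blocks": [{"delta": {"text": "4"}}]}}]},
    {"choices": []},
]

thinking, answer = collect_thinking(sample_chunks)
print("thinking:", thinking)
print("answer:", answer)
```

With an SDK that returns objects instead of dictionaries, the same loop would apply after converting each chunk to a dictionary or swapping the `.get` calls for attribute access.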