docs/cody/enterprise/model-configuration.mdx
Lines changed: 4 additions & 1 deletion
@@ -226,6 +226,9 @@ This field is an array of items, each with the following fields:
  - `maxInputTokens`: Specifies the maximum number of tokens for the contextual data in the prompt (e.g., question, relevant snippets)
  - `maxOutputTokens`: Specifies the maximum number of tokens allowed in the response
  - `reasoningEffort`: Specifies the effort on reasoning for reasoning models (having `reasoning` capability). Supported values: `high`, `medium`, `low`.
+ How this value is treated depends on the specific provider.
+ For example, for Anthropic models supporting thinking, `low` effort means that the minimum [`thinking.budget_tokens`](https://docs.anthropic.com/en/api/messages#body-thinking) value (1024) will be used. For other `reasoningEffort` values, the `contextWindow.maxOutputTokens / 2` value will be used.
+ For OpenAI reasoning models, the `reasoningEffort` field value corresponds to the [`reasoning_effort`](https://platform.openai.com/docs/api-reference/chat/create#chat-create-reasoning_effort) request body value.
  - `serverSideConfig`: Additional configuration for the model. It can be one of the following:
  - `awsBedrockProvisionedThroughput`: Specifies provisioned throughput settings for AWS Bedrock models with the following fields:
@@ -327,7 +330,7 @@ In this modelOverrides config example:
  - The model is configured to use the `"chat"` and `"reasoning"` capabilities
  - The `reasoningEffort` can be set to 3 different options in the Model Config. These options are `high`, `medium` and `low`
  - The default `reasoningEffort` is set to `low`
- - When the reasoning effort is `low`, 1024 tokens is used as the thinking budget. With `medium` and `high` the thinking budget is set via `max_tokens_to_sample/2`
+ - For Anthropic models supporting thinking, when the reasoning effort is `low`, 1024 tokens is used as the thinking budget. With `medium` and `high` the thinking budget is set to half of the maxOutputTokens value
  Refer to the [examples page](/cody/enterprise/model-config-examples) for additional examples.
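
For readers checking the numbers, the mapping described in this change can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not Sourcegraph's implementation: the helper name `reasoning_request_params`, the provider strings, and the use of integer division for "half of `maxOutputTokens`" are all assumptions. The `thinking` and `reasoning_effort` request fields are the ones linked from the added documentation lines.

```python
from typing import Any, Dict

# Minimum thinking.budget_tokens value cited in the documentation change.
ANTHROPIC_MIN_THINKING_BUDGET = 1024

VALID_EFFORTS = ("low", "medium", "high")


def reasoning_request_params(
    provider: str, reasoning_effort: str, max_output_tokens: int
) -> Dict[str, Any]:
    """Hypothetical helper: translate reasoningEffort into provider request fields."""
    if reasoning_effort not in VALID_EFFORTS:
        raise ValueError(f"unsupported reasoningEffort: {reasoning_effort!r}")

    if provider == "anthropic":
        if reasoning_effort == "low":
            # `low` uses the minimum thinking budget.
            budget = ANTHROPIC_MIN_THINKING_BUDGET
        else:
            # `medium` and `high` use half of maxOutputTokens.
            # Assumption: rounded down with integer division.
            budget = max_output_tokens // 2
        return {"thinking": {"type": "enabled", "budget_tokens": budget}}

    if provider == "openai":
        # The value is passed through as the reasoning_effort request field.
        return {"reasoning_effort": reasoning_effort}

    raise ValueError(f"provider not covered by this sketch: {provider!r}")
```

Under these assumptions, a model with `maxOutputTokens` of 4000 would get a thinking budget of 1024 at `low` effort and 2000 at `medium` or `high`; other providers may treat the value differently, as the change notes.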