docs: update reasoning effort documentation in model configuration

taras-yemets · taras-yemets · commit 7a4ce13d46b2 · 2025-04-29T23:10:35.000+03:00
diff --git a/docs/cody/enterprise/model-configuration.mdx b/docs/cody/enterprise/model-configuration.mdx
@@ -226,6 +226,9 @@ This field is an array of items, each with the following fields:
     -   `maxInputTokens`: Specifies the maximum number of tokens for the contextual data in the prompt (e.g., question, relevant snippets)
     -   `maxOutputTokens`: Specifies the maximum number of tokens allowed in the response
 -   `reasoningEffort`: Specifies the effort on reasoning for reasoning models (having `reasoning` capability). Supported values: `high`, `medium`, `low`.
+How this value is interpreted depends on the provider.
+For example, for Anthropic models that support thinking, `low` effort uses the minimum [`thinking.budget_tokens`](https://docs.anthropic.com/en/api/messages#body-thinking) value (1024). For `medium` and `high`, the budget is set to `contextWindow.maxOutputTokens / 2`.
+For OpenAI reasoning models, the `reasoningEffort` field value corresponds to the [`reasoning_effort`](https://platform.openai.com/docs/api-reference/chat/create#chat-create-reasoning_effort) request body value.
 -   `serverSideConfig`: Additional configuration for the model. It can be one of the following:
 
     -   `awsBedrockProvisionedThroughput`: Specifies provisioned throughput settings for AWS Bedrock models with the following fields:
@@ -327,7 +330,7 @@ In this modelOverrides config example:
 - The model is configured to use the `"chat"` and `"reasoning"` capabilities
 - The `reasoningEffort` can be set to 3 different options in the Model Config. These options are `high`, `medium` and `low`
 - The default `reasoningEffort` is set to `low`
-- When the reasoning effort is `low`, 1024 tokens is used as the thinking budget. With `medium` and `high` the thinking budget is set via `max_tokens_to_sample/2`
+- For Anthropic models that support thinking, a reasoning effort of `low` uses 1024 tokens as the thinking budget. With `medium` and `high`, the thinking budget is set to half of the `contextWindow.maxOutputTokens` value
 
 Refer to the [examples page](/cody/enterprise/model-config-examples) for additional examples.
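
Reviewer note: the Anthropic rule described in this change can be sketched as follows. This is a minimal illustration, not the actual Sourcegraph implementation; the names `ReasoningEffort`, `MIN_THINKING_BUDGET`, and `thinkingBudgetTokens` are hypothetical, and rounding down odd values is an assumption the docs do not specify.

```typescript
// Hypothetical sketch of the documented Anthropic thinking-budget rule.
type ReasoningEffort = 'low' | 'medium' | 'high'

// Minimum thinking.budget_tokens value cited in the docs above.
const MIN_THINKING_BUDGET = 1024

// Returns the thinking budget for a given effort and contextWindow.maxOutputTokens.
function thinkingBudgetTokens(effort: ReasoningEffort, maxOutputTokens: number): number {
    if (effort === 'low') {
        return MIN_THINKING_BUDGET
    }
    // `medium` and `high` are treated identically: half of maxOutputTokens.
    // Flooring is an assumption made here to keep the result an integer.
    return Math.floor(maxOutputTokens / 2)
}
```

With `maxOutputTokens` set to 4000, both `medium` and `high` yield a budget of 2000 under this sketch, while `low` always yields 1024.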