
Commit ff85c89

add more param info
1 parent 7e9be09 commit ff85c89

3 files changed: +12 -2 lines changed


articles/ai-services/openai/concepts/model-router.md

Lines changed: 7 additions & 1 deletion
@@ -34,7 +34,13 @@ If you select **Auto-update** at the deployment step (see [Manage models](/azure
## Limitations

- See [Quotas and limits](/azure/ai-services/openai/quotas-limits).
+ See [Quotas and limits](/azure/ai-services/openai/quotas-limits) for rate limit information.
+
+ The context window limit listed on the [Models](../concepts/models.md) page is the limit of the smallest underlying model. Other underlying models are compatible with larger context windows, which means an API call with a larger context succeeds only if the prompt happens to be routed to the right model; otherwise, the call fails. To shorten your prompt, you can do one of the following:
+ - Summarize the prompt before passing it to the model
+ - Truncate the prompt to its most relevant parts
+ - Use document embeddings and have the chat model retrieve relevant sections: see [Azure AI Search](/azure/search/search-what-is-azure-search)

Model router doesn't process input images or audio.
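
The truncation option above is straightforward to sketch in code. The following is a minimal, hypothetical example, not part of this commit, assuming the `tiktoken` package, the `o200k_base` encoding as a rough stand-in for the underlying models' tokenizers, and a budget derived from the 128,000-token window and 32,768 max output tokens listed for `model-router`.

```python
# Hypothetical sketch of the "truncate the prompt" option above.
# Assumptions: tiktoken is installed, o200k_base approximates the underlying
# models' tokenizers, and room is reserved for the completion tokens.
import tiktoken

GUARANTEED_CONTEXT = 128_000   # context window of the smallest underlying model
RESERVED_FOR_OUTPUT = 32_768   # headroom for the model's response

def truncate_prompt(text: str,
                    max_tokens: int = GUARANTEED_CONTEXT - RESERVED_FOR_OUTPUT) -> str:
    """Keep only the first max_tokens tokens so the call succeeds regardless of routing."""
    enc = tiktoken.get_encoding("o200k_base")
    tokens = enc.encode(text)
    if len(tokens) <= max_tokens:
        return text
    return enc.decode(tokens[:max_tokens])
```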

articles/ai-services/openai/concepts/models.md

Lines changed: 3 additions & 1 deletion
@@ -62,7 +62,9 @@ A model that intelligently selects from a set of underlying chat models to respo
| Model ID | Description | Context Window | Max Output Tokens | Training Data (up to) |
| --- | :--- |:--- |:---|:---: |
- | `model-router` (2025-04-15) | A model that intelligently selects from a set of underlying chat models to respond to a given prompt. | 128,000 | 32,768 (GPT-4.1 series)<br> 100,000 (o4-mini) | May 31, 2024 |
+ | `model-router` (2025-04-15) | A model that intelligently selects from a set of underlying chat models to respond to a given prompt. | 128,000* | 32,768 (GPT-4.1 series)<br> 100,000 (o4-mini) | May 31, 2024 |
+
+ *Larger context windows are compatible with _some_ of the underlying models, which means an API call with a larger context succeeds only if the prompt happens to be routed to the right model; otherwise, the call fails.
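
As a complement to the footnote, a caller can verify up front that a prompt stays inside the 128,000-token window that every underlying model supports, rather than relying on the router picking a larger model. This is a hypothetical sketch, not from the docs, assuming `tiktoken` with the `o200k_base` encoding as an approximation of the underlying tokenizers.

```python
# Hypothetical pre-flight check against the guaranteed 128,000-token window.
# Assumes tiktoken; o200k_base only approximates the underlying tokenizers.
import tiktoken

def fits_guaranteed_window(messages: list[dict], limit: int = 128_000) -> bool:
    """Return True when the combined message text stays within the guaranteed window."""
    enc = tiktoken.get_encoding("o200k_base")
    total = sum(len(enc.encode(m.get("content", "") or "")) for m in messages)
    return total <= limit
```

If the check fails, shorten the prompt (summarize, truncate, or retrieve only relevant sections) as described in the model router limitations.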

## computer-use-preview

articles/ai-services/openai/how-to/model-router.md

Lines changed: 2 additions & 0 deletions
@@ -36,6 +36,8 @@ In the [Azure AI Foundry portal](https://ai.azure.com/), you can navigate to you
> [!IMPORTANT]
> You can set the `Temperature` and `Top_P` parameters to the values you prefer (see the [concepts guide](/azure/ai-services/openai/concepts/prompt-engineering?tabs=chat#temperature-and-top_p-parameters)), but note that reasoning models (o-series) don't support these parameters. If model router selects a reasoning model for your prompt, it ignores the `Temperature` and `Top_P` input parameters.
+ >
+ > The parameters `stop`, `presence_penalty`, `frequency_penalty`, `logit_bias`, and `logprobs` are similarly dropped when an o-series model is selected but are used otherwise.

> [!IMPORTANT]
> The `reasoning_effort` parameter (see the [Reasoning models guide](/azure/ai-services/openai/how-to/reasoning?tabs=python-secure#reasoning-effort)) isn't supported in model router. If the model router selects a reasoning model for your prompt, it also selects a `reasoning_effort` input value based on the complexity of the prompt.
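
These two notes are easier to see in a request. The following is a hedged sketch using the OpenAI Python SDK against an Azure OpenAI resource; the deployment name `model-router`, the environment variable names, and the API version are placeholders rather than values from this commit.

```python
# Hypothetical request to a model router deployment. The endpoint, key,
# API version, and deployment name below are placeholders.
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-10-21",
)

response = client.chat.completions.create(
    model="model-router",              # deployment name (assumed)
    messages=[{"role": "user", "content": "Summarize this meeting transcript in five bullets."}],
    temperature=0.7,                   # ignored if an o-series model is selected
    top_p=0.95,                        # ignored if an o-series model is selected
    presence_penalty=0.2,              # dropped for o-series models, used otherwise
    # reasoning_effort is intentionally omitted: model router chooses it for reasoning models
)
print(response.choices[0].message.content)
```

Because routing happens per request, the same call can land on a GPT-4.1-series model (where the sampling parameters apply) or an o-series model (where they're dropped) without any code change.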
