**articles/ai-services/openai/concepts/model-router.md** (10 additions, 8 deletions)
```diff
@@ -1,5 +1,5 @@
 ---
-title: Azure OpenAI model router concepts
+title: Azure OpenAI model router (preview) concepts
 titleSuffix: Azure OpenAI
 description: Learn about the model router feature in Azure OpenAI Service.
 author: PatrickFarley
@@ -11,16 +11,13 @@ ms.custom:
 manager: nitinme
 ---
 
-# Azure OpenAI model router
+# Azure OpenAI model router (preview)
 
-Azure OpenAI model router is a deployable AI chat model that automatically selects the best underlying chat model to respond to a given prompt. It uses a combination of preexisting models to provide high performance while saving on compute costs where possible.
+Azure OpenAI model router is a deployable AI chat model that is trained to select the best large language model (LLM) to respond to a given prompt in real time. By evaluating factors like query complexity, cost, and performance, it intelligently routes requests to the most suitable model.
 
 ## Why use model router?
 
-Model router intelligently selects the best underlying model for a given prompt. This way, smaller and cheaper models are used when they're sufficient for the task, but larger and more expensive models are available for more complex tasks. Also, reasoning models are available for tasks that require complex reasoning, but non-reasoning models are used otherwise. Model router provides a single chat experience that combines the best features from all the underlying chat models.
-
-Model router is useful for a variety of applications. TBD
-
+Model router intelligently selects the best underlying model for a given prompt to optimize costs while maintaining quality. Smaller and cheaper models are used when they're sufficient for the task, but larger and more expensive models are available for more complex tasks. Also, reasoning models are available for tasks that require complex reasoning, and non-reasoning models are used otherwise. Model router provides a single chat experience that combines the best features from all of the underlying chat models.
 
 ## Versioning
 
```
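The routing behavior this hunk describes (cheap models for simple prompts, larger or reasoning models for complex ones) can be illustrated with a toy heuristic. This is only a sketch: the real model router is a trained LLM, not a rule list, and the model names below are hypothetical placeholders.

```python
# Toy illustration only: model router is a trained LLM, not a keyword filter.
# All model names here are hypothetical placeholders.
def toy_route(prompt: str) -> str:
    """Pick a (hypothetical) underlying model for a prompt."""
    # Prompts that look like they need multi-step reasoning go to a
    # reasoning model.
    needs_reasoning = any(
        k in prompt.lower() for k in ("prove", "step by step", "derive")
    )
    if needs_reasoning:
        return "o-series-reasoning-model"
    # Longer, more complex prompts go to a larger chat model.
    if len(prompt.split()) > 100:
        return "large-chat-model"
    # Everything else is handled by a smaller, cheaper model.
    return "small-cheap-model"

print(toy_route("What's the capital of France?"))      # → small-cheap-model
print(toy_route("Prove that sqrt(2) is irrational."))  # → o-series-reasoning-model
```

The point of the sketch is only the cost/quality trade-off: the cheapest model that can handle the prompt is preferred.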
```diff
@@ -47,4 +44,9 @@ Global Standard region support.
 
 When you use Azure OpenAI model router, you are only billed for the use of the underlying models as they're recruited to respond to prompts. The model router itself doesn't incur any extra charges.
 
-You can monitor the overall costs of your model router deployment in the Azure portal. TBD
+You can monitor the overall costs of your model router deployment in the Azure portal. TBD
+
+## Next step
+
+> [!div class="nextstepaction"]
+> [How to use model router](../how-to/model-router.md)
```
**articles/ai-services/openai/how-to/model-router.md** (31 additions, 11 deletions)
```diff
@@ -1,5 +1,5 @@
 ---
-title: How to use model router in Azure OpenAI Service
+title: How to use model router (preview) in Azure OpenAI Service
 titleSuffix: Azure OpenAI Service
 description: Learn how to use the model router in Azure OpenAI Service to select the best model for your task.
 author: PatrickFarley
@@ -11,9 +11,9 @@ ms.date: 04/17/2025
 manager: nitinme
 ---
 
-# Use model router
+# Use Azure OpenAI model router (preview)
 
-Azure OpenAI model router is a deployable AI chat model that automatically selects the best underlying chat model to respond to a given prompt. It uses a combination of preexisting models to provide high performance while saving on compute costs where possible. For more information on how model router works and its advantages and limitations, see the [Model router concepts guide](../concepts/model-router.md).
+Azure OpenAI model router is a deployable AI chat model that is trained to select the best large language model (LLM) to respond to a given prompt in real time. It uses a combination of preexisting models to provide high performance while saving on compute costs where possible. For more information on how model router works and its advantages and limitations, see the [Model router concepts guide](../concepts/model-router.md).
 
 You can access model router through the Completions API just as you would use a single base model like GPT-4.
 
```
```diff
@@ -27,19 +27,19 @@ Model router is packaged as a single OpenAI model that you deploy. Follow the st
 > - You select a content filter when you deploy the model router model (or you can apply a filter later). The content filter is applied to all activity to and from the model router: you don't set content filters for each of the underlying chat models.
 > - Your tokens-per-minute rate limit setting is applied to all activity to and from the model router: you don't set rate limits for each of the underlying chat models.
 
-## Use model router in chat
+## Use model router in chats
 
-### REST API
+You can use model router through the [chat completions API](/azure/ai-services/openai/chatgpt-quickstart) in the same way you'd use other OpenAI chat models. Set the `model` parameter to the name of your model router deployment, and set the `messages` parameter to the messages you want to send to the model.
+
+In the [Azure AI Foundry portal](https://ai.azure.com/), you can navigate to your model router deployment on the **Models + endpoints** page and select it to enter the model playground. In the playground experience, you can enter messages and see the model's responses. Each response message will show which underlying model was selected to respond.
 
-> [!IMPORTANT]
-> Set temperature and top_p to the values you prefer, but note that reasoning models (o-series) don't support these parameters. If model router selects a reasoning model for your prompt, it will drop the temperature and top_p input parameters. TBD
 
 > [!IMPORTANT]
-> The reasoning parameter is not supported in model router. If model router selects a reasoning model for your prompt, it will also select a value for the reasoning parameter based on TBD
+> You can set the `Temperature` and `Top_P` parameters to the values you prefer (see the [concepts guide](/azure/ai-services/openai/concepts/prompt-engineering?tabs=chat#temperature-and-top_p-parameters)), but note that reasoning models (o-series) don't support these parameters. If model router selects a reasoning model for your prompt, it will ignore the `Temperature` and `Top_P` input parameters.
 
-### Portal
+> [!IMPORTANT]
+> The `reasoning_effort` parameter (see the [Reasoning models guide](/azure/ai-services/openai/how-to/reasoning?tabs=python-secure#reasoning-effort)) is not supported in model router. If the model router selects a reasoning model for your prompt, it will also select a `reasoning_effort` input value based on the complexity of the prompt.
 
-Playground . in the conversation it displays the underlying model version of each response TBD
 
 ## Evaluate model router performance
 
```
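The chat completions call described in the hunk above can be sketched with the plain Azure OpenAI REST endpoint. This is a hedged, stdlib-only sketch: the endpoint, API key, API version, and deployment name (`model-router`) are placeholder assumptions you'd replace with your own values, and the request is only sent when credentials are actually configured.

```python
import json
import os
import urllib.request

# Placeholder assumptions: replace with your resource endpoint, key, and
# the deployment name you gave your model router deployment.
endpoint = os.environ.get("AZURE_OPENAI_ENDPOINT", "https://YOUR-RESOURCE.openai.azure.com")
deployment = "model-router"  # assumed deployment name
api_version = "2024-10-21"   # any chat-completions-capable API version

# Azure OpenAI routes chat completions by deployment name in the URL path.
url = f"{endpoint}/openai/deployments/{deployment}/chat/completions?api-version={api_version}"

payload = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the plot of Hamlet in two sentences."},
    ]
}

# Only send the request when a real key is configured.
api_key = os.environ.get("AZURE_OPENAI_API_KEY")
if api_key:
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # The response's "model" field reports which underlying model answered.
    print(body["model"])
    print(body["choices"][0]["message"]["content"])
```

Note that, per the hunk above, you address the router deployment itself; you never pick the underlying model in the request.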
```diff
@@ -50,4 +50,24 @@ We provide custom metric test via notebooks.
 in azmon, you can monitor all the standard metrics
 cost analysis page (only in the azure portal). it will show the costs of each underlying model. (no extra charges for model router, above the underlying charges)
 
-## Troubleshooting
+See the [Evaluations guide](/azure/ai-services/openai/how-to/evaluations?tabs=question-eval-input).
+
+## Monitor model router metrics
+
+### Monitor performance
+
+You can monitor the performance of your model router deployment in Azure monitor (AzMon) in the Azure portal.
+
+To view AzMon metrics for router:
+1. Filter by deployment name of model router.
+1. Optionally, split up the metrics by underlying models.
+
+The following metrics are available:
+
+### Monitor costs
+
+You can monitor the costs of model router, which is the sum of the costs incurred by the underlying models.
+1. Visit the **Cost analysis** page in the Azure portal.
+1. If needed, filter by Azure resource.
+1. Then, filter by deployment name: Filter by billing "Tag", select **Deployment** as the name of the tag, and then select your model router deployment name as the value.
```
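The per-underlying-model split that the monitoring hunk describes can also be approximated client-side, since each response reports which model answered. A toy, stdlib-only illustration (not an Azure API; the model names and token counts are made up):

```python
from collections import Counter

# Toy data standing in for a batch of model router responses: each reply
# carries the underlying "model" that answered and its token usage.
responses = [
    {"model": "gpt-4o-mini", "usage": {"total_tokens": 120}},
    {"model": "gpt-4o", "usage": {"total_tokens": 950}},
    {"model": "gpt-4o-mini", "usage": {"total_tokens": 80}},
]

# Count how many calls were routed to each underlying model...
calls = Counter(r["model"] for r in responses)

# ...and total the token usage per underlying model.
tokens = Counter()
for r in responses:
    tokens[r["model"]] += r["usage"]["total_tokens"]

print(calls)   # calls routed to each underlying model
print(tokens)  # token usage per underlying model
```

A tally like this mirrors what the Azure portal's metric split and cost analysis show: the router's total cost is just the sum over the underlying models.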
0 commit comments