
Commit c17d6ea ("draft development")
1 parent: 2997820

File tree: 2 files changed (+21, -2 lines)


articles/ai-services/openai/concepts/model-router.md

Lines changed: 2 additions & 0 deletions
```diff
@@ -37,6 +37,8 @@ If you select **Auto-update** at the deployment step (see [Manage models](/azure
 
 ## Limitations
 
+See quotas and limits
+
 Model router doesn't process input images or audio.
 
 Global Standard region support.
```
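The limitation above (no image or audio input) could be screened for on the client side before a request reaches a model router deployment. A minimal sketch, assuming messages follow the Chat Completions content-parts schema; the helper name is hypothetical and not part of the article:

```python
# Hypothetical client-side guard. Assumption: multimodal messages use the
# Chat Completions content-parts schema, where "content" is either a string
# or a list of typed parts ("text", "image_url", "input_audio").
def has_unsupported_input(messages):
    """Return True if any message carries image or audio parts,
    which model router doesn't process."""
    for msg in messages:
        content = msg.get("content")
        if isinstance(content, list):  # multimodal content parts
            for part in content:
                if part.get("type") in ("image_url", "input_audio"):
                    return True
    return False

text_only = [{"role": "user", "content": "Hello"}]
with_image = [{"role": "user", "content": [
    {"type": "text", "text": "Describe this."},
    {"type": "image_url", "image_url": {"url": "https://example.com/x.png"}},
]}]
# has_unsupported_input(text_only)  -> False
# has_unsupported_input(with_image) -> True
```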

articles/ai-services/openai/how-to/model-router.md

Lines changed: 19 additions & 2 deletions
```diff
@@ -22,15 +22,32 @@ You can access model router through the Completions API just as you would use a
 Model router is packaged as a single OpenAI model that you deploy. Follow the steps in the [resource deployment guide](/azure/ai-services/openai/how-to/create-resource), and in the **Create new deployment** step, find `Azure OpenAI model router` in the **Model** list. Select it, and then complete the rest of the deployment steps.
 
 > [!NOTE]
-> You don't need to deploy the underlying models separately. Model router works independently of your other deployed models.
+> Consider that your deployment settings apply to all of the underlying chat models that model router uses.
+> - You don't need to deploy the underlying chat models separately. Model router works independently of your other deployed models.
+> - You select a content filter when you deploy the model router model (or you can apply a filter later). The content filter is applied to all activity to and from the model router: you don't set content filters for each of the underlying chat models.
+> - Your tokens-per-minute rate limit setting is applied to all activity to and from the model router: you don't set rate limits for each of the underlying chat models.
 
 ## Use model router in chat
 
 ### REST API
 
+> [!IMPORTANT]
+> Set temperature and top_p to the values you prefer, but note that reasoning models (o-series) don't support these parameters. If model router selects a reasoning model for your prompt, it drops the temperature and top_p input parameters. TBD
 
-### Portal
+> [!IMPORTANT]
+> The reasoning parameter isn't supported in model router. If model router selects a reasoning model for your prompt, it also selects a value for the reasoning parameter based on TBD
+
+### Portal
+
+In the playground, the conversation displays the underlying model version used for each response. TBD
 
 ## Evaluate model router performance
 
+
+You can create a custom metric and submit a job to compare model router to other models. Then, in the Azure AI Foundry portal, you can compare their performance.
+
+We provide custom metric tests via notebooks.
+
+In Azure Monitor, you can monitor all the standard metrics.
+The cost analysis page (available only in the Azure portal) shows the costs of each underlying model. There are no extra charges for model router beyond the charges for the underlying models.
+
 ## Troubleshooting
```
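The REST API notes in the diff above can be sketched as a simple request builder. This is a hedged example: the deployment name and api-version are placeholder assumptions, the helper is hypothetical, and (per the draft note) it's the service, not the client, that drops temperature and top_p when a reasoning model is selected:

```python
import json

# Hypothetical helper: builds the URL path and JSON body for a Chat Completions
# call to a model router deployment. The deployment name and api-version below
# are placeholder assumptions, not values from the article.
def build_chat_request(deployment, user_prompt, temperature=0.7, top_p=0.95):
    path = (f"/openai/deployments/{deployment}/chat/completions"
            "?api-version=2024-10-21")
    body = {
        "messages": [{"role": "user", "content": user_prompt}],
        # Passed normally; the service drops these parameters if model
        # router selects a reasoning (o-series) model for the prompt.
        "temperature": temperature,
        "top_p": top_p,
    }
    return path, json.dumps(body)

path, body = build_chat_request("my-model-router", "Summarize this ticket.")
# POST https://<resource>.openai.azure.com{path} with an api-key header.
# The response's "model" field reports which underlying model answered.
```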
