
Commit c17d6ea ("draft development")
1 parent: 2997820

File tree: 2 files changed (+21, -2 lines)


articles/ai-services/openai/concepts/model-router.md

Lines changed: 2 additions & 0 deletions
```diff
@@ -37,6 +37,8 @@ If you select **Auto-update** at the deployment step (see [Manage models](/azure
 
 ## Limitations
 
+See quotas and limits
+
 Model router doesn't process input images or audio.
 
 Global Standard region support.
```
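The limitation above (no image or audio input) could be screened for on the client side before a request reaches a model router deployment. A minimal sketch, assuming messages follow the Chat Completions content-parts schema; the helper name is hypothetical and not part of the article:

```python
# Hypothetical client-side guard. Assumption: multimodal messages use the
# Chat Completions content-parts schema, where "content" is either a string
# or a list of typed parts ("text", "image_url", "input_audio").
def has_unsupported_input(messages):
    """Return True if any message carries image or audio parts,
    which model router doesn't process."""
    for msg in messages:
        content = msg.get("content")
        if isinstance(content, list):  # multimodal content parts
            for part in content:
                if part.get("type") in ("image_url", "input_audio"):
                    return True
    return False

text_only = [{"role": "user", "content": "Hello"}]
with_image = [{"role": "user", "content": [
    {"type": "text", "text": "Describe this."},
    {"type": "image_url", "image_url": {"url": "https://example.com/x.png"}},
]}]
# has_unsupported_input(text_only)  -> False
# has_unsupported_input(with_image) -> True
```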

articles/ai-services/openai/how-to/model-router.md

Lines changed: 19 additions & 2 deletions
```diff
@@ -22,15 +22,32 @@ You can access model router through the Completions API just as you would use a
 Model router is packaged as a single OpenAI model that you deploy. Follow the steps in the [resource deployment guide](/azure/ai-services/openai/how-to/create-resource), and in the **Create new deployment** step, find `Azure OpenAI model router` in the **Model** list. Select it, and then complete the rest of the deployment steps.
 
 > [!NOTE]
-> You don't need to deploy the underlying models separately. Model router works independently of your other deployed models.
+> Consider that your deployment settings apply to all of the underlying chat models that model router uses.
+> - You don't need to deploy the underlying chat models separately. Model router works independently of your other deployed models.
+> - You select a content filter when you deploy the model router model (or you can apply a filter later). The content filter is applied to all activity to and from the model router: you don't set content filters for each of the underlying chat models.
+> - Your tokens-per-minute rate limit setting is applied to all activity to and from the model router: you don't set rate limits for each of the underlying chat models.
 
 ## Use model router in chat
 
 ### REST API
 
+> [!IMPORTANT]
+> Set temperature and top_p to the values you prefer, but note that reasoning models (o-series) don't support these parameters. If model router selects a reasoning model for your prompt, it drops the temperature and top_p input parameters. TBD
 
-### Portal
+> [!IMPORTANT]
+> The reasoning parameter isn't supported in model router. If model router selects a reasoning model for your prompt, it also selects a value for the reasoning parameter based on TBD
+
+### Portal
+
+In the playground, the conversation displays the underlying model version used for each response. TBD
 
 ## Evaluate model router performance
 
+
+You can create a custom metric and submit a job to compare model router to other models. Then, in the Azure AI Foundry portal, you can compare their performance.
+
+We provide custom metric tests via notebooks.
+
+In Azure Monitor, you can monitor all the standard metrics.
+The cost analysis page (available only in the Azure portal) shows the costs of each underlying model. There are no extra charges for model router beyond the charges for the underlying models.
+
 ## Troubleshooting
```
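The REST API notes in the diff above can be sketched as a simple request builder. This is a hedged example: the deployment name and api-version are placeholder assumptions, the helper is hypothetical, and (per the draft note) it's the service, not the client, that drops temperature and top_p when a reasoning model is selected:

```python
import json

# Hypothetical helper: builds the URL path and JSON body for a Chat Completions
# call to a model router deployment. The deployment name and api-version below
# are placeholder assumptions, not values from the article.
def build_chat_request(deployment, user_prompt, temperature=0.7, top_p=0.95):
    path = (f"/openai/deployments/{deployment}/chat/completions"
            "?api-version=2024-10-21")
    body = {
        "messages": [{"role": "user", "content": user_prompt}],
        # Passed normally; the service drops these parameters if model
        # router selects a reasoning (o-series) model for the prompt.
        "temperature": temperature,
        "top_p": top_p,
    }
    return path, json.dumps(body)

path, body = build_chat_request("my-model-router", "Summarize this ticket.")
# POST https://<resource>.openai.azure.com{path} with an api-key header.
# The response's "model" field reports which underlying model answered.
```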
