**articles/ai-services/openai/concepts/model-router.md** (10 additions, 8 deletions)
```diff
@@ -1,5 +1,5 @@
 ---
-title: Azure OpenAI model router concepts
+title: Azure OpenAI model router (preview) concepts
 titleSuffix: Azure OpenAI
 description: Learn about the model router feature in Azure OpenAI Service.
 author: PatrickFarley
@@ -11,16 +11,13 @@ ms.custom:
 manager: nitinme
 ---
 
-# Azure OpenAI model router
+# Azure OpenAI model router (preview)
 
-Azure OpenAI model router is a deployable AI chat model that automatically selects the best underlying chat model to respond to a given prompt. It uses a combination of preexisting models to provide high performance while saving on compute costs where possible.
+Azure OpenAI model router is a deployable AI chat model that is trained to select the best large language model (LLM) to respond to a given prompt in real time. By evaluating factors like query complexity, cost, and performance, it intelligently routes requests to the most suitable model.
 
 ## Why use model router?
 
-Model router intelligently selects the best underlying model for a given prompt. This way, smaller and cheaper models are used when they're sufficient for the task, but larger and more expensive models are available for more complex tasks. Also, reasoning models are available for tasks that require complex reasoning, but non-reasoning models are used otherwise. Model router provides a single chat experience that combines the best features from all the underlying chat models.
-
-Model router is useful for a variety of applications. TBD
-
+Model router intelligently selects the best underlying model for a given prompt to optimize costs while maintaining quality. Smaller and cheaper models are used when they're sufficient for the task, but larger and more expensive models are available for more complex tasks. Also, reasoning models are available for tasks that require complex reasoning, and non-reasoning models are used otherwise. Model router provides a single chat experience that combines the best features from all of the underlying chat models.
 
 ## Versioning
 
```
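The routing behavior this hunk describes (cheap models for simple prompts, larger or reasoning models for complex ones) can be illustrated with a toy heuristic. This is only a sketch: the real model router is a trained LLM, not a rule list, and the model names below are hypothetical placeholders.

```python
# Toy illustration only: model router is a trained LLM, not a keyword filter.
# All model names here are hypothetical placeholders.
def toy_route(prompt: str) -> str:
    """Pick a (hypothetical) underlying model for a prompt."""
    # Prompts that look like they need multi-step reasoning go to a
    # reasoning model.
    needs_reasoning = any(
        k in prompt.lower() for k in ("prove", "step by step", "derive")
    )
    if needs_reasoning:
        return "o-series-reasoning-model"
    # Longer, more complex prompts go to a larger chat model.
    if len(prompt.split()) > 100:
        return "large-chat-model"
    # Everything else is handled by a smaller, cheaper model.
    return "small-cheap-model"

print(toy_route("What's the capital of France?"))      # → small-cheap-model
print(toy_route("Prove that sqrt(2) is irrational."))  # → o-series-reasoning-model
```

The point of the sketch is only the cost/quality trade-off: the cheapest model that can handle the prompt is preferred.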
```diff
@@ -47,4 +44,9 @@ Global Standard region support.
 
 When you use Azure OpenAI model router, you are only billed for the use of the underlying models as they're recruited to respond to prompts. The model router itself doesn't incur any extra charges.
 
-You can monitor the overall costs of your model router deployment in the Azure portal. TBD
+You can monitor the overall costs of your model router deployment in the Azure portal. TBD
+
+## Next step
+
+> [!div class="nextstepaction"]
+> [How to use model router](../how-to/model-router.md)
```
**articles/ai-services/openai/how-to/model-router.md** (31 additions, 11 deletions)
```diff
@@ -1,5 +1,5 @@
 ---
-title: How to use model router in Azure OpenAI Service
+title: How to use model router (preview) in Azure OpenAI Service
 titleSuffix: Azure OpenAI Service
 description: Learn how to use the model router in Azure OpenAI Service to select the best model for your task.
 author: PatrickFarley
@@ -11,9 +11,9 @@ ms.date: 04/17/2025
 manager: nitinme
 ---
 
-# Use model router
+# Use Azure OpenAI model router (preview)
 
-Azure OpenAI model router is a deployable AI chat model that automatically selects the best underlying chat model to respond to a given prompt. It uses a combination of preexisting models to provide high performance while saving on compute costs where possible. For more information on how model router works and its advantages and limitations, see the [Model router concepts guide](../concepts/model-router.md).
+Azure OpenAI model router is a deployable AI chat model that is trained to select the best large language model (LLM) to respond to a given prompt in real time. It uses a combination of preexisting models to provide high performance while saving on compute costs where possible. For more information on how model router works and its advantages and limitations, see the [Model router concepts guide](../concepts/model-router.md).
 
 You can access model router through the Completions API just as you would use a single base model like GPT-4.
 
```
```diff
@@ -27,19 +27,19 @@ Model router is packaged as a single OpenAI model that you deploy. Follow the st
 > - You select a content filter when you deploy the model router model (or you can apply a filter later). The content filter is applied to all activity to and from the model router: you don't set content filters for each of the underlying chat models.
 > - Your tokens-per-minute rate limit setting is applied to all activity to and from the model router: you don't set rate limits for each of the underlying chat models.
 
-## Use model router in chat
+## Use model router in chats
 
-### REST API
+You can use model router through the [chat completions API](/azure/ai-services/openai/chatgpt-quickstart) in the same way you'd use other OpenAI chat models. Set the `model` parameter to the name of your model router deployment, and set the `messages` parameter to the messages you want to send to the model.
+
+In the [Azure AI Foundry portal](https://ai.azure.com/), you can navigate to your model router deployment on the **Models + endpoints** page and select it to enter the model playground. In the playground experience, you can enter messages and see the model's responses. Each response message will show which underlying model was selected to respond.
 
-> [!IMPORTANT]
-> Set temperature and top_p to the values you prefer, but note that reasoning models (o-series) don't support these parameters. If model router selects a reasoning model for your prompt, it will drop the temperature and top_p input parameters. TBD
 
 > [!IMPORTANT]
-> The reasoning parameter is not supported in model router. If model router selects a reasoning model for your prompt, it will also select a value for the reasoning parameter based on TBD
+> You can set the `Temperature` and `Top_P` parameters to the values you prefer (see the [concepts guide](/azure/ai-services/openai/concepts/prompt-engineering?tabs=chat#temperature-and-top_p-parameters)), but note that reasoning models (o-series) don't support these parameters. If model router selects a reasoning model for your prompt, it will ignore the `Temperature` and `Top_P` input parameters.
 
-### Portal
+> [!IMPORTANT]
+> The `reasoning_effort` parameter (see the [Reasoning models guide](/azure/ai-services/openai/how-to/reasoning?tabs=python-secure#reasoning-effort)) is not supported in model router. If the model router selects a reasoning model for your prompt, it will also select a `reasoning_effort` input value based on the complexity of the prompt.
 
-Playground . in the conversation it displays the underlying model version of each response TBD
 
 ## Evaluate model router performance
 
```
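The chat completions call described in the hunk above can be sketched with the plain Azure OpenAI REST endpoint. This is a hedged, stdlib-only sketch: the endpoint, API key, API version, and deployment name (`model-router`) are placeholder assumptions you'd replace with your own values, and the request is only sent when credentials are actually configured.

```python
import json
import os
import urllib.request

# Placeholder assumptions: replace with your resource endpoint, key, and
# the deployment name you gave your model router deployment.
endpoint = os.environ.get("AZURE_OPENAI_ENDPOINT", "https://YOUR-RESOURCE.openai.azure.com")
deployment = "model-router"  # assumed deployment name
api_version = "2024-10-21"   # any chat-completions-capable API version

# Azure OpenAI routes chat completions by deployment name in the URL path.
url = f"{endpoint}/openai/deployments/{deployment}/chat/completions?api-version={api_version}"

payload = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the plot of Hamlet in two sentences."},
    ]
}

# Only send the request when a real key is configured.
api_key = os.environ.get("AZURE_OPENAI_API_KEY")
if api_key:
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # The response's "model" field reports which underlying model answered.
    print(body["model"])
    print(body["choices"][0]["message"]["content"])
```

Note that, per the hunk above, you address the router deployment itself; you never pick the underlying model in the request.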
```diff
@@ -50,4 +50,24 @@ We provide custom metric test via notebooks.
 in azmon, you can monitor all the standard metrics
 cost analysis page (only in the azure portal). it will show the costs of each underlying model. (no extra charges for model router, above the underlying charges)
 
-## Troubleshooting
+See the [Evaluations guide](/azure/ai-services/openai/how-to/evaluations?tabs=question-eval-input).
+
+## Monitor model router metrics
+
+### Monitor performance
+
+You can monitor the performance of your model router deployment in Azure monitor (AzMon) in the Azure portal.
+
+To view AzMon metrics for router:
+1. Filter by deployment name of model router.
+1. Optionally, split up the metrics by underlying models.
+
+The following metrics are available:
+
+### Monitor costs
+
+You can monitor the costs of model router, which is the sum of the costs incurred by the underlying models.
+1. Visit the **Cost analysis** page in the Azure portal.
+1. If needed, filter by Azure resource.
+1. Then, filter by deployment name: Filter by billing "Tag", select **Deployment** as the name of the tag, and then select your model router deployment name as the value.
```
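The per-underlying-model split that the monitoring hunk describes can also be approximated client-side, since each response reports which model answered. A toy, stdlib-only illustration (not an Azure API; the model names and token counts are made up):

```python
from collections import Counter

# Toy data standing in for a batch of model router responses: each reply
# carries the underlying "model" that answered and its token usage.
responses = [
    {"model": "gpt-4o-mini", "usage": {"total_tokens": 120}},
    {"model": "gpt-4o", "usage": {"total_tokens": 950}},
    {"model": "gpt-4o-mini", "usage": {"total_tokens": 80}},
]

# Count how many calls were routed to each underlying model...
calls = Counter(r["model"] for r in responses)

# ...and total the token usage per underlying model.
tokens = Counter()
for r in responses:
    tokens[r["model"]] += r["usage"]["total_tokens"]

print(calls)   # calls routed to each underlying model
print(tokens)  # token usage per underlying model
```

A tally like this mirrors what the Azure portal's metric split and cost analysis show: the router's total cost is just the sum over the underlying models.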
0 commit comments