articles/ai-studio/how-to/deploy-models-phi-3.md (3 additions & 3 deletions)
@@ -52,7 +52,7 @@ Certain models in the model catalog can be deployed as a serverless API with pay
 - An [Azure AI Studio hub](../how-to/create-azure-ai-resource.md).

 > [!IMPORTANT]
-> For Phi-3 family models, the pay-as-you-go model deployment offering is only available with hubs created in **East US 2** and **Sweden Central** regions.
+> For Phi-3 family models, the serverless API model deployment offering is only available with hubs created in **East US 2** and **Sweden Central** regions.

 - An [Azure AI Studio project](../how-to/create-projects.md).
 - Azure role-based access controls (Azure RBAC) are used to grant access to operations in Azure AI Studio. To perform the steps in this article, your user account must be assigned the __Azure AI Developer role__ on the resource group. For more information on permissions, see [Role-based access control in Azure AI Studio](../concepts/rbac-ai-studio.md).
@@ -96,7 +96,7 @@ To create a deployment:
 ### Consume Phi-3 models as a service

-Models deployed as a service can be consumed using the chat API, depending on the type of model you deployed.
+Models deployed as serverless APIs can be consumed using the chat API, depending on the type of model you deployed.

 1. From your **Project overview** page, go to the left sidebar and select **Components** > **Deployments**.
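The chat-API consumption this hunk describes can be sketched as follows. The `/v1/chat/completions` route, the payload fields, and the environment-variable names are illustrative assumptions, not taken from the article; substitute the **Target** URL and **Key** shown on your own deployment's page.

```python
import json
import os
import urllib.request


def build_chat_request(target_url: str, key: str, messages: list) -> urllib.request.Request:
    """Assemble a request for a serverless chat deployment.

    Assumes an OpenAI-style /v1/chat/completions route; verify the exact
    path against your deployment's details page before relying on it.
    """
    payload = {"messages": messages, "max_tokens": 128, "temperature": 0.7}
    return urllib.request.Request(
        url=target_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": f"Bearer {key}"},
        method="POST",
    )


if __name__ == "__main__":
    # Real values come from Components > Deployments in the studio.
    req = build_chat_request(
        os.environ.get("AZUREAI_ENDPOINT_URL", "https://example.invalid"),
        os.environ.get("AZUREAI_ENDPOINT_KEY", "placeholder"),
        [{"role": "user", "content": "Summarize serverless deployment in one line."}],
    )
    if "example.invalid" not in req.full_url:  # only contact a real endpoint
        with urllib.request.urlopen(req) as resp:
            print(json.load(resp)["choices"][0]["message"]["content"])
```

The same request shape applies to any serverless chat deployment; only the target URL and key change per deployment.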
@@ -108,7 +108,7 @@ Models deployed as a service can be consumed using the chat API, depending on th
 ## Cost and quotas

-### Cost and quota considerations for Phi-3 models deployed as a service
+### Cost and quota considerations for Phi-3 models deployed as serverless APIs

 You can find the pricing information on the **Pricing and terms** tab of the deployment wizard when deploying the model.
-In this article, you learn how to use Azure AI Studio to deploy the TimeGEN-1 model as a service with pay-asyougo billing.
+In this article, you learn how to use Azure AI Studio to deploy the TimeGEN-1 model as a serverless API with pay-as-you-go billing.

 You filter on the Nixtla collection to browse the TimeGEN-1 model in the [Model Catalog](model-catalog.md).

 The Nixtla TimeGEN-1 is a generative, pretrained forecasting and anomaly detection model for time series data. TimeGEN-1 can produce accurate forecasts for new time series without training, using only historical values and exogenous covariates as inputs.
@@ -27,15 +27,15 @@ The Nixtla TimeGEN-1 is a generative, pretrained forecasting and anomaly detecti
 Certain models in the model catalog can be deployed as a serverless API with pay-as-you-go billing. This kind of deployment provides a way to consume models as an API without hosting them on your subscription, while keeping the enterprise security and compliance that organizations need. This deployment option doesn't require quota from your subscription.

-You can deploy TimeGEN-1 as a service with pay-as-you-go. Nixtla offers TimeGEN-1 through the Microsoft Azure Marketplace. Nixtla can change or update the terms of use and pricing of this model.
+You can deploy TimeGEN-1 as a serverless API with pay-as-you-go billing. Nixtla offers TimeGEN-1 through the Microsoft Azure Marketplace. Nixtla can change or update the terms of use and pricing of this model.

 ### Prerequisites

 - An Azure subscription with a valid payment method. Free or trial Azure subscriptions don't work. If you don't have an Azure subscription, create a [paid Azure account](https://azure.microsoft.com/pricing/purchase-options/pay-as-you-go) to begin.
 - An [AI Studio hub](../how-to/create-azure-ai-resource.md).

 > [!IMPORTANT]
-> The pay-as-you-go model deployment offering for TimeGEN1 is only available with hubs created in the **East US 2** or **Sweden Central** regions.
+> The serverless API model deployment offering for TimeGEN-1 is only available with hubs created in the **East US 2** or **Sweden Central** regions.

 - An [Azure AI Studio project](../how-to/create-projects.md).
 - Azure role-based access controls (Azure RBAC) are used to grant access to operations in Azure AI Studio. To perform the steps in this article, your user account must be assigned the __Azure AI Developer role__ on the resource group. For more information on permissions, visit [Role-based access control in Azure AI Studio](../concepts/rbac-ai-studio.md).
@@ -60,10 +60,10 @@ These steps demonstrate the deployment of TimeGEN-1. To create a deployment:
 1. Once you subscribe the project for the particular Azure Marketplace offering, subsequent deployments of the _same_ offering in the _same_ project don't require subscribing again. If this scenario applies to you, there's a **Continue to deploy** option to select.
 1. Give the deployment a name. This name becomes part of the deployment API URL. This URL must be unique in each Azure region.
 1. Select **Deploy**. Wait until the deployment is ready and you're redirected to the Deployments page.
-1. Return to the Deployments page, select the deployment, and note the endpoint's **Target** URL and the Secret **Key**. For more information on using the APIs, see the [reference](#reference-for-timegen-1-deployed-as-a-service) section.
+1. Return to the Deployments page, select the deployment, and note the endpoint's **Target** URL and the Secret **Key**. For more information on using the APIs, see the [reference](#reference-for-timegen-1-deployed-as-a-serverless-api) section.
 1. You can always find the endpoint's details, URL, and access keys by navigating to your **Project overview** page. Then, from the left sidebar of your project, select **Components** > **Deployments**.

-To learn about billing for the TimeGEN-1 model deployed as a serverless API with pay-as-you-go token-based billing, see [Cost and quota considerations for the TimeGEN-1 family of models deployed as a service](#cost-and-quota-considerations-for-timegen-1-deployed-as-a-service).
+To learn about billing for the TimeGEN-1 model deployed as a serverless API with pay-as-you-go token-based billing, see [Cost and quota considerations for the TimeGEN-1 family of models deployed as a service](#cost-and-quota-considerations-for-timegen-1-deployed-as-a-serverless-api).

 ### Consume the TimeGEN-1 model as a service
@@ -82,12 +82,12 @@ You can consume TimeGEN-1 models by using the forecast API.
 |Quick Start Forecast|The Nixtla TimeGEN1 is a generative, pretrained forecasting model for time series data. TimeGEN1 can produce accurate forecasts for new time series without training, using only historical values as inputs.|[Quick Start Forecast](https://aka.ms/quick-start-forecasting)|
 |Fine-tuning|Fine-tuning is a powerful process to utilize TimeGEN1 more effectively. Foundation models - for example, TimeGEN1 - are pretrained on vast amounts of data, to capture wide-ranging features and patterns. These models can then be specialized for specific contexts or domains. Fine-tuning refines the model parameters to forecast a new task, allowing it to tailor its vast pre-existing knowledge towards the requirements of the new data. In this way, fine-tuning serves as a crucial bridge, linking the broad TimeGEN1 capabilities to the specifics of your tasks. Concretely, the fine-tuning process involves performing some training iterations on your input data, to minimize the forecasting error. The forecasts are produced with the updated model. To control the number of iterations, use the finetune_steps argument of the forecast method.|[Fine-tuning](https://aka.ms/finetuning-TimeGEN1)|
 |Anomaly Detection|Anomaly detection in time series data is important across various industries - for example, finance and healthcare. It involves monitoring ordered data points to spot irregularities that might signal issues or threats. Organizations can then swiftly act to prevent, improve, or safeguard their operations.|[Anomaly Detection](https://aka.ms/anomaly-detection)|
-|Exogenous Variables|Exogenous variables are external factors that can influence forecasts. These variables take on one of a limited, fixed number of possible values, and induce a grouping of your observations. For example, if you’re forecasting daily product demand for a retailer, you could benefit from an event variable that may tell you what kind of event takes place on a given day, for example ‘None’, Sporting’, or ‘Cultural’. Or you might also include external factors such as weather.|[Exogenous Variables](https://aka.ms/exogenous-variables)|
+|Exogenous Variables|Exogenous variables are external factors that can influence forecasts. These variables take on one of a limited, fixed number of possible values, and induce a grouping of your observations. For example, if you're forecasting daily product demand for a retailer, you could benefit from an event variable that may tell you what kind of event takes place on a given day, for example 'None', 'Sporting', or 'Cultural'. Or you might also include external factors such as weather.|[Exogenous Variables](https://aka.ms/exogenous-variables)|
 |Demand Forecasting|Demand forecasting involves application of historical data and other analytical information, to build models that help predict future estimates of customer demand, for specific products, over a specific time period. It helps shape product road map, inventory production, and inventory allocation, among other things.|[Demand Forecasting](https://aka.ms/demand-forecasting-with-TimeGEN1)|

-For more information about use of the APIs, visit the [reference](#reference-for-timegen-1-deployed-as-a-service) section.
+For more information about use of the APIs, visit the [reference](#reference-for-timegen-1-deployed-as-a-serverless-api) section.

-### Reference for TimeGEN-1 deployed as a service
+### Reference for TimeGEN-1 deployed as a serverless API

 #### Forecast API
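Since the table above points at the forecast API and its finetune_steps argument, a minimal sketch of assembling a request body may help. Every field name here (`y`, `fh`, `freq`, `finetune_steps`) is a hypothetical choice modeled on Nixtla's client conventions rather than taken from this article; the authoritative schema is the Forecast API reference this diff links to.

```python
import json


def build_forecast_payload(series: dict, horizon: int, freq: str = "D",
                           finetune_steps: int = 0) -> dict:
    """Sketch of a forecast-request body for a TimeGEN-1 style endpoint.

    The field names are assumptions -- confirm them against the
    Forecast API reference section before relying on this.
    """
    if horizon < 1:
        raise ValueError("forecast horizon must be at least 1")
    return {
        "y": series,                       # historical values keyed by timestamp
        "fh": horizon,                     # number of future points to forecast
        "freq": freq,                      # pandas-style frequency alias, e.g. "D"
        "finetune_steps": finetune_steps,  # 0 = zero-shot; >0 runs fine-tuning iterations
    }


# POST this JSON body to the deployment's **Target** URL with the Secret **Key**
# in an Authorization header (transport omitted here for brevity).
example = build_forecast_payload({"2024-01-01": 10.5, "2024-01-02": 11.2}, horizon=7)
print(json.dumps(example))
```

Setting `finetune_steps` above zero corresponds to the fine-tuning workflow in the table: a few training iterations on your input data before the forecast is produced.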
@@ -229,9 +229,9 @@ This JSON sample is an example response:
 ## Cost and quotas

-### Cost and quota considerations for TimeGEN-1 deployed as a service
+### Cost and quota considerations for TimeGEN-1 deployed as a serverless API

-Nixtla offers TimeGEN-1 deployed as a service through the Azure Marketplace. TimeGEN-1 is integrated with Azure AI Studio for use. You can find more information about Azure Marketplace pricing when you deploy the model.
+Nixtla offers TimeGEN-1 deployed as a serverless API through the Azure Marketplace. TimeGEN-1 is integrated with Azure AI Studio for use. You can find more information about Azure Marketplace pricing when you deploy the model.

 Each time a project subscribes to a given offer from the Azure Marketplace, a new resource is created to track the costs associated with its consumption. The same resource is used to track costs associated with inference; however, multiple meters are available to track each scenario independently.
articles/machine-learning/how-to-deploy-models-cohere-command.md (3 additions & 2 deletions)
@@ -8,6 +8,7 @@ ms.subservice: inferencing
 ms.topic: how-to
 ms.date: 04/02/2024
 ms.reviewer: mopeakande
+reviewer: msakande
 ms.author: shubhiraj
 author: shubhirajMsft
 ms.custom: [references_regions]
@@ -101,7 +102,7 @@ To create a deployment:
 1. Once you subscribe the workspace for the particular Azure Marketplace offering, subsequent deployments of the _same_ offering in the _same_ workspace don't require subscribing again. If this scenario applies to you, there's a **Continue to deploy** option to select.

-   :::image type="content" source="media/how-to-deploy-models-cohere-command/command-r-existing-subscription.png" alt-text="A screenshot showing a project that is already subscribed to the offering." lightbox="media/how-to-deploy-models-cohere-command/command-r-existing-subscription.png":::
+   :::image type="content" source="media/how-to-deploy-models-cohere-command/command-r-existing-subscription.png" alt-text="A screenshot showing a workspace that is already subscribed to the offering." lightbox="media/how-to-deploy-models-cohere-command/command-r-existing-subscription.png":::

 1. Give the deployment a name. This name becomes part of the deployment API URL. This URL must be unique in each Azure region.
@@ -816,7 +817,7 @@ Each time a workspace subscribes to a given model offering from Azure Marketplac
 For more information on how to track costs, see [Monitor costs for models offered through the Azure Marketplace](../ai-studio/how-to/costs-plan-manage.md#monitor-costs-for-models-offered-through-the-azure-marketplace).

-Quota is managed per deployment. Each deployment has a rate limit of 200,000 tokens per minute and 1,000 API requests per minute. However, we currently limit one deployment per model per project. Contact Microsoft Azure Support if the current rate limits aren't sufficient for your scenarios.
+Quota is managed per deployment. Each deployment has a rate limit of 200,000 tokens per minute and 1,000 API requests per minute. However, we currently limit one deployment per model per workspace. Contact Microsoft Azure Support if the current rate limits aren't sufficient for your scenarios.
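Given the 200,000 tokens-per-minute and 1,000 requests-per-minute limits described in this hunk, callers commonly wrap requests in jittered exponential backoff. A minimal sketch, assuming the transport surfaces HTTP 429 as an exception (`RateLimitError` is a hypothetical name for this sketch, not an Azure SDK type):

```python
import random
import time


class RateLimitError(Exception):
    """Raised by the caller's transport when the service returns HTTP 429."""


def with_backoff(call, max_retries: int = 5, base_delay: float = 0.5, sleep=time.sleep):
    """Retry `call` with jittered exponential backoff on rate limiting.

    Delays grow as base_delay * 2**attempt, scaled by a random jitter
    factor so concurrent clients don't retry in lockstep.
    """
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # give up after the final attempt
            sleep(base_delay * (2 ** attempt) * (1 + random.random()))
```

Usage: `with_backoff(lambda: send_request(payload))`, where `send_request` is whatever function issues the HTTP call and raises `RateLimitError` on a 429 response.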
articles/machine-learning/how-to-deploy-models-cohere-embed.md (3 additions & 2 deletions)
@@ -8,6 +8,7 @@ ms.subservice: inferencing
 ms.topic: how-to
 ms.date: 04/02/2024
 ms.reviewer: mopeakande
+reviewer: msakande
 ms.author: shubhiraj
 author: shubhirajMsft
 ms.custom: [references_regions]
@@ -81,7 +82,7 @@ To create a deployment:
 1. Once you subscribe the workspace for the particular Azure Marketplace offering, subsequent deployments of the _same_ offering in the _same_ workspace don't require subscribing again. If this scenario applies to you, there's a **Continue to deploy** option to select.

-   :::image type="content" source="media/how-to-deploy-models-cohere-embed/embed-english-existing-deployment.png" alt-text="A screenshot showing a project that is already subscribed to the offering." lightbox="media/how-to-deploy-models-cohere-embed/embed-english-existing-deployment.png":::
+   :::image type="content" source="media/how-to-deploy-models-cohere-embed/embed-english-existing-deployment.png" alt-text="A screenshot showing a workspace that is already subscribed to the offering." lightbox="media/how-to-deploy-models-cohere-embed/embed-english-existing-deployment.png":::

 1. Give the deployment a name. This name becomes part of the deployment API URL. This URL must be unique in each Azure region.
@@ -350,7 +351,7 @@ Each time a workspace subscribes to a given model offering from Azure Marketplac
 For more information on how to track costs, see [Monitor costs for models offered through the Azure Marketplace](../ai-studio/how-to/costs-plan-manage.md#monitor-costs-for-models-offered-through-the-azure-marketplace).

-Quota is managed per deployment. Each deployment has a rate limit of 200,000 tokens per minute and 1,000 API requests per minute. However, we currently limit one deployment per model per project. Contact Microsoft Azure Support if the current rate limits aren't sufficient for your scenarios.
+Quota is managed per deployment. Each deployment has a rate limit of 200,000 tokens per minute and 1,000 API requests per minute. However, we currently limit one deployment per model per workspace. Contact Microsoft Azure Support if the current rate limits aren't sufficient for your scenarios.
articles/machine-learning/how-to-deploy-models-mistral.md (6 additions & 6 deletions)
@@ -7,10 +7,10 @@ ms.service: machine-learning
 ms.subservice: inferencing
 ms.topic: how-to
 ms.date: 04/29/2024
-mms.author: kritifaujdar
-.author: fkriti
-ms.author: mopeakande
-author: msakande
+ms.author: kritifaujdar
+author: fkriti
+ms.reviewer: mopeakande
+reviewer: msakande
 ms.custom: [references_regions]

 #This functionality is also available in Azure AI Studio: /azure/ai-studio/how-to/deploy-models-mistral.md
@@ -95,7 +95,7 @@ To create a deployment:
 1. Once you subscribe the workspace for the particular Azure Marketplace offering, subsequent deployments of the _same_ offering in the _same_ workspace don't require subscribing again. If this scenario applies to you, you'll see a **Continue to deploy** option to select.

-   :::image type="content" source="media/how-to-deploy-models-mistral/mistral-deploy-pay-as-you-go-project.png" alt-text="A screenshot showing a project that is already subscribed to the offering." lightbox="media/how-to-deploy-models-mistral/mistral-deploy-pay-as-you-go-project.png":::
+   :::image type="content" source="media/how-to-deploy-models-mistral/mistral-deploy-pay-as-you-go-project.png" alt-text="A screenshot showing a workspace that is already subscribed to the offering." lightbox="media/how-to-deploy-models-mistral/mistral-deploy-pay-as-you-go-project.png":::

 1. Give the deployment a name. This name becomes part of the deployment API URL. This URL must be unique in each Azure region.
@@ -271,7 +271,7 @@ Each time a workspace subscribes to a given model offering from Azure Marketplac
 For more information on how to track costs, see [Monitor costs for models offered through the Azure Marketplace](../ai-studio/how-to/costs-plan-manage.md#monitor-costs-for-models-offered-through-the-azure-marketplace).

-Quota is managed per deployment. Each deployment has a rate limit of 200,000 tokens per minute and 1,000 API requests per minute. However, we currently limit one deployment per model per project. Contact Microsoft Azure Support if the current rate limits aren't sufficient for your scenarios.
+Quota is managed per deployment. Each deployment has a rate limit of 200,000 tokens per minute and 1,000 API requests per minute. However, we currently limit one deployment per model per workspace. Contact Microsoft Azure Support if the current rate limits aren't sufficient for your scenarios.