Commit 89a683b

Merge pull request #275626 from msakande/phi-3-and-timegen-1-AzureML
Phi 3 and timegen 1 azure ml
2 parents d8c467d + d236f4e commit 89a683b

8 files changed, +374 -23 lines changed

articles/ai-studio/how-to/deploy-models-phi-3.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -52,7 +52,7 @@ Certain models in the model catalog can be deployed as a serverless API with pay
5252
- An [Azure AI Studio hub](../how-to/create-azure-ai-resource.md).
5353

5454
> [!IMPORTANT]
55-
> For Phi-3 family models, the pay-as-you-go model deployment offering is only available with hubs created in **East US 2** and **Sweden Central** regions.
55+
> For Phi-3 family models, the serverless API model deployment offering is only available with hubs created in **East US 2** and **Sweden Central** regions.
5656
5757
- An [Azure AI Studio project](../how-to/create-projects.md).
5858
- Azure role-based access controls (Azure RBAC) are used to grant access to operations in Azure AI Studio. To perform the steps in this article, your user account must be assigned the __Azure AI Developer role__ on the resource group. For more information on permissions, see [Role-based access control in Azure AI Studio](../concepts/rbac-ai-studio.md).
@@ -96,7 +96,7 @@ To create a deployment:
9696

9797
### Consume Phi-3 models as a service
9898

99-
Models deployed as a service can be consumed using the chat API, depending on the type of model you deployed.
99+
Models deployed as serverless APIs can be consumed using the chat API, depending on the type of model you deployed.
100100

101101
1. From your **Project overview** page, go to the left sidebar and select **Components** > **Deployments**.
102102

@@ -108,7 +108,7 @@ Models deployed as a service can be consumed using the chat API, depending on th
108108

109109
## Cost and quotas
110110

111-
### Cost and quota considerations for Phi-3 models deployed as a service
111+
### Cost and quota considerations for Phi-3 models deployed as serverless APIs
112112

113113
You can find the pricing information on the **Pricing and terms** tab of the deployment wizard when deploying the model.
114114

articles/ai-studio/how-to/deploy-models-timegen-1.md

Lines changed: 10 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -18,7 +18,7 @@ ms.custom: [references_regions]
1818

1919
[!INCLUDE [Feature preview](../includes/feature-preview.md)]
2020

21-
In this article, you learn how to use Azure AI Studio to deploy the TimeGEN-1 model as a service with pay-as you go billing.
21+
In this article, you learn how to use Azure AI Studio to deploy the TimeGEN-1 model as a serverless API with pay-as-you-go billing.
2222
You filter on the Nixtla collection to browse the TimeGEN-1 model in the [Model Catalog](model-catalog.md).
2323

2424
The Nixtla TimeGEN-1 is a generative, pretrained forecasting and anomaly detection model for time series data. TimeGEN-1 can produce accurate forecasts for new time series without training, using only historical values and exogenous covariates as inputs.
@@ -27,15 +27,15 @@ The Nixtla TimeGEN-1 is a generative, pretrained forecasting and anomaly detecti
2727

2828
Certain models in the model catalog can be deployed as a serverless API with pay-as-you-go billing. This kind of deployment provides a way to consume models as an API without hosting them on your subscription, while keeping the enterprise security and compliance that organizations need. This deployment option doesn't require quota from your subscription.
2929

30-
You can deploy TimeGEN-1 as a service with pay-as-you-go. Nixtla offers TimeGEN-1 through the Microsoft Azure Marketplace. Nixtla can change or update the terms of use and pricing of this model.
30+
You can deploy TimeGEN-1 as a serverless API with pay-as-you-go billing. Nixtla offers TimeGEN-1 through the Microsoft Azure Marketplace. Nixtla can change or update the terms of use and pricing of this model.
3131

3232
### Prerequisites
3333

3434
- An Azure subscription with a valid payment method. Free or trial Azure subscriptions don't work. If you don't have an Azure subscription, create a [paid Azure account](https://azure.microsoft.com/pricing/purchase-options/pay-as-you-go) to begin.
3535
- An [AI Studio hub](../how-to/create-azure-ai-resource.md).
3636

3737
> [!IMPORTANT]
38-
> The pay-as-you-go model deployment offering for TimeGEN1 is only available with hubs created in the **East US 2** or **Sweden Central** regions.
38+
> The serverless API model deployment offering for TimeGEN-1 is only available with hubs created in the **East US 2** or **Sweden Central** regions.
3939
4040
- An [Azure AI Studio project](../how-to/create-projects.md).
4141
- Azure role-based access controls (Azure RBAC) are used to grant access to operations in Azure AI Studio. To perform the steps in this article, your user account must be assigned the __Azure AI Developer role__ on the resource group. For more information on permissions, visit [Role-based access control in Azure AI Studio](../concepts/rbac-ai-studio.md).
@@ -60,10 +60,10 @@ These steps demonstrate the deployment of TimeGEN-1. To create a deployment:
6060
1. Once you subscribe the project for the particular Azure Marketplace offering, subsequent deployments of the _same_ offering in the _same_ project don't require subscribing again. If this scenario applies to you, there's a **Continue to deploy** option to select.
6161
1. Give the deployment a name. This name becomes part of the deployment API URL. This URL must be unique in each Azure region.
6262
1. Select **Deploy**. Wait until the deployment is ready and you're redirected to the Deployments page.
63-
1. Return to the Deployments page, select the deployment, and note the endpoint's **Target** URL and the Secret **Key**. For more information on using the APIs, see the [reference](#reference-for-timegen-1-deployed-as-a-service) section.
63+
1. Return to the Deployments page, select the deployment, and note the endpoint's **Target** URL and the Secret **Key**. For more information on using the APIs, see the [reference](#reference-for-timegen-1-deployed-as-a-serverless-api) section.
6464
1. You can always find the endpoint's details, URL, and access keys by navigating to your **Project overview** page. Then, from the left sidebar of your project, select **Components** > **Deployments**.
6565

66-
To learn about billing for the TimeGEN-1 model deployed as a serverless API with pay-as-you-go token-based billing, see [Cost and quota considerations for the TimeGEN-1 family of models deployed as a service](#cost-and-quota-considerations-for-timegen-1-deployed-as-a-service).
66+
To learn about billing for the TimeGEN-1 model deployed as a serverless API with pay-as-you-go token-based billing, see [Cost and quota considerations for TimeGEN-1 deployed as a serverless API](#cost-and-quota-considerations-for-timegen-1-deployed-as-a-serverless-api).
6767

6868
### Consume the TimeGEN-1 model as a service
6969

@@ -82,12 +82,12 @@ You can consume TimeGEN-1 models by using the forecast API.
8282
|Quick Start Forecast|The Nixtla TimeGEN1 is a generative, pretrained forecasting model for time series data. TimeGEN1 can produce accurate forecasts for new time series without training, using only historical values as inputs.|[Quick Start Forecast](https://aka.ms/quick-start-forecasting)|
8383
|Fine-tuning|Fine-tuning is a powerful process to utilize Time-GEN1 more effectively. Foundation models - for example, TimeGEN1 - are pretrained on vast amounts of data, to capture wide-ranging features and patterns. These models can then be specialized for specific contexts or domains. Fine-tuning refines the model parameters to forecast a new task, allowing it to tailor its vast pre-existing knowledge towards the requirements of the new data. In this way, fine-tuning serves as a crucial bridge, linking the broad TimeGEN1 capabilities to the specifics of your tasks. Concretely, the fine-tuning process involves performing some training iterations on your input data, to minimize the forecasting error. The forecasts are produced with the updated model. To control the number of iterations, use the finetune_steps argument of the forecast method.|[Fine-tuning](https://aka.ms/finetuning-TimeGEN1)|
8484
|Anomaly Detection|Anomaly detection in time series data is important across various industries - for example, finance and healthcare. It involves monitoring ordered data points to spot irregularities that might signal issues or threats. Organizations can then swiftly act to prevent, improve, or safeguard their operations.|[Anomaly Detection](https://aka.ms/anomaly-detection)|
85-
|Exogenous Variables|Exogenous variables are external factors that can influence forecasts. These variables take on one of a limited, fixed number of possible values, and induce a grouping of your observations. For example, if youre forecasting daily product demand for a retailer, you could benefit from an event variable that may tell you what kind of event takes place on a given day, for example None, Sporting, or Cultural. Or you might also include external factors such as weather.|[Exogenous Variables](https://aka.ms/exogenous-variables)|
85+
|Exogenous Variables|Exogenous variables are external factors that can influence forecasts. These variables take on one of a limited, fixed number of possible values, and induce a grouping of your observations. For example, if you're forecasting daily product demand for a retailer, you could benefit from an event variable that may tell you what kind of event takes place on a given day, for example 'None', 'Sporting', or 'Cultural'. Or you might also include external factors such as weather.|[Exogenous Variables](https://aka.ms/exogenous-variables)|
8686
|Demand Forecasting|Demand forecasting involves application of historical data and other analytical information, to build models that help predict future estimates of customer demand, for specific products, over a specific time period. It helps shape product road map, inventory production, and inventory allocation, among other things.|[Demand Forecasting](https://aka.ms/demand-forecasting-with-TimeGEN1)|
8787

88-
For more information about use of the APIs, visit the [reference](#reference-for-timegen-1-deployed-as-a-service) section.
88+
For more information about use of the APIs, visit the [reference](#reference-for-timegen-1-deployed-as-a-serverless-api) section.
8989

90-
### Reference for TimeGEN-1 deployed as a service
90+
### Reference for TimeGEN-1 deployed as a serverless API
9191

9292
#### Forecast API
9393

@@ -229,9 +229,9 @@ This JSON sample is an example response:
229229

230230
## Cost and quotas
231231

232-
### Cost and quota considerations for TimeGEN-1 deployed as a service
232+
### Cost and quota considerations for TimeGEN-1 deployed as a serverless API
233233

234-
Nixtla offers TimeGEN-1 deployed as a service through the Azure Marketplace. TimeGEN-1 is integrated with Azure AI Studio for use. You can find more information about Azure Marketplace pricing when you deploy the model.
234+
Nixtla offers TimeGEN-1 deployed as a serverless API through the Azure Marketplace. TimeGEN-1 is integrated with Azure AI Studio for use. You can find more information about Azure Marketplace pricing when you deploy the model.
235235

236236
Each time a project subscribes to a given offer from the Azure Marketplace, a new resource is created to track the costs associated with its consumption. The same resource is used to track costs associated with inference; however, multiple meters are available to track each scenario independently.
237237
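
The in-page link changes in this file follow from the heading renames: when "deployed as a service" becomes "deployed as a serverless API", every `#...` anchor that targets the heading has to change with it. The sketch below shows one way to check that correspondence. It assumes anchors are derived by lowercasing the heading, dropping punctuation other than hyphens, and joining words with hyphens; `heading_to_anchor` is a hypothetical helper written for this illustration, not part of any publishing tool, and the real anchor rules may differ in edge cases.

```python
import re


def heading_to_anchor(heading: str) -> str:
    """Approximate the in-page anchor for a Markdown heading (assumed rules)."""
    # Drop leading '#' markers and surrounding whitespace, then lowercase.
    text = heading.strip().lstrip("#").strip().lower()
    # Keep only letters, digits, whitespace, and hyphens.
    text = re.sub(r"[^a-z0-9\s-]", "", text)
    # Collapse each run of whitespace into a single hyphen.
    return "#" + re.sub(r"\s+", "-", text)


if __name__ == "__main__":
    for heading in (
        "### Reference for TimeGEN-1 deployed as a serverless API",
        "### Cost and quota considerations for TimeGEN-1 deployed as a serverless API",
    ):
        print(heading_to_anchor(heading))
```

Running a check like this over the renamed headings and the links that target them is a cheap way to catch a link that still points at the old `...-as-a-service` anchor.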

articles/machine-learning/how-to-deploy-models-cohere-command.md

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,7 @@ ms.subservice: inferencing
88
ms.topic: how-to
99
ms.date: 04/02/2024
1010
ms.reviewer: mopeakande
11+
reviewer: msakande
1112
ms.author: shubhiraj
1213
author: shubhirajMsft
1314
ms.custom: [references_regions]
@@ -101,7 +102,7 @@ To create a deployment:
101102

102103
1. Once you subscribe the workspace for the particular Azure Marketplace offering, subsequent deployments of the _same_ offering in the _same_ workspace don't require subscribing again. If this scenario applies to you, there's a **Continue to deploy** option to select.
103104

104-
:::image type="content" source="media/how-to-deploy-models-cohere-command/command-r-existing-subscription.png" alt-text="A screenshot showing a project that is already subscribed to the offering." lightbox="media/how-to-deploy-models-cohere-command/command-r-existing-subscription.png":::
105+
:::image type="content" source="media/how-to-deploy-models-cohere-command/command-r-existing-subscription.png" alt-text="A screenshot showing a workspace that is already subscribed to the offering." lightbox="media/how-to-deploy-models-cohere-command/command-r-existing-subscription.png":::
105106

106107
1. Give the deployment a name. This name becomes part of the deployment API URL. This URL must be unique in each Azure region.
107108

@@ -816,7 +817,7 @@ Each time a workspace subscribes to a given model offering from Azure Marketplac
816817

817818
For more information on how to track costs, see [Monitor costs for models offered through the Azure Marketplace](../ai-studio/how-to/costs-plan-manage.md#monitor-costs-for-models-offered-through-the-azure-marketplace).
818819

819-
Quota is managed per deployment. Each deployment has a rate limit of 200,000 tokens per minute and 1,000 API requests per minute. However, we currently limit one deployment per model per project. Contact Microsoft Azure Support if the current rate limits aren't sufficient for your scenarios.
820+
Quota is managed per deployment. Each deployment has a rate limit of 200,000 tokens per minute and 1,000 API requests per minute. However, we currently limit one deployment per model per workspace. Contact Microsoft Azure Support if the current rate limits aren't sufficient for your scenarios.
820821

821822
## Content filtering
822823
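
The quota paragraph touched above states a per-deployment limit of 200,000 tokens per minute and 1,000 API requests per minute. A client can stay under both limits by tracking its own usage before sending a request. The following is a minimal sketch of a sliding-window limiter; the class name `DeploymentRateLimiter`, its parameters, and the 60-second window bookkeeping are assumptions made for illustration and say nothing about how the service itself enforces the limits.

```python
import time
from collections import deque


class DeploymentRateLimiter:
    """Client-side sliding-window limiter for one deployment (illustrative)."""

    def __init__(self, max_tokens_per_min=200_000, max_requests_per_min=1_000,
                 clock=time.monotonic):
        self.max_tokens = max_tokens_per_min
        self.max_requests = max_requests_per_min
        self.clock = clock            # injectable so the limiter can be tested
        self.events = deque()         # (timestamp, tokens) for each admitted request
        self.tokens_in_window = 0

    def _evict(self, now):
        # Forget requests that are at least 60 seconds old.
        while self.events and now - self.events[0][0] >= 60.0:
            _, tokens = self.events.popleft()
            self.tokens_in_window -= tokens

    def try_acquire(self, tokens):
        """Return True and record the request if both limits allow it."""
        now = self.clock()
        self._evict(now)
        if len(self.events) + 1 > self.max_requests:
            return False
        if self.tokens_in_window + tokens > self.max_tokens:
            return False
        self.events.append((now, tokens))
        self.tokens_in_window += tokens
        return True


if __name__ == "__main__":
    limiter = DeploymentRateLimiter()
    # The token count per request is an estimate supplied by the caller.
    print(limiter.try_acquire(1_500))
```

When `try_acquire` returns `False`, the caller can wait and retry rather than sending a request that is likely to be rejected. Because the article notes one deployment per model per workspace, a single limiter instance per model is enough in this sketch.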

articles/machine-learning/how-to-deploy-models-cohere-embed.md

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,7 @@ ms.subservice: inferencing
88
ms.topic: how-to
99
ms.date: 04/02/2024
1010
ms.reviewer: mopeakande
11+
reviewer: msakande
1112
ms.author: shubhiraj
1213
author: shubhirajMsft
1314
ms.custom: [references_regions]
@@ -81,7 +82,7 @@ To create a deployment:
8182

8283
1. Once you subscribe the workspace for the particular Azure Marketplace offering, subsequent deployments of the _same_ offering in the _same_ workspace don't require subscribing again. If this scenario applies to you, there's a **Continue to deploy** option to select.
8384

84-
:::image type="content" source="media/how-to-deploy-models-cohere-embed/embed-english-existing-deployment.png" alt-text="A screenshot showing a project that is already subscribed to the offering." lightbox="media/how-to-deploy-models-cohere-embed/embed-english-existing-deployment.png":::
85+
:::image type="content" source="media/how-to-deploy-models-cohere-embed/embed-english-existing-deployment.png" alt-text="A screenshot showing a workspace that is already subscribed to the offering." lightbox="media/how-to-deploy-models-cohere-embed/embed-english-existing-deployment.png":::
8586

8687
1. Give the deployment a name. This name becomes part of the deployment API URL. This URL must be unique in each Azure region.
8788

@@ -350,7 +351,7 @@ Each time a workspace subscribes to a given model offering from Azure Marketplac
350351

351352
For more information on how to track costs, see [Monitor costs for models offered through the Azure Marketplace](../ai-studio/how-to/costs-plan-manage.md#monitor-costs-for-models-offered-through-the-azure-marketplace).
352353

353-
Quota is managed per deployment. Each deployment has a rate limit of 200,000 tokens per minute and 1,000 API requests per minute. However, we currently limit one deployment per model per project. Contact Microsoft Azure Support if the current rate limits aren't sufficient for your scenarios.
354+
Quota is managed per deployment. Each deployment has a rate limit of 200,000 tokens per minute and 1,000 API requests per minute. However, we currently limit one deployment per model per workspace. Contact Microsoft Azure Support if the current rate limits aren't sufficient for your scenarios.
354355

355356
## Content filtering
356357

articles/machine-learning/how-to-deploy-models-mistral.md

Lines changed: 6 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -7,10 +7,10 @@ ms.service: machine-learning
77
ms.subservice: inferencing
88
ms.topic: how-to
99
ms.date: 04/29/2024
10-
mms.author: kritifaujdar
11-
.author: fkriti
12-
ms.author: mopeakande
13-
author: msakande
10+
ms.author: kritifaujdar
11+
author: fkriti
12+
ms.reviewer: mopeakande
13+
reviewer: msakande
1414
ms.custom: [references_regions]
1515

1616
#This functionality is also available in Azure AI Studio: /azure/ai-studio/how-to/deploy-models-mistral.md
@@ -95,7 +95,7 @@ To create a deployment:
9595

9696
1. Once you subscribe the workspace for the particular Azure Marketplace offering, subsequent deployments of the _same_ offering in the _same_ workspace don't require subscribing again. If this scenario applies to you, you'll see a **Continue to deploy** option to select.
9797

98-
:::image type="content" source="media/how-to-deploy-models-mistral/mistral-deploy-pay-as-you-go-project.png" alt-text="A screenshot showing a project that is already subscribed to the offering." lightbox="media/how-to-deploy-models-mistral/mistral-deploy-pay-as-you-go-project.png":::
98+
:::image type="content" source="media/how-to-deploy-models-mistral/mistral-deploy-pay-as-you-go-project.png" alt-text="A screenshot showing a workspace that is already subscribed to the offering." lightbox="media/how-to-deploy-models-mistral/mistral-deploy-pay-as-you-go-project.png":::
9999

100100
1. Give the deployment a name. This name becomes part of the deployment API URL. This URL must be unique in each Azure region.
101101

@@ -271,7 +271,7 @@ Each time a workspace subscribes to a given model offering from Azure Marketplac
271271

272272
For more information on how to track costs, see [Monitor costs for models offered through the Azure Marketplace](../ai-studio/how-to/costs-plan-manage.md#monitor-costs-for-models-offered-through-the-azure-marketplace).
273273

274-
Quota is managed per deployment. Each deployment has a rate limit of 200,000 tokens per minute and 1,000 API requests per minute. However, we currently limit one deployment per model per project. Contact Microsoft Azure Support if the current rate limits aren't sufficient for your scenarios.
274+
Quota is managed per deployment. Each deployment has a rate limit of 200,000 tokens per minute and 1,000 API requests per minute. However, we currently limit one deployment per model per workspace. Contact Microsoft Azure Support if the current rate limits aren't sufficient for your scenarios.
275275

276276
## Content filtering
277277
