articles/ai-studio/how-to/deploy-models-phi-3.md (3 additions & 3 deletions)
@@ -52,7 +52,7 @@ Certain models in the model catalog can be deployed as a serverless API with pay
 - An [Azure AI Studio hub](../how-to/create-azure-ai-resource.md).

 > [!IMPORTANT]
-> For Phi-3 family models, the pay-as-you-go model deployment offering is only available with hubs created in **East US 2** and **Sweden Central** regions.
+> For Phi-3 family models, the serverless API model deployment offering is only available with hubs created in **East US 2** and **Sweden Central** regions.

 - An [Azure AI Studio project](../how-to/create-projects.md).
 - Azure role-based access controls (Azure RBAC) are used to grant access to operations in Azure AI Studio. To perform the steps in this article, your user account must be assigned the __Azure AI Developer role__ on the resource group. For more information on permissions, see [Role-based access control in Azure AI Studio](../concepts/rbac-ai-studio.md).
@@ -96,7 +96,7 @@ To create a deployment:
 ### Consume Phi-3 models as a service

-Models deployed as a service can be consumed using the chat API, depending on the type of model you deployed.
+Models deployed as serverless APIs can be consumed using the chat API, depending on the type of model you deployed.

 1. From your **Project overview** page, go to the left sidebar and select **Components** > **Deployments**.
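The chat-API consumption this hunk describes can be sketched as follows. The `/v1/chat/completions` route, the payload fields, and the environment-variable names are illustrative assumptions, not taken from the article; substitute the **Target** URL and **Key** shown on your own deployment's page.

```python
import json
import os
import urllib.request


def build_chat_request(target_url: str, key: str, messages: list) -> urllib.request.Request:
    """Assemble a request for a serverless chat deployment.

    Assumes an OpenAI-style /v1/chat/completions route; verify the exact
    path against your deployment's details page before relying on it.
    """
    payload = {"messages": messages, "max_tokens": 128, "temperature": 0.7}
    return urllib.request.Request(
        url=target_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": f"Bearer {key}"},
        method="POST",
    )


if __name__ == "__main__":
    # Real values come from Components > Deployments in the studio.
    req = build_chat_request(
        os.environ.get("AZUREAI_ENDPOINT_URL", "https://example.invalid"),
        os.environ.get("AZUREAI_ENDPOINT_KEY", "placeholder"),
        [{"role": "user", "content": "Summarize serverless deployment in one line."}],
    )
    if "example.invalid" not in req.full_url:  # only contact a real endpoint
        with urllib.request.urlopen(req) as resp:
            print(json.load(resp)["choices"][0]["message"]["content"])
```

The same request shape applies to any serverless chat deployment; only the target URL and key change per deployment.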
@@ -108,7 +108,7 @@ Models deployed as a service can be consumed using the chat API, depending on th
 ## Cost and quotas

-### Cost and quota considerations for Phi-3 models deployed as a service
+### Cost and quota considerations for Phi-3 models deployed as serverless APIs

 You can find the pricing information on the **Pricing and terms** tab of the deployment wizard when deploying the model.
-In this article, you learn how to use Azure AI Studio to deploy the TimeGEN-1 model as a service with pay-asyougo billing.
+In this article, you learn how to use Azure AI Studio to deploy the TimeGEN-1 model as a serverless API with pay-as-you-go billing.

 You filter on the Nixtla collection to browse the TimeGEN-1 model in the [Model Catalog](model-catalog.md).

 The Nixtla TimeGEN-1 is a generative, pretrained forecasting and anomaly detection model for time series data. TimeGEN-1 can produce accurate forecasts for new time series without training, using only historical values and exogenous covariates as inputs.
@@ -27,15 +27,15 @@ The Nixtla TimeGEN-1 is a generative, pretrained forecasting and anomaly detecti
 Certain models in the model catalog can be deployed as a serverless API with pay-as-you-go billing. This kind of deployment provides a way to consume models as an API without hosting them on your subscription, while keeping the enterprise security and compliance that organizations need. This deployment option doesn't require quota from your subscription.

-You can deploy TimeGEN-1 as a service with pay-as-you-go. Nixtla offers TimeGEN-1 through the Microsoft Azure Marketplace. Nixtla can change or update the terms of use and pricing of this model.
+You can deploy TimeGEN-1 as a serverless API with pay-as-you-go billing. Nixtla offers TimeGEN-1 through the Microsoft Azure Marketplace. Nixtla can change or update the terms of use and pricing of this model.

 ### Prerequisites

 - An Azure subscription with a valid payment method. Free or trial Azure subscriptions don't work. If you don't have an Azure subscription, create a [paid Azure account](https://azure.microsoft.com/pricing/purchase-options/pay-as-you-go) to begin.
 - An [AI Studio hub](../how-to/create-azure-ai-resource.md).

 > [!IMPORTANT]
-> The pay-as-you-go model deployment offering for TimeGEN1 is only available with hubs created in the **East US 2** or **Sweden Central** regions.
+> The serverless API model deployment offering for TimeGEN-1 is only available with hubs created in the **East US 2** or **Sweden Central** regions.

 - An [Azure AI Studio project](../how-to/create-projects.md).
 - Azure role-based access controls (Azure RBAC) are used to grant access to operations in Azure AI Studio. To perform the steps in this article, your user account must be assigned the __Azure AI Developer role__ on the resource group. For more information on permissions, visit [Role-based access control in Azure AI Studio](../concepts/rbac-ai-studio.md).
@@ -60,10 +60,10 @@ These steps demonstrate the deployment of TimeGEN-1. To create a deployment:
 1. Once you subscribe the project for the particular Azure Marketplace offering, subsequent deployments of the _same_ offering in the _same_ project don't require subscribing again. If this scenario applies to you, there's a **Continue to deploy** option to select.
 1. Give the deployment a name. This name becomes part of the deployment API URL. This URL must be unique in each Azure region.
 1. Select **Deploy**. Wait until the deployment is ready and you're redirected to the Deployments page.
-1. Return to the Deployments page, select the deployment, and note the endpoint's **Target** URL and the Secret **Key**. For more information on using the APIs, see the [reference](#reference-for-timegen-1-deployed-as-a-service) section.
+1. Return to the Deployments page, select the deployment, and note the endpoint's **Target** URL and the Secret **Key**. For more information on using the APIs, see the [reference](#reference-for-timegen-1-deployed-as-a-serverless-api) section.
 1. You can always find the endpoint's details, URL, and access keys by navigating to your **Project overview** page. Then, from the left sidebar of your project, select **Components** > **Deployments**.

-To learn about billing for the TimeGEN-1 model deployed as a serverless API with pay-as-you-go token-based billing, see [Cost and quota considerations for the TimeGEN-1 family of models deployed as a service](#cost-and-quota-considerations-for-timegen-1-deployed-as-a-service).
+To learn about billing for the TimeGEN-1 model deployed as a serverless API with pay-as-you-go token-based billing, see [Cost and quota considerations for the TimeGEN-1 family of models deployed as a service](#cost-and-quota-considerations-for-timegen-1-deployed-as-a-serverless-api).

 ### Consume the TimeGEN-1 model as a service
@@ -82,12 +82,12 @@ You can consume TimeGEN-1 models by using the forecast API.
 |Quick Start Forecast|The Nixtla TimeGEN1 is a generative, pretrained forecasting model for time series data. TimeGEN1 can produce accurate forecasts for new time series without training, using only historical values as inputs.|[Quick Start Forecast](https://aka.ms/quick-start-forecasting)|
 |Fine-tuning|Fine-tuning is a powerful process to utilize TimeGEN1 more effectively. Foundation models - for example, TimeGEN1 - are pretrained on vast amounts of data, to capture wide-ranging features and patterns. These models can then be specialized for specific contexts or domains. Fine-tuning refines the model parameters to forecast a new task, allowing it to tailor its vast pre-existing knowledge towards the requirements of the new data. In this way, fine-tuning serves as a crucial bridge, linking the broad TimeGEN1 capabilities to the specifics of your tasks. Concretely, the fine-tuning process involves performing some training iterations on your input data, to minimize the forecasting error. The forecasts are produced with the updated model. To control the number of iterations, use the finetune_steps argument of the forecast method.|[Fine-tuning](https://aka.ms/finetuning-TimeGEN1)|
 |Anomaly Detection|Anomaly detection in time series data is important across various industries - for example, finance and healthcare. It involves monitoring ordered data points to spot irregularities that might signal issues or threats. Organizations can then swiftly act to prevent, improve, or safeguard their operations.|[Anomaly Detection](https://aka.ms/anomaly-detection)|
-|Exogenous Variables|Exogenous variables are external factors that can influence forecasts. These variables take on one of a limited, fixed number of possible values, and induce a grouping of your observations. For example, if you’re forecasting daily product demand for a retailer, you could benefit from an event variable that may tell you what kind of event takes place on a given day, for example ‘None’, Sporting’, or ‘Cultural’. Or you might also include external factors such as weather.|[Exogenous Variables](https://aka.ms/exogenous-variables)|
+|Exogenous Variables|Exogenous variables are external factors that can influence forecasts. These variables take on one of a limited, fixed number of possible values, and induce a grouping of your observations. For example, if you're forecasting daily product demand for a retailer, you could benefit from an event variable that may tell you what kind of event takes place on a given day, for example 'None', 'Sporting', or 'Cultural'. Or you might also include external factors such as weather.|[Exogenous Variables](https://aka.ms/exogenous-variables)|
 |Demand Forecasting|Demand forecasting involves application of historical data and other analytical information, to build models that help predict future estimates of customer demand, for specific products, over a specific time period. It helps shape product road map, inventory production, and inventory allocation, among other things.|[Demand Forecasting](https://aka.ms/demand-forecasting-with-TimeGEN1)|

-For more information about use of the APIs, visit the [reference](#reference-for-timegen-1-deployed-as-a-service) section.
+For more information about use of the APIs, visit the [reference](#reference-for-timegen-1-deployed-as-a-serverless-api) section.

-### Reference for TimeGEN-1 deployed as a service
+### Reference for TimeGEN-1 deployed as a serverless API

 #### Forecast API
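Since the table above points at the forecast API and its finetune_steps argument, a minimal sketch of assembling a request body may help. Every field name here (`y`, `fh`, `freq`, `finetune_steps`) is a hypothetical choice modeled on Nixtla's client conventions rather than taken from this article; the authoritative schema is the Forecast API reference this diff links to.

```python
import json


def build_forecast_payload(series: dict, horizon: int, freq: str = "D",
                           finetune_steps: int = 0) -> dict:
    """Sketch of a forecast-request body for a TimeGEN-1 style endpoint.

    The field names are assumptions -- confirm them against the
    Forecast API reference section before relying on this.
    """
    if horizon < 1:
        raise ValueError("forecast horizon must be at least 1")
    return {
        "y": series,                       # historical values keyed by timestamp
        "fh": horizon,                     # number of future points to forecast
        "freq": freq,                      # pandas-style frequency alias, e.g. "D"
        "finetune_steps": finetune_steps,  # 0 = zero-shot; >0 runs fine-tuning iterations
    }


# POST this JSON body to the deployment's **Target** URL with the Secret **Key**
# in an Authorization header (transport omitted here for brevity).
example = build_forecast_payload({"2024-01-01": 10.5, "2024-01-02": 11.2}, horizon=7)
print(json.dumps(example))
```

Setting `finetune_steps` above zero corresponds to the fine-tuning workflow in the table: a few training iterations on your input data before the forecast is produced.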
@@ -229,9 +229,9 @@ This JSON sample is an example response:
 ## Cost and quotas

-### Cost and quota considerations for TimeGEN-1 deployed as a service
+### Cost and quota considerations for TimeGEN-1 deployed as a serverless API

-Nixtla offers TimeGEN-1 deployed as a service through the Azure Marketplace. TimeGEN-1 is integrated with Azure AI Studio for use. You can find more information about Azure Marketplace pricing when you deploy the model.
+Nixtla offers TimeGEN-1 deployed as a serverless API through the Azure Marketplace. TimeGEN-1 is integrated with Azure AI Studio for use. You can find more information about Azure Marketplace pricing when you deploy the model.

 Each time a project subscribes to a given offer from the Azure Marketplace, a new resource is created to track the costs associated with its consumption. The same resource is used to track costs associated with inference; however, multiple meters are available to track each scenario independently.
articles/machine-learning/how-to-deploy-models-cohere-command.md (3 additions & 2 deletions)
@@ -8,6 +8,7 @@ ms.subservice: inferencing
 ms.topic: how-to
 ms.date: 04/02/2024
 ms.reviewer: mopeakande
+reviewer: msakande
 ms.author: shubhiraj
 author: shubhirajMsft
 ms.custom: [references_regions]
@@ -101,7 +102,7 @@ To create a deployment:
 1. Once you subscribe the workspace for the particular Azure Marketplace offering, subsequent deployments of the _same_ offering in the _same_ workspace don't require subscribing again. If this scenario applies to you, there's a **Continue to deploy** option to select.

-   :::image type="content" source="media/how-to-deploy-models-cohere-command/command-r-existing-subscription.png" alt-text="A screenshot showing a project that is already subscribed to the offering." lightbox="media/how-to-deploy-models-cohere-command/command-r-existing-subscription.png":::
+   :::image type="content" source="media/how-to-deploy-models-cohere-command/command-r-existing-subscription.png" alt-text="A screenshot showing a workspace that is already subscribed to the offering." lightbox="media/how-to-deploy-models-cohere-command/command-r-existing-subscription.png":::

 1. Give the deployment a name. This name becomes part of the deployment API URL. This URL must be unique in each Azure region.
@@ -816,7 +817,7 @@ Each time a workspace subscribes to a given model offering from Azure Marketplac
 For more information on how to track costs, see [Monitor costs for models offered through the Azure Marketplace](../ai-studio/how-to/costs-plan-manage.md#monitor-costs-for-models-offered-through-the-azure-marketplace).

-Quota is managed per deployment. Each deployment has a rate limit of 200,000 tokens per minute and 1,000 API requests per minute. However, we currently limit one deployment per model per project. Contact Microsoft Azure Support if the current rate limits aren't sufficient for your scenarios.
+Quota is managed per deployment. Each deployment has a rate limit of 200,000 tokens per minute and 1,000 API requests per minute. However, we currently limit one deployment per model per workspace. Contact Microsoft Azure Support if the current rate limits aren't sufficient for your scenarios.
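Given the 200,000 tokens-per-minute and 1,000 requests-per-minute limits described in this hunk, callers commonly wrap requests in jittered exponential backoff. A minimal sketch, assuming the transport surfaces HTTP 429 as an exception (`RateLimitError` is a hypothetical name for this sketch, not an Azure SDK type):

```python
import random
import time


class RateLimitError(Exception):
    """Raised by the caller's transport when the service returns HTTP 429."""


def with_backoff(call, max_retries: int = 5, base_delay: float = 0.5, sleep=time.sleep):
    """Retry `call` with jittered exponential backoff on rate limiting.

    Delays grow as base_delay * 2**attempt, scaled by a random jitter
    factor so concurrent clients don't retry in lockstep.
    """
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # give up after the final attempt
            sleep(base_delay * (2 ** attempt) * (1 + random.random()))
```

Usage: `with_backoff(lambda: send_request(payload))`, where `send_request` is whatever function issues the HTTP call and raises `RateLimitError` on a 429 response.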
articles/machine-learning/how-to-deploy-models-cohere-embed.md (3 additions & 2 deletions)
@@ -8,6 +8,7 @@ ms.subservice: inferencing
 ms.topic: how-to
 ms.date: 04/02/2024
 ms.reviewer: mopeakande
+reviewer: msakande
 ms.author: shubhiraj
 author: shubhirajMsft
 ms.custom: [references_regions]
@@ -81,7 +82,7 @@ To create a deployment:
 1. Once you subscribe the workspace for the particular Azure Marketplace offering, subsequent deployments of the _same_ offering in the _same_ workspace don't require subscribing again. If this scenario applies to you, there's a **Continue to deploy** option to select.

-   :::image type="content" source="media/how-to-deploy-models-cohere-embed/embed-english-existing-deployment.png" alt-text="A screenshot showing a project that is already subscribed to the offering." lightbox="media/how-to-deploy-models-cohere-embed/embed-english-existing-deployment.png":::
+   :::image type="content" source="media/how-to-deploy-models-cohere-embed/embed-english-existing-deployment.png" alt-text="A screenshot showing a workspace that is already subscribed to the offering." lightbox="media/how-to-deploy-models-cohere-embed/embed-english-existing-deployment.png":::

 1. Give the deployment a name. This name becomes part of the deployment API URL. This URL must be unique in each Azure region.
@@ -350,7 +351,7 @@ Each time a workspace subscribes to a given model offering from Azure Marketplac
 For more information on how to track costs, see [Monitor costs for models offered through the Azure Marketplace](../ai-studio/how-to/costs-plan-manage.md#monitor-costs-for-models-offered-through-the-azure-marketplace).

-Quota is managed per deployment. Each deployment has a rate limit of 200,000 tokens per minute and 1,000 API requests per minute. However, we currently limit one deployment per model per project. Contact Microsoft Azure Support if the current rate limits aren't sufficient for your scenarios.
+Quota is managed per deployment. Each deployment has a rate limit of 200,000 tokens per minute and 1,000 API requests per minute. However, we currently limit one deployment per model per workspace. Contact Microsoft Azure Support if the current rate limits aren't sufficient for your scenarios.
articles/machine-learning/how-to-deploy-models-mistral.md (6 additions & 6 deletions)
@@ -7,10 +7,10 @@ ms.service: machine-learning
 ms.subservice: inferencing
 ms.topic: how-to
 ms.date: 04/29/2024
-mms.author: kritifaujdar
-.author: fkriti
-ms.author: mopeakande
-author: msakande
+ms.author: kritifaujdar
+author: fkriti
+ms.reviewer: mopeakande
+reviewer: msakande
 ms.custom: [references_regions]

 #This functionality is also available in Azure AI Studio: /azure/ai-studio/how-to/deploy-models-mistral.md
@@ -95,7 +95,7 @@ To create a deployment:
 1. Once you subscribe the workspace for the particular Azure Marketplace offering, subsequent deployments of the _same_ offering in the _same_ workspace don't require subscribing again. If this scenario applies to you, you'll see a **Continue to deploy** option to select.

-   :::image type="content" source="media/how-to-deploy-models-mistral/mistral-deploy-pay-as-you-go-project.png" alt-text="A screenshot showing a project that is already subscribed to the offering." lightbox="media/how-to-deploy-models-mistral/mistral-deploy-pay-as-you-go-project.png":::
+   :::image type="content" source="media/how-to-deploy-models-mistral/mistral-deploy-pay-as-you-go-project.png" alt-text="A screenshot showing a workspace that is already subscribed to the offering." lightbox="media/how-to-deploy-models-mistral/mistral-deploy-pay-as-you-go-project.png":::

 1. Give the deployment a name. This name becomes part of the deployment API URL. This URL must be unique in each Azure region.
@@ -271,7 +271,7 @@ Each time a workspace subscribes to a given model offering from Azure Marketplac
 For more information on how to track costs, see [Monitor costs for models offered through the Azure Marketplace](../ai-studio/how-to/costs-plan-manage.md#monitor-costs-for-models-offered-through-the-azure-marketplace).

-Quota is managed per deployment. Each deployment has a rate limit of 200,000 tokens per minute and 1,000 API requests per minute. However, we currently limit one deployment per model per project. Contact Microsoft Azure Support if the current rate limits aren't sufficient for your scenarios.
+Quota is managed per deployment. Each deployment has a rate limit of 200,000 tokens per minute and 1,000 API requests per minute. However, we currently limit one deployment per model per workspace. Contact Microsoft Azure Support if the current rate limits aren't sufficient for your scenarios.