articles/ai-foundry/concepts/deployments-overview.md
Lines changed: 10 additions & 10 deletions
Original file line number
Diff line number
Diff line change
@@ -5,15 +5,15 @@ description: Learn about deploying models in Azure AI Foundry portal.
5
5
manager: scottpolly
6
6
ms.service: azure-ai-foundry
7
7
ms.topic: concept-article
8
-
ms.date: 3/20/2024
8
+
ms.date: 03/24/2025
9
9
ms.reviewer: fasantia
10
10
ms.author: mopeakande
11
11
author: msakande
12
12
---
13
13
14
14
# Overview: Deploy AI models in Azure AI Foundry portal
15
15
16
-
The model catalog in Azure AI Foundry portal is the hub to discover and use a wide range of models for building generative AI applications. Models need to be deployed to make them available for receiving inference requests. Azure AI Foundry offers a comprehensive suite of deployment options for those models depending on your needs and model requirements.
16
+
The model catalog in Azure AI Foundry portal is the hub to discover and use a wide range of models for building generative AI applications. Models need to be deployed to make them available for receiving inference requests. Azure AI Foundry offers a comprehensive suite of deployment options for models, depending on your needs and model requirements.
17
17
18
18
## Deploying models
19
19
@@ -35,8 +35,8 @@ Azure AI Foundry offers four different deployment options:
35
35
| Content filtering | Yes | Yes | Yes | No |
36
36
| Custom content filtering | Yes | Yes | No | No |
37
37
| Key-less authentication | Yes | Yes | No | No |
38
-
| Best suited when | You are planning to use only OpenAI models | You are planning to take advantage of the flagship models in Azure AI catalog, including OpenAI. | You are planning to use a single model from a specific provider (excluding OpenAI). | If you plan to use open models and you have enough compute quota available in your subscription. |
| Best suited when | You're planning to use only OpenAI models. | You're planning to take advantage of the flagship models in Azure AI catalog, including OpenAI. | You're planning to use a single model from a specific provider (excluding OpenAI). | You're planning to use open models and you have enough compute quota available in your subscription. |
| Deployment instructions |[Deploy to Azure OpenAI Service](../how-to/deploy-models-openai.md)|[Deploy to Azure AI model inference](../model-inference/how-to/create-model-deployments.md)|[Deploy to Serverless API](../how-to/deploy-models-serverless.md)|[Deploy to Managed compute](../how-to/deploy-models-managed.md)|
41
41
42
42
<sup>1</sup> A minimal endpoint infrastructure is billed per minute. You aren't billed for the infrastructure that hosts the model in pay-as-you-go. After you delete the endpoint, no further charges accrue.
@@ -48,17 +48,17 @@ Azure AI Foundry offers four different deployment options:
48
48
49
49
### How should I think about deployment options?
50
50
51
-
Azure AI Foundry encourages customers to explore the deployment options and pick the one that best suites their business and technical needs. In general you can use the following thinking process:
51
+
Azure AI Foundry encourages you to explore various deployment options and choose the one that best suits your business and technical needs. In general, consider using the following approach to select a deployment option:
52
52
53
-
* Start with [Azure AI model inference](../../ai-foundry/model-inference/overview.md) which is the option with the bigger scope. This allows you to iterate and prototype faster in your application without having to rebuild your architecture each time you decide to change something. If you are using Azure AI Foundry Hubs or Projects, enable it by [turning on Azure AI model inference](../../ai-foundry/model-inference/how-to/quickstart-ai-project.md).
53
+
* Start with [Azure AI model inference](../../ai-foundry/model-inference/overview.md), which is the option with the largest scope. This option allows you to iterate and prototype faster in your application without having to rebuild your architecture each time you decide to change something. If you're using Azure AI Foundry hubs or projects, enable this option by [turning on the Azure AI model inference feature](../model-inference/how-to/quickstart-ai-project.md#configure-the-project-to-use-azure-ai-model-inference).
54
54
55
-
* When you are looking to use a specific model:
55
+
* When you're looking to use a specific model:
56
56
57
-
*When you are interested in Azure OpenAI models, use the Azure OpenAI Service which offers a wide range of capabilities for them and it's designed for them.
57
+
    * If you're interested in Azure OpenAI models, use the Azure OpenAI Service. This option is designed for Azure OpenAI models and offers a wide range of capabilities for them.
58
58
59
-
*When you are interested in a particular model from Models-as-a-Service, and you don't expect to use any other type of model, use [Serverless API endpoints](../how-to/deploy-models-serverless.md). They allow deployment of a single model under a unique set of endpoint URL and keys.
59
+
    * If you're interested in a particular model from Models-as-a-Service, and you don't expect to use any other type of model, use [Serverless API endpoints](../how-to/deploy-models-serverless.md). Serverless endpoints allow deployment of a single model under its own endpoint URL and set of keys.
60
60
61
-
* When your model is not available in Models-as-a-Service and you have compute quota available in your subscription, use [Managed Compute](../how-to/deploy-models-managed.md) which support deployment of open and custom models. It also allows high level of customization of the deployment inference server, protocols, and detailed configuration.
61
+
* When your model isn't available in Models-as-a-Service and you have compute quota available in your subscription, use [Managed Compute](../how-to/deploy-models-managed.md), which supports deployment of open and custom models. It also allows a high level of customization of the deployment inference server, protocols, and detailed configuration.
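
The selection approach above can be sketched as a small decision helper. This is only an illustration of the guidance: the function name, its parameters, and the returned labels are hypothetical and aren't part of any Azure SDK. The final fallback (requesting more quota) is an assumption based on the quota guidance in the managed compute article.

```python
def choose_deployment_option(
    wants_specific_model: bool = False,
    is_azure_openai_model: bool = False,
    in_models_as_a_service: bool = False,
    single_model_only: bool = False,
    has_compute_quota: bool = False,
) -> str:
    """Suggest a deployment option by following the guidance in this section.

    All parameter names and return labels are illustrative assumptions.
    """
    if not wants_specific_model:
        # Default starting point: the option with the largest scope.
        return "Azure AI model inference"
    if is_azure_openai_model:
        return "Azure OpenAI Service"
    if in_models_as_a_service:
        # Serverless endpoints host a single model each, so they fit best
        # when no other type of model is expected.
        if single_model_only:
            return "Serverless API endpoint"
        return "Azure AI model inference"
    if has_compute_quota:
        return "Managed compute"
    # Assumption: without quota, managed compute first needs a quota increase.
    return "Managed compute (after requesting a quota increase)"


print(choose_deployment_option())
print(choose_deployment_option(wants_specific_model=True, is_azure_openai_model=True))
```

Treat the result as a starting point for discussion rather than a rule; the comparison table remains the authoritative summary of what each option supports.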
@@ -39,121 +39,119 @@ You can deploy managed compute models using the Azure Machine Learning SDK, but
39
39
40
40
## Deploy the model
41
41
42
-
Let's deploy the model.
43
-
44
-
First, you need to install the Azure Machine Learning SDK.
45
-
46
-
```python
47
-
pip install azure-ai-ml
48
-
pip install azure-identity
49
-
```
50
-
51
-
Use this code to authenticate with Azure Machine Learning and create a client object. Replace the placeholders with your subscription ID, resource group name, and Azure AI Foundry project name.
52
-
53
-
```python
54
-
from azure.ai.ml import MLClient
55
-
from azure.identity import InteractiveBrowserCredential
56
-
57
-
client = MLClient(
58
-
credential=InteractiveBrowserCredential,
59
-
subscription_id="your subscription name goes here",
60
-
resource_group_name="your resource group name goes here",
61
-
workspace_name="your project name goes here",
62
-
)
63
-
```
64
-
65
-
For the managed compute deployment option, you need to create an endpoint before a model deployment. Think of an endpoint as a container that can house multiple model deployments. The endpoint names need to be unique in a region, so in this example we're using the timestamp to create a unique endpoint name.
66
-
67
-
```python
68
-
import time, sys
69
-
from azure.ai.ml.entities import (
70
-
ManagedOnlineEndpoint,
71
-
ManagedOnlineDeployment,
72
-
ProbeSettings,
73
-
)
74
-
75
-
# Make the endpoint name unique
76
-
timestamp =int(time.time())
77
-
online_endpoint_name ="customize your endpoint name here"+str(timestamp)
1. Authenticate with Azure Machine Learning and create a client object. Replace the placeholders with your subscription ID, resource group name, and Azure AI Foundry project name.
50
+
51
+
```python
52
+
from azure.ai.ml import MLClient
53
+
from azure.identity import InteractiveBrowserCredential
54
+
55
+
workspace_ml_client = MLClient(
56
+
credential=InteractiveBrowserCredential(),  # pass a credential instance, not the class
57
+
subscription_id="your subscription ID goes here",
58
+
resource_group_name="your resource group name goes here",
59
+
workspace_name="your project name goes here",
60
+
)
61
+
```
62
+
63
+
1. Create an endpoint. For the managed compute deployment option, you need to create an endpoint before you create a model deployment. Think of an endpoint as a container that can house multiple model deployments. Endpoint names must be unique within a region, so this example appends a timestamp to create a unique endpoint name.
64
+
65
+
```python
66
+
import time, sys
67
+
from azure.ai.ml.entities import (
68
+
ManagedOnlineEndpoint,
69
+
ManagedOnlineDeployment,
70
+
ProbeSettings,
71
+
)
72
+
73
+
# Make the endpoint name unique
74
+
timestamp = int(time.time())
75
+
online_endpoint_name = "customize your endpoint name here" + str(timestamp)
1. Create a deployment. Replace the model ID in the following code with the model ID that you copied from the details page of the model you selected in the [Get the model ID](#get-the-model-id) section.
You need a sample json data to test inferencing. Create `sample_score.json` with the following example.
121
-
122
-
```python
123
-
{
124
-
"inputs": {
125
-
"question": [
126
-
"Where do I live?",
127
-
"Where do I live?",
128
-
"What's my name?",
129
-
"Which name is also used to describe the Amazon rainforest in English?"
130
-
],
131
-
"context": [
132
-
"My name is Wolfgang and I live in Berlin",
133
-
"My name is Sarah and I live in London",
134
-
"My name is Clara and I live in Berkeley.",
135
-
"The Amazon rainforest (Portuguese: Floresta Amaz\u00f4nica or Amaz\u00f4nia; Spanish: Selva Amaz\u00f3nica, Amazon\u00eda or usually Amazonia; French: For\u00eat amazonienne; Dutch: Amazoneregenwoud), also known in English as Amazonia or the Amazon Jungle, is a moist broadleaf forest that covers most of the Amazon basin of South America. This basin encompasses 7,000,000 square kilometres (2,700,000 sq mi), of which 5,500,000 square kilometres (2,100,000 sq mi) are covered by the rainforest. This region includes territory belonging to nine nations. The majority of the forest is contained within Brazil, with 60% of the rainforest, followed by Peru with 13%, Colombia with 10%, and with minor amounts in Venezuela, Ecuador, Bolivia, Guyana, Suriname and French Guiana. States or departments in four nations contain \"Amazonas\" in their names. The Amazon represents over half of the planet's remaining rainforests, and comprises the largest and most biodiverse tract of tropical rainforest in the world, with an estimated 390 billion individual trees divided into 16,000 species."
136
-
]
137
-
}
138
-
}
139
-
```
140
-
141
-
Let's inference with `sample_score.json`. Change the location based on where you saved your sample json file.
1. You need sample JSON data to test inferencing. Create `sample_score.json` with the following example.
119
+
120
+
```json
121
+
{
122
+
"inputs": {
123
+
"question": [
124
+
"Where do I live?",
125
+
"Where do I live?",
126
+
"What's my name?",
127
+
"Which name is also used to describe the Amazon rainforest in English?"
128
+
],
129
+
"context": [
130
+
"My name is Wolfgang and I live in Berlin",
131
+
"My name is Sarah and I live in London",
132
+
"My name is Clara and I live in Berkeley.",
133
+
"The Amazon rainforest (Portuguese: Floresta Amaz\u00f4nica or Amaz\u00f4nia; Spanish: Selva Amaz\u00f3nica, Amazon\u00eda or usually Amazonia; French: For\u00eat amazonienne; Dutch: Amazoneregenwoud), also known in English as Amazonia or the Amazon Jungle, is a moist broadleaf forest that covers most of the Amazon basin of South America. This basin encompasses 7,000,000 square kilometres (2,700,000 sq mi), of which 5,500,000 square kilometres (2,100,000 sq mi) are covered by the rainforest. This region includes territory belonging to nine nations. The majority of the forest is contained within Brazil, with 60% of the rainforest, followed by Peru with 13%, Colombia with 10%, and with minor amounts in Venezuela, Ecuador, Bolivia, Guyana, Suriname and French Guiana. States or departments in four nations contain \"Amazonas\" in their names. The Amazon represents over half of the planet's remaining rainforests, and comprises the largest and most biodiverse tract of tropical rainforest in the world, with an estimated 390 billion individual trees divided into 16,000 species."
134
+
]
135
+
}
136
+
}
137
+
```
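
Before sending the payload, a quick local check can catch malformed JSON or mismatched lists. The following is a minimal sketch, assuming the model expects one context per question, paired by position, as the sample above suggests. The `validate_sample` helper is hypothetical and isn't part of the Azure Machine Learning SDK, and the inline text is a shortened stand-in for the contents of `sample_score.json`.

```python
import json


def validate_sample(payload: dict) -> int:
    """Check the payload shape used in the sample and return the pair count.

    Assumption: questions and contexts are paired by position.
    """
    inputs = payload["inputs"]
    questions = inputs["question"]
    contexts = inputs["context"]
    if len(questions) != len(contexts):
        raise ValueError(
            f"{len(questions)} questions but {len(contexts)} contexts"
        )
    return len(questions)


# A shortened stand-in for the contents of sample_score.json.
sample_text = """
{
  "inputs": {
    "question": ["Where do I live?", "What's my name?"],
    "context": [
      "My name is Wolfgang and I live in Berlin",
      "My name is Clara and I live in Berkeley."
    ]
  }
}
"""

pair_count = validate_sample(json.loads(sample_text))
print(f"{pair_count} question/context pairs")
```

To check the real file instead, load it with `json.load` from the path where you saved `sample_score.json` and pass the result to the same helper.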
138
+
139
+
1. Run inference with `sample_score.json`. Change the location of the scoring file in the next code, based on where you saved your sample JSON file.
To configure autoscaling for deployments, you can go to Azure Portal, locate the Azure resource typed `Machine learning online deployment` in the resource group of the AI project, and use Scaling menu under Setting. For more information on autoscaling, see [Autoscale online endpoints](/azure/machine-learning/how-to-autoscale-endpoints) in the Azure Machine Learning documentation.
154
+
To configure autoscaling for deployments, go to the Azure portal, locate the Azure resource of type `Machine learning online deployment` in the resource group of the AI project, and use the Scaling menu under Settings. For more information on autoscaling, see [Autoscale online endpoints](/azure/machine-learning/how-to-autoscale-endpoints) in the Azure Machine Learning documentation.
157
155
158
156
## Delete the deployment endpoint
159
157
@@ -163,7 +161,7 @@ To delete deployments in Azure AI Foundry portal, select the **Delete** button o
163
161
164
162
To deploy and perform inferencing with real-time endpoints, you consume Virtual Machine (VM) core quota that is assigned to your subscription on a per-region basis. When you sign up for Azure AI Foundry, you receive a default VM quota for several VM families available in the region. You can continue to create deployments until you reach your quota limit. After you reach the limit, you can request a quota increase.
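
As a rough illustration of how deployments draw down core quota, the following sketch multiplies cores per instance by instance count and compares the result with a regional quota. The function names and all numbers are hypothetical; real core counts depend on the VM size you choose, and this simple arithmetic ignores any additional capacity the service might reserve.

```python
def cores_needed(cores_per_instance: int, instance_count: int) -> int:
    """Total VM cores a deployment would consume (simplified)."""
    return cores_per_instance * instance_count


def fits_quota(quota_cores: int, cores_per_instance: int, instance_count: int) -> bool:
    """Return True when the deployment fits within the remaining quota."""
    return cores_needed(cores_per_instance, instance_count) <= quota_cores


# Hypothetical example: a 24-core quota and a VM size with 6 cores.
quota = 24
needed = cores_needed(cores_per_instance=6, instance_count=2)
print(f"Needs {needed} of {quota} cores; fits: {fits_quota(quota, 6, 2)}")
```

If the check fails for your planned instance count, either reduce the count, choose a smaller VM size, or request a quota increase for that VM family.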
165
163
166
-
## Next steps
164
+
## Related content
167
165
168
166
- Learn more about what you can do in [Azure AI Foundry](../what-is-ai-foundry.md)
169
167
- Get answers to frequently asked questions in the [Azure AI FAQ article](../faq.yml)