articles/ai-studio/how-to/model-catalog-overview.md
The deployment options and features available for each model vary, as described in the following tables.
Features | Managed compute | Serverless API (pay-as-you-go)
--|--|--
Deployment experience and billing | Model weights are deployed to dedicated virtual machines with managed compute. A managed compute, which can have one or more deployments, exposes a REST API for inference. You're billed for the virtual machine core hours that the deployments use. | Access to models is through a deployment that provisions an API to access the model. The API provides access to the model that Microsoft hosts and manages, for inference. You're billed for the inputs and outputs to the APIs, typically in tokens. Pricing information is provided before you deploy.
API authentication | Keys and Microsoft Entra authentication. | Keys only.
Content safety | Use Azure AI Content Safety service APIs. | Azure AI Content Safety filters are available integrated with inference APIs. Azure AI Content Safety filters are billed separately.
Network isolation | [Configure managed networks for Azure AI Studio hubs](configure-managed-network.md). | Serverless API endpoints follow your hub's public network access (PNA) flag setting. For more information, see the [Network isolation for models deployed via serverless APIs](#network-isolation-for-models-deployed-via-serverless-apis) section later in this article.
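To make the billing difference in the table concrete, here's a minimal sketch. All rates and usage numbers below are invented placeholders, not Azure prices; only the shape of the two cost models matters:

```python
# Illustrative sketch of the two billing models described above.
# All rates and usage figures are made-up placeholders, not Azure pricing.

def managed_compute_cost(core_hours: float, rate_per_core_hour: float) -> float:
    """Managed compute: you pay for VM core hours while the deployment
    exists, regardless of how many inference requests it serves."""
    return core_hours * rate_per_core_hour

def serverless_api_cost(input_tokens: int, output_tokens: int,
                        rate_in_per_1k: float, rate_out_per_1k: float) -> float:
    """Serverless API (pay-as-you-go): you pay per input and output token."""
    return (input_tokens / 1000) * rate_in_per_1k + (output_tokens / 1000) * rate_out_per_1k

# A deployment that sits idle most of the day still accrues VM cost:
vm_cost = managed_compute_cost(core_hours=24 * 4, rate_per_core_hour=0.10)  # 4 cores, 24 h
maas_cost = serverless_api_cost(input_tokens=50_000, output_tokens=20_000,
                                rate_in_per_1k=0.001, rate_out_per_1k=0.002)
print(f"managed compute: ${vm_cost:.2f}, serverless: ${maas_cost:.2f}")
# → managed compute: $9.60, serverless: $0.09
```

With placeholder numbers like these, lightly used models tend to be cheaper as serverless APIs, while sustained high-volume traffic can favor dedicated managed compute.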

Model | Managed compute | Serverless API (pay-as-you-go)
--|--|--
Other models | Available | Not available
<!-- docutune:enable -->
:::image type="content" source="../media/explore/platform-service-cycle.png" alt-text="Diagram that shows models as a service and the service cycle of managed computes." lightbox="../media/explore/platform-service-cycle.png":::
## Managed compute
The registries build on top of a highly scalable and enterprise-ready infrastructure.
### Deployment of models for inference with managed compute
Models available for deployment to managed compute are deployed to Azure Machine Learning managed compute for real-time inference. Deploying to managed compute requires a virtual machine quota in your Azure subscription for the specific VM products that you need to run the model optimally. Some models let you deploy to a [temporarily shared quota for model testing](deploy-models-open.md).
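After deployment, the managed compute's REST API is invoked like any HTTPS endpoint. The sketch below only builds a scoring request with the Python standard library and never sends it; the URL, credential, and payload schema are placeholders, since each deployment defines its own input format (check the deployment's **Consume** page for the real values):

```python
import json
import urllib.request

# Hypothetical scoring request for a managed compute deployment.
# The URL, credential, and payload schema are placeholders.
SCORING_URL = "https://my-endpoint.eastus2.inference.ml.azure.com/score"
CREDENTIAL = "<endpoint-key-or-entra-token>"

payload = {"input_data": {"input_string": ["What is the model catalog?"]}}

request = urllib.request.Request(
    SCORING_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        # Managed compute accepts keys and Microsoft Entra tokens;
        # serverless APIs accept keys only.
        "Authorization": f"Bearer {CREDENTIAL}",
    },
    method="POST",
)

# urllib.request.urlopen(request) would send it; omitted here so the
# sketch stays self-contained.
print(request.method, request.full_url)
```

Because managed compute supports both authentication modes, the `Authorization` header here could carry either a key-derived token or a Microsoft Entra access token, depending on how the endpoint is configured.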
Learn more about deploying models:
Pay-as-you-go billing is available only to users whose Azure subscription belongs …
### Network isolation for models deployed via serverless APIs
Endpoints for models deployed as serverless APIs follow the public network access flag setting of the AI Studio hub that contains the project in which the deployment exists. To help secure your MaaS endpoint, disable the public network access flag on your AI Studio hub. To help secure inbound communication from a client to your endpoint, use a private endpoint for the hub.
To set the public network access flag for the AI Studio hub:
* Go to the [Azure portal](https://ms.portal.azure.com/).
* Search for the resource group to which the hub belongs, and select your AI Studio hub from the resources listed for this resource group.
* On the hub overview page, on the left pane, go to **Settings** > **Networking**.
* On the **Public access** tab, you can configure settings for the public network access flag.
* Save your changes. Your changes might take up to five minutes to propagate.
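The portal steps above can also be scripted. One option is to PATCH the hub resource through the Azure Resource Manager REST API. The sketch below only builds the request; the subscription ID, names, and `api-version` are placeholder assumptions, and a real call needs a valid Microsoft Entra access token:

```python
import json
import urllib.request

# Placeholder identifiers; substitute your own values.
SUBSCRIPTION = "00000000-0000-0000-0000-000000000000"
RESOURCE_GROUP = "my-resource-group"
HUB_NAME = "my-ai-studio-hub"
API_VERSION = "2024-04-01"  # assumed; check the current ARM api-version

# AI Studio hubs are Azure Machine Learning workspace resources under ARM.
url = (
    f"https://management.azure.com/subscriptions/{SUBSCRIPTION}"
    f"/resourceGroups/{RESOURCE_GROUP}"
    f"/providers/Microsoft.MachineLearningServices/workspaces/{HUB_NAME}"
    f"?api-version={API_VERSION}"
)

body = {"properties": {"publicNetworkAccess": "Disabled"}}

request = urllib.request.Request(
    url,
    data=json.dumps(body).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer <entra-access-token>",  # placeholder
    },
    method="PATCH",
)
# urllib.request.urlopen(request) would send it; omitted here.
print(request.method, request.full_url)
```

If you prefer a command-line route, the Azure CLI's `az ml workspace update` command exposes the same setting through its `--public-network-access` parameter.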
#### Limitations
* If you have an AI Studio hub with a private endpoint created before July 11, 2024, new MaaS endpoints added to projects in this hub won't follow the networking configuration of the hub. Instead, you need to create a new private endpoint for the hub and create new serverless API deployments in the project so that the new deployments can follow the hub's networking configuration.
* If you have an AI Studio hub with MaaS deployments created before July 11, 2024, and you enable a private endpoint on this hub, the existing MaaS deployments won't follow the hub's networking configuration. For serverless API deployments in the hub to follow the hub's networking configuration, you need to create the deployments again.
* Currently, [Azure OpenAI On Your Data](/azure/ai-services/openai/concepts/use-your-data) support isn't available for MaaS deployments in private hubs, because private hubs have the public network access flag disabled.
* Any network configuration change (for example, enabling or disabling the public network access flag) might take up to five minutes to propagate.