- [Storage mounts](cold-start.md#manage-large-downloads): Reduce the effects of network latency by storing large files in an Azure storage account associated with your container app.

<a name="deploy-foundry-models"></a>

## Deploy Foundry models to serverless GPUs (preview)

Azure Container Apps serverless GPUs now support Azure AI Foundry models in public preview. Azure AI Foundry models have two deployment options:

- [**Serverless APIs**](/azure/ai-foundry/how-to/deploy-models-serverless?tabs=azure-ai-studio), which provide pay-as-you-go billing for some of the most popular models.

- [**Managed compute**](/azure/ai-foundry/how-to/create-manage-compute), which lets you deploy the full selection of Foundry models with pay-per-GPU pricing.

Azure Container Apps serverless GPUs offer a middle ground between serverless APIs and managed compute for deploying Foundry models. Deployments are on demand, scale to zero when not in use, and comply with your data residency needs. With serverless GPUs, you can run any supported Foundry model with automatic scaling, per-second pricing, full data governance, and out-of-the-box support for enterprise networking and security.

Language models of type `MLFLOW` are supported. To see a list of `MLFLOW` models, browse the models available in the [azureml registry](https://aka.ms/azureml-registry) and add a filter for `MLFLOW` models using the following steps:

1. Select **Filter**.

1. Select **Add Filter**.

1. For the filter rule, enter **Type = MLFLOW**.

The following CLI command shows how to deploy a Foundry model to serverless GPUs:

```azurecli
az containerapp up \
  --name <CONTAINER_APP_NAME> \
  --environment <ENVIRONMENT_NAME> \
  --resource-group <RESOURCE_GROUP_NAME> \
  --model-registry <MODEL_REGISTRY_NAME> \
  --model-name <MODEL_NAME> \
  --model-version <MODEL_VERSION>
```
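
Once the deployment completes, the model is reachable through your container app's ingress. The following Python sketch is illustrative only: the FQDN is a placeholder, and the `/score` route and payload shape are assumptions borrowed from common Azure Machine Learning scoring conventions, so adjust them to match your app and its scoring script.

```python
# A hypothetical client call; <CONTAINER_APP_FQDN> is a placeholder you can
# look up with:
#   az containerapp show --name <CONTAINER_APP_NAME> \
#     --resource-group <RESOURCE_GROUP_NAME> \
#     --query properties.configuration.ingress.fqdn --output tsv
import requests

url = "https://<CONTAINER_APP_FQDN>/score"

# The expected payload shape depends on the model and the scoring script.
payload = {"input_data": [[1.0, 2.0, 3.0]]}

response = requests.post(url, json=payload, timeout=60)
response.raise_for_status()
print(response.json())
```
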
When you deploy a Foundry model to Azure Container Apps serverless GPUs as an online endpoint, a scoring script is required. The scoring script (named *score.py*) defines how you interact with the model. By default, the example CLI command provides a scoring script, but you can also supply your own *score.py* file. For details, see [how to use a custom score.py file](/azure/machine-learning/how-to-deploy-online-endpoints?view=azureml-api-2&tabs=cli). A minimal sketch of what such a script can look like follows.
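
The sketch below illustrates the general shape of a scoring script for an MLflow model. It is not the default script the CLI generates; the `AZUREML_MODEL_DIR` environment variable and the request format are assumptions based on Azure Machine Learning's scoring conventions, so verify them against your deployment.

```python
# score.py -- a minimal, illustrative scoring script for an MLflow model.
import json
import os

import mlflow.pyfunc

model = None


def init():
    # Runs once when the container starts. Assumption: AZUREML_MODEL_DIR
    # points at the downloaded model files, per Azure ML convention.
    global model
    model = mlflow.pyfunc.load_model(os.environ["AZUREML_MODEL_DIR"])


def run(raw_data):
    # Runs once per request: parse the JSON payload and return predictions.
    data = json.loads(raw_data)
    result = model.predict(data["input_data"])
    # .tolist() assumes a NumPy-like result; adjust the conversion for
    # other output types (for example, a pandas DataFrame).
    return {"predictions": result.tolist()}
```
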
## Submit feedback

Submit issues to the [Azure Container Apps GitHub repo](https://github.com/microsoft/azure-container-apps).