
Commit 59ec4d5

Merge pull request #299720 from craigshoemaker/aca/foundry-models
[Container Apps] Update: GPU overview -> add Foundry model overview
2 parents c90d34a + ac5ed06

File tree

1 file changed (+34, −0)


articles/container-apps/gpu-serverless-overview.md

Lines changed: 34 additions & 0 deletions
@@ -99,6 +99,40 @@ You can significantly improve cold start times by enabling artifact streaming an
- [Storage mounts](cold-start.md#manage-large-downloads): Reduce the effects of network latency by storing large files in an Azure storage account associated with your container app.

<a name="deploy-foundry-models"></a>

## Deploy Foundry models to serverless GPUs (preview)

Azure Container Apps serverless GPUs now support Azure AI Foundry models in public preview. Azure AI Foundry models have two deployment options:

- [**Serverless APIs**](/azure/ai-foundry/how-to/deploy-models-serverless?tabs=azure-ai-studio), which provide pay-as-you-go billing for some of the most popular models.

- [**Managed compute**](/azure/ai-foundry/how-to/create-manage-compute), which allows you to deploy the full selection of Foundry models with pay-per-GPU pricing.

Azure Container Apps serverless GPUs offer a balanced deployment option between serverless APIs and managed compute for deploying Foundry models. The option is on-demand, with serverless scaling that scales in to zero when not in use, and complies with your data residency needs. With serverless GPUs, Foundry models give you the flexibility to run any supported model with automatic scaling, pay-per-second pricing, full data governance, and out-of-the-box enterprise networking and security support.

Language models of the type `MLFLOW` are supported. To see a list of `MLFLOW` models, go to the list of models available in the [azureml registry](https://aka.ms/azureml-registry). To locate the models, add a filter for `MLFLOW` models using the following steps (a CLI alternative appears after the steps):

1. Select **Filter**.

1. Select **Add Filter**.

1. For the filter rule, enter **Type = MLFLOW**.
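
Alternatively, you can query the registry from the command line. This is a minimal sketch, assuming the Azure Machine Learning CLI extension (`az ml`) is installed; filtering on a `type` field with `--query` is an assumption about the shape of the list output, so adjust the JMESPath expression if your results differ:

```azurecli
# Add the Azure ML CLI extension if you don't already have it.
az extension add --name ml

# List models in the azureml registry, keeping only MLflow models.
# The 'type' filter is an assumption about the list output shape.
az ml model list \
  --registry-name azureml \
  --query "[?type=='mlflow_model'].{name:name, version:version}" \
  --output table
```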

The following CLI command shows how to deploy a Foundry model to serverless GPUs:

```azurecli
az containerapp up \
  --name <CONTAINER_APP_NAME> \
  --environment <ENVIRONMENT_NAME> \
  --resource-group <RESOURCE_GROUP_NAME> \
  --model-registry <MODEL_REGISTRY_NAME> \
  --model-name <MODEL_NAME> \
  --model-version <MODEL_VERSION>
```
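
For example, a filled-in deployment might look like the following. All resource names here are hypothetical, and the model name and version are illustrative; substitute a real `MLFLOW` model from the azureml registry:

```azurecli
# Hypothetical values for illustration only.
az containerapp up \
  --name my-foundry-app \
  --environment my-gpu-environment \
  --resource-group my-resource-group \
  --model-registry azureml \
  --model-name Phi-4 \
  --model-version 1
```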

When you deploy a Foundry model to Azure Container Apps serverless GPUs as an online endpoint, a scoring script is required. The scoring script (named *score.py*) defines how you interact with the model. By default, the example CLI command provides a scoring script, but you can also provide your own *score.py* file. To learn more, see [how to use a custom score.py file](/azure/machine-learning/how-to-deploy-online-endpoints?view=azureml-api-2&tabs=cli).
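
To give a sense of the scoring script's shape, here's a minimal sketch of a custom *score.py*, following the Azure Machine Learning online endpoint convention of an `init()` function that loads the model and a `run()` function that handles each request. The `AZUREML_MODEL_DIR` environment variable and the `input_data` request key are assumptions based on that convention; check the linked article for the exact contract:

```python
# Minimal score.py sketch (assumptions: AZUREML_MODEL_DIR points at the
# downloaded model, and requests arrive as JSON with an "input_data" key).
import json
import os

import mlflow

model = None

def init():
    # Called once when the container starts. Load the MLflow model from
    # the directory mounted for the deployment.
    global model
    model_path = os.getenv("AZUREML_MODEL_DIR")
    model = mlflow.pyfunc.load_model(model_path)

def run(raw_data):
    # Called once per scoring request. Parse the JSON payload, run the
    # model, and return a JSON-serializable result.
    data = json.loads(raw_data)["input_data"]
    predictions = model.predict(data)
    return json.dumps({"predictions": list(predictions)})
```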

## Submit feedback

Submit an issue to the [Azure Container Apps GitHub repo](https://github.com/microsoft/azure-container-apps).
