- [Storage mounts](cold-start.md#manage-large-downloads): Reduce the effects of network latency by storing large files in an Azure storage account associated with your container app.

<a name="deploy-foundry-models"></a>

## Deploy Foundry models to serverless GPUs (preview)

Azure Container Apps serverless GPUs now support Azure AI Foundry models in public preview. Azure AI Foundry models have two deployment options:

- [**Serverless APIs**](/azure/ai-foundry/how-to/deploy-models-serverless?tabs=azure-ai-studio), which provide pay-as-you-go billing for some of the most popular models.

- [**Managed compute**](/azure/ai-foundry/how-to/create-manage-compute), which lets you deploy the full selection of Foundry models with pay-per-GPU pricing.

Azure Container Apps serverless GPUs offer a middle ground between serverless APIs and managed compute for deploying Foundry models. Deployments are on demand, scale to zero when not in use, and comply with your data residency needs. With serverless GPUs, you can run any supported Foundry model with automatic scaling, per-second pricing, full data governance, and out-of-the-box support for enterprise networking and security.

Language models of type `MLFLOW` are supported. To see a list of `MLFLOW` models, browse the models available in the [azureml registry](https://aka.ms/azureml-registry) and add a filter for `MLFLOW` models using the following steps:

1. Select **Filter**.

1. Select **Add Filter**.

1. For the filter rule, enter **Type = MLFLOW**.

The following CLI command shows how to deploy a Foundry model to serverless GPUs:

```azurecli
az containerapp up \
  --name <CONTAINER_APP_NAME> \
  --environment <ENVIRONMENT_NAME> \
  --resource-group <RESOURCE_GROUP_NAME> \
  --model-registry <MODEL_REGISTRY_NAME> \
  --model-name <MODEL_NAME> \
  --model-version <MODEL_VERSION>
```
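
Once the deployment completes, the model is reachable through your container app's ingress. The following Python sketch is illustrative only: the FQDN is a placeholder, and the `/score` route and payload shape are assumptions borrowed from common Azure Machine Learning scoring conventions, so adjust them to match your app and its scoring script.

```python
# A hypothetical client call; <CONTAINER_APP_FQDN> is a placeholder you can
# look up with:
#   az containerapp show --name <CONTAINER_APP_NAME> \
#     --resource-group <RESOURCE_GROUP_NAME> \
#     --query properties.configuration.ingress.fqdn --output tsv
import requests

url = "https://<CONTAINER_APP_FQDN>/score"

# The expected payload shape depends on the model and the scoring script.
payload = {"input_data": [[1.0, 2.0, 3.0]]}

response = requests.post(url, json=payload, timeout=60)
response.raise_for_status()
print(response.json())
```
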
When you deploy a Foundry model to Azure Container Apps serverless GPUs as an online endpoint, a scoring script is required. The scoring script (named *score.py*) defines how you interact with the model. By default, the example CLI command provides a scoring script, but you can also supply your own *score.py* file. For details, see [how to use a custom score.py file](/azure/machine-learning/how-to-deploy-online-endpoints?view=azureml-api-2&tabs=cli). A minimal sketch of what such a script can look like follows.
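
The sketch below illustrates the general shape of a scoring script for an MLflow model. It is not the default script the CLI generates; the `AZUREML_MODEL_DIR` environment variable and the request format are assumptions based on Azure Machine Learning's scoring conventions, so verify them against your deployment.

```python
# score.py -- a minimal, illustrative scoring script for an MLflow model.
import json
import os

import mlflow.pyfunc

model = None


def init():
    # Runs once when the container starts. Assumption: AZUREML_MODEL_DIR
    # points at the downloaded model files, per Azure ML convention.
    global model
    model = mlflow.pyfunc.load_model(os.environ["AZUREML_MODEL_DIR"])


def run(raw_data):
    # Runs once per request: parse the JSON payload and return predictions.
    data = json.loads(raw_data)
    result = model.predict(data["input_data"])
    # .tolist() assumes a NumPy-like result; adjust the conversion for
    # other output types (for example, a pandas DataFrame).
    return {"predictions": result.tolist()}
```
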
## Submit feedback

Submit issues to the [Azure Container Apps GitHub repo](https://github.com/microsoft/azure-container-apps).