diff --git a/README.md b/README.md index 496d737ade..350772fd55 100644 --- a/README.md +++ b/README.md @@ -84,7 +84,7 @@ The repo includes sample data so it's ready to try end to end. In this sample ap Pricing varies per region and usage, so it isn't possible to predict exact costs for your usage. However, you can try the [Azure pricing calculator](https://azure.com/e/e3490de2372a4f9b909b0d032560e41b) for the resources below. -- Azure Container Apps: Default host for app deployment as of 10/28/2024. See more details in [the ACA deployment guide](docs/azure_container_apps.md). Consumption plan with 1 CPU core, 2.0 GB RAM. Pricing with Pay-as-You-Go. [Pricing](https://azure.microsoft.com/pricing/details/container-apps/) +- Azure Container Apps: Default host for app deployment as of 10/28/2024. See more details in [the ACA deployment guide](docs/azure_container_apps.md). Consumption plan with 1 CPU core, 2 GB RAM, minimum of 0 replicas. Pricing with Pay-as-You-Go. [Pricing](https://azure.microsoft.com/pricing/details/container-apps/) - Azure Container Registry: Basic tier. [Pricing](https://azure.microsoft.com/pricing/details/container-registry/) - Azure App Service: Only provisioned if you deploy to Azure App Service following [the App Service deployment guide](docs/azure_app_service.md). Basic Tier with 1 CPU core, 1.75 GB RAM. Pricing per hour. [Pricing](https://azure.microsoft.com/pricing/details/app-service/linux/) - Azure OpenAI: Standard tier, GPT and Ada models. Pricing per 1K tokens used, and at least 1K tokens are used per question. [Pricing](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) diff --git a/docs/productionizing.md b/docs/productionizing.md index 2d99dd893c..8d8ad2e54a 100644 --- a/docs/productionizing.md +++ b/docs/productionizing.md @@ -75,6 +75,15 @@ We recommend using a Premium level SKU, starting with 1 CPU core. You can use auto-scaling rules or scheduled scaling rules, and scale up the maximum/minimum based on load. +### Azure Container Apps + +The default container app uses a "Consumption" workload profile with 1 CPU core and 2 GB RAM, +and scaling rules that allow for scaling all the way down to 0 replicas when idle. +For production, consider either increasing the CPU cores and memory or +[switching to a "Dedicated" workload profile](azure_container_apps.md#customizing-workload-profile), +and configure the scaling rules to keep at least two replicas running at all times. +Learn more in the [Azure Container Apps documentation](https://learn.microsoft.com/azure/container-apps). + ## Additional security measures * **Authentication**: By default, the deployed app is publicly accessible. diff --git a/infra/core/host/container-app-upsert.bicep b/infra/core/host/container-app-upsert.bicep index aa162c3ade..8b3918df2a 100644 --- a/infra/core/host/container-app-upsert.bicep +++ b/infra/core/host/container-app-upsert.bicep @@ -14,8 +14,7 @@ param containerMaxReplicas int = 10 @description('The amount of memory allocated to a single container instance, e.g., 1Gi') param containerMemory string = '1.0Gi' -@description('The minimum number of replicas to run. Must be at least 1.') -@minValue(1) +@description('The minimum number of replicas to run. Must be at least 1 for non-consumption workloads.') param containerMinReplicas int = 1 @description('The name of the container') diff --git a/infra/main.bicep b/infra/main.bicep index b268ea36b7..551e44708d 100644 --- a/infra/main.bicep +++ b/infra/main.bicep @@ -517,6 +517,7 @@ module acaBackend 'core/host/container-app-upsert.bicep' = if (deploymentTarget targetPort: 8000 containerCpuCoreCount: '1.0' containerMemory: '2Gi' + containerMinReplicas: 0 allowedOrigins: allowedOrigins env: union(appEnvVariables, { // For using managed identity to access Azure resources. See https://github.com/microsoft/azure-container-apps/issues/442