included gpt-4.1

Harmanpreet Kaur · Harmanpreet Kaur · commit d53ad79a906b · 2025-06-04T19:40:29.000+05:30
diff --git a/.github/workflows/deploy.yml b/.github/workflows/deploy.yml
@@ -120,7 +120,7 @@ jobs:
                 environmentName="${{ env.SOLUTION_PREFIX }}" \
                 secondaryLocation="northcentralus" \
                 deploymentType="GlobalStandard" \
-                gptModelName="gpt-4o" \
+                gptModelName="gpt-4.1" \
                 azureOpenaiAPIVersion="2024-05-01-preview" \
                 gptDeploymentCapacity=${{ env.GPT_MIN_CAPACITY }} \
                 embeddingModel="text-embedding-ada-002" \
diff --git a/README.md b/README.md
@@ -99,7 +99,7 @@ _Note: This is not meant to outline all costs as selected SKUs, scaled use, cust
 | [Azure AI Search](https://learn.microsoft.com/en-us/azure/search/) | Standard tier, S1. Pricing is based on the number of documents and operations. Information retrieval at scale for vector and text content in traditional or generative search scenarios. | [Pricing](https://azure.microsoft.com/pricing/details/search/) |
 | [Azure Storage Account](https://learn.microsoft.com/en-us/azure/storage/blobs/) | Standard tier, LRS. Pricing is based on storage and operations. Blob storage in the cloud, optimized for storing massive amounts of unstructured data. | [Pricing](https://azure.microsoft.com/pricing/details/storage/blobs/) |
 | [Azure Key Vault](https://learn.microsoft.com/en-us/azure/key-vault/) | Standard tier. Pricing is based on the number of operations. Maintain keys that access and encrypt your cloud resources, apps, and solutions. | [Pricing](https://azure.microsoft.com/pricing/details/key-vault/) |
-| [Azure AI Services](https://learn.microsoft.com/en-us/azure/ai-services/) | S0 tier, defaults to gpt-4o and text-embedding-ada-002 models. Pricing is based on token count. | [Pricing](https://azure.microsoft.com/pricing/details/cognitive-services/) |
+| [Azure AI Services](https://learn.microsoft.com/en-us/azure/ai-services/) | S0 tier, defaults to gpt-4.1 and text-embedding-ada-002 models. Pricing is based on token count. | [Pricing](https://azure.microsoft.com/pricing/details/cognitive-services/) |
 | [Azure Container App](https://learn.microsoft.com/en-us/azure/container-apps/) | Consumption tier with 0.5 CPU, 1GiB memory/storage. Pricing is based on resource allocation, and each month allows for a certain amount of free usage. Allows you to run containerized applications without worrying about orchestration or infrastructure. | [Pricing](https://azure.microsoft.com/pricing/details/container-apps/) |
 | [Azure Container Registry](https://learn.microsoft.com/en-us/azure/container-registry/) | Basic tier. Build, store, and manage container images and artifacts in a private registry for all types of container deployments | [Pricing](https://azure.microsoft.com/pricing/details/container-registry/) |
 | [Log analytics](https://learn.microsoft.com/en-us/azure/azure-monitor/) | Pay-as-you-go tier. Costs based on data ingested. Collect and analyze on telemetry data generated by Azure. | [Pricing](https://azure.microsoft.com/pricing/details/monitor/) |
diff --git a/docs/AzureGPTQuotaSettings.md b/docs/AzureGPTQuotaSettings.md
@@ -5,6 +5,6 @@
 3. **Go to** the `Management Center` from the bottom-left navigation menu.  
 4. Select `Quota`  
    - Click on the `GlobalStandard` dropdown.  
-   - Select the required **GPT model** (`GPT-4, GPT-4o`) or **Embeddings model** (`text-embedding-ada-002`).  
+   - Select the required **GPT model** (`GPT-4, GPT-4o,GPT-4.1`) or **Embeddings model** (`text-embedding-ada-002`).  
    - Choose the **region** where the deployment is hosted.  
 5. Request More Quota or delete any unused model deployments as needed.  
diff --git a/docs/CustomizingAzdParameters.md b/docs/CustomizingAzdParameters.md
@@ -18,10 +18,10 @@ Change the Model Deployment Type (allowed values: Standard, GlobalStandard)
 azd env set AZURE_ENV_MODEL_DEPLOYMENT_TYPE Standard
 ```
 
-Set the Model Name (allowed values: gpt-4, gpt-4o)
+Set the Model Name (allowed values: gpt-4, gpt-4o,gpt-4.1)
 
 ```shell
-azd env set AZURE_ENV_MODEL_NAME gpt-4o
+azd env set AZURE_ENV_MODEL_NAME gpt-4.1
 ```
 
 Change the Model Capacity (choose a number based on available GPT model capacity in your subscription)
diff --git a/docs/DeploymentGuide.md b/docs/DeploymentGuide.md
@@ -103,7 +103,7 @@ When you start the deployment, most parameters will have **default values**, but
 | **Environment Name** | A **3-20 character alphanumeric value** used to generate a unique ID to prefix the resources. |  byctemplate |
 | **Secondary Location** | A **less busy** region for **CosmosDB**, useful in case of availability constraints. |  eastus2 |
 | **Deployment Type** | Select from a drop-down list. |  Global Standard |
-| **GPT Model** | Choose from **gpt-4, gpt-4o** | gpt-4o |  
+| **GPT Model** | Choose from **gpt-4, gpt-4o , gpt-4.1** | gpt-4.1 |  
 | **GPT Model Deployment Capacity** | Configure capacity for **GPT models**. | 30k |
 | **Embedding Model** | Default: **text-embedding-ada-002**. |  text-embedding-ada-002 |
 | **Embedding Model Capacity** | Set the capacity for **embedding models**. |  80k |
@@ -114,8 +114,9 @@ When you start the deployment, most parameters will have **default values**, but
 <details>
   <summary><b>[Optional] Quota Recommendations</b></summary>
 
-By default, the _Gpt-4o model capacity_ in deployment is set to _30k tokens_, so we recommend:
-- **For Global Standard | GPT-4o** - the capacity to at least 150k tokens post-deployment for optimal performance.
+By default, the _Gpt-4.1 model capacity_ in deployment is set to _30k tokens_, so we recommend:
+- **For Global Standard | GPT-4.1** - the capacity to at least 150k tokens post-deployment for optimal performance.
+- **For Global Standard | GPT-4o** - the capacity to at least 150k tokens post-deployment for optimal performance.
 
 - **For Standard | GPT-4** - ensure a minimum of 30k–40k tokens for best results.
 
diff --git a/docs/QuotaCheck.md b/docs/QuotaCheck.md
@@ -1,7 +1,8 @@
 ## Check Quota Availability Before Deployment
 
 Before deploying the accelerator, **ensure sufficient quota availability** for the required model.
-> **For Global Standard | GPT-4o - the capacity to at least 150k tokens post-deployment for optimal performance.**
+
+> **For Global Standard | GPT-4o | GPT-4.1 - the capacity to at least 150k tokens post-deployment for optimal performance.**
 
 > **For Standard | GPT-4 - ensure a minimum of 30k–40k tokens for best results.**
 
@@ -13,7 +14,7 @@ azd auth login
 
 ### 📌 Default Models & Capacities:
 ```
-gpt-4o:30, text-embedding-ada-002:80, gpt-4:30
+gpt-4.1:30, text-embedding-ada-002:80, gpt-4:30, gpt-4o:30
 ```
 ### 📌 Default Regions:
 ```
diff --git a/infra/main.bicep b/infra/main.bicep
@@ -26,10 +26,10 @@ param secondaryLocation string
 param deploymentType string = 'GlobalStandard'
 
 @description('Name of the GPT model to deploy:')
-param gptModelName string = 'gpt-4o'
+param gptModelName string = 'gpt-4.1'
 
 @description('Version of the GPT model to deploy:')
-param gptModelVersion string = '2024-05-13'
+param gptModelVersion string = '2025-04-14'
 
 param azureOpenaiAPIVersion string = '2024-05-01-preview'
 
diff --git a/infra/main.json b/infra/main.json
@@ -5,7 +5,7 @@
     "_generator": {
       "name": "bicep",
       "version": "0.36.1.42791",
-      "templateHash": "11172828768806624864"
+      "templateHash": "5449809042324258772"
     }
   },
   "parameters": {
@@ -41,14 +41,14 @@
     },
     "gptModelName": {
       "type": "string",
-      "defaultValue": "gpt-4o",
+      "defaultValue": "gpt-4.1",
       "metadata": {
         "description": "Name of the GPT model to deploy:"
       }
     },
     "gptModelVersion": {
       "type": "string",
-      "defaultValue": "2024-05-13",
+      "defaultValue": "2025-04-14",
       "metadata": {
         "description": "Version of the GPT model to deploy:"
       }
diff --git a/scripts/checkquota.sh b/scripts/checkquota.sh
@@ -32,7 +32,7 @@ echo "✅ Azure subscription set successfully."
 
 # Define models and their minimum required capacities
 declare -A MIN_CAPACITY=(
-    ["OpenAI.Standard.gpt-4o"]=$GPT_MIN_CAPACITY
+    ["OpenAI.Standard.gpt-4.1"]=$GPT_MIN_CAPACITY
     ["OpenAI.Standard.text-embedding-ada-002"]=$TEXT_EMBEDDING_MIN_CAPACITY
 )
 
diff --git a/scripts/quota_check_params.sh b/scripts/quota_check_params.sh
@@ -47,7 +47,7 @@ log_verbose() {
 }
 
 # Default Models and Capacities (Comma-separated in "model:capacity" format)
-DEFAULT_MODEL_CAPACITY="gpt-4o:30,text-embedding-ada-002:80,gpt-4:30"
+DEFAULT_MODEL_CAPACITY="gpt-4.1:30,text-embedding-ada-002:80,gpt-4:30,gpt-4o:30"
 
 # Convert the comma-separated string into an array
 IFS=',' read -r -a MODEL_CAPACITY_PAIRS <<< "$DEFAULT_MODEL_CAPACITY"
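
Note on the `quota_check_params.sh` hunk: the diff only shows the string being split on commas, not what happens to each `model:capacity` pair afterwards. A minimal sketch of one way the pairs could be separated, assuming each entry contains exactly one colon; the `parse_pairs` function and its output format are hypothetical and not taken from the script:

```shell
#!/usr/bin/env bash
# Sketch only: parse_pairs is a hypothetical helper, not part of the repository.
DEFAULT_MODEL_CAPACITY="gpt-4.1:30,text-embedding-ada-002:80,gpt-4:30,gpt-4o:30"

# Same split as in the script: the IFS assignment applies to this read only.
IFS=',' read -r -a MODEL_CAPACITY_PAIRS <<< "$DEFAULT_MODEL_CAPACITY"

parse_pairs() {
  local pair model capacity
  for pair in "${MODEL_CAPACITY_PAIRS[@]}"; do
    model="${pair%%:*}"     # text before the first colon
    capacity="${pair##*:}"  # text after the last colon
    printf '%s needs %s\n' "$model" "$capacity"
  done
}

parse_pairs
```

Because the split is on the colon rather than the dot, a model name such as `gpt-4.1` passes through intact.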
