
Commit b690929

Merge pull request #458 from microsoft/psl-gpt4.1update
chore: update GPT model config to gpt4.1
2 parents 0c8dd4b + 4d1b7d5 commit b690929

12 files changed: +45 -36 lines changed


.github/workflows/deploy.yml

Lines changed: 2 additions & 1 deletion
@@ -9,6 +9,7 @@ on:
       - main
       - dev
       - demo
+
   schedule:
     - cron: '0 9,21 * * *' # Runs at 9:00 AM and 9:00 PM GMT

@@ -142,7 +143,7 @@ jobs:
             environmentName="${{ env.SOLUTION_PREFIX }}" \
             secondaryLocation="northcentralus" \
             deploymentType="GlobalStandard" \
-            gptModelName="gpt-4o" \
+            gptModelName="gpt-4.1" \
             azureOpenaiAPIVersion="2024-05-01-preview" \
             gptDeploymentCapacity=${{ env.GPT_MIN_CAPACITY }} \
             embeddingModel="text-embedding-ada-002" \

README.md

Lines changed: 1 addition & 1 deletion
@@ -99,7 +99,7 @@ _Note: This is not meant to outline all costs as selected SKUs, scaled use, cust
 | [Azure AI Search](https://learn.microsoft.com/en-us/azure/search/) | Standard tier, S1. Pricing is based on the number of documents and operations. Information retrieval at scale for vector and text content in traditional or generative search scenarios. | [Pricing](https://azure.microsoft.com/pricing/details/search/) |
 | [Azure Storage Account](https://learn.microsoft.com/en-us/azure/storage/blobs/) | Standard tier, LRS. Pricing is based on storage and operations. Blob storage in the clopud, optimized for storing massive amounts of unstructured data. | [Pricing](https://azure.microsoft.com/pricing/details/storage/blobs/) |
 | [Azure Key Vault](https://learn.microsoft.com/en-us/azure/key-vault/) | Standard tier. Pricing is based on the number of operations. Maintain keys that access and encrypt your cloud resources, apps, and solutions. | [Pricing](https://azure.microsoft.com/pricing/details/key-vault/) |
-| [Azure AI Services](https://learn.microsoft.com/en-us/azure/ai-services/) | S0 tier, defaults to gpt-4o and text-embedding-ada-002 models. Pricing is based on token count. | [Pricing](https://azure.microsoft.com/pricing/details/cognitive-services/) |
+| [Azure AI Services](https://learn.microsoft.com/en-us/azure/ai-services/) | S0 tier, defaults to gpt-4.1 and text-embedding-ada-002 models. Pricing is based on token count. | [Pricing](https://azure.microsoft.com/pricing/details/cognitive-services/) |
 | [Azure Container App](https://learn.microsoft.com/en-us/azure/container-apps/) | Consumption tier with 0.5 CPU, 1GiB memory/storage. Pricing is based on resource allocation, and each month allows for a certain amount of free usage. Allows you to run containerized applications without worrying about orchestration or infrastructure. | [Pricing](https://azure.microsoft.com/pricing/details/container-apps/) |
 | [Azure Container Registry](https://learn.microsoft.com/en-us/azure/container-registry/) | Basic tier. Build, store, and manage container images and artifacts in a private registry for all types of container deployments | [Pricing](https://azure.microsoft.com/pricing/details/container-registry/) |
 | [Log analytics](https://learn.microsoft.com/en-us/azure/azure-monitor/) | Pay-as-you-go tier. Costs based on data ingested. Collect and analyze on telemetry data generated by Azure. | [Pricing](https://azure.microsoft.com/pricing/details/monitor/) |

docs/AzureGPTQuotaSettings.md

Lines changed: 1 addition & 1 deletion
@@ -5,6 +5,6 @@
 3. **Go to** the `Management Center` from the bottom-left navigation menu.
 4. Select `Quota`
    - Click on the `GlobalStandard` dropdown.
-   - Select the required **GPT model** (`GPT-4, GPT-4o`) or **Embeddings model** (`text-embedding-ada-002`).
+   - Select the required **GPT model** (`GPT-4.1`) or **Embeddings model** (`text-embedding-ada-002`).
    - Choose the **region** where the deployment is hosted.
 5. Request More Quota or delete any unused model deployments as needed.

docs/CustomizingAzdParameters.md

Lines changed: 3 additions & 1 deletion
@@ -13,7 +13,7 @@ By default this template will use the environment name as the prefix to prevent
 | `AZURE_ENV_NAME` | string | `docgen` | Sets the environment name prefix for all Azure resources. |
 | `AZURE_ENV_SECONDARY_LOCATION` | string | `eastus2` | Specifies a secondary Azure region. |
 | `AZURE_ENV_MODEL_DEPLOYMENT_TYPE` | string | `Standard` | Defines the model deployment type (allowed: `Standard`, `GlobalStandard`). |
-| `AZURE_ENV_MODEL_NAME` | string | `gpt-4o` | Specifies the GPT model name (allowed: `gpt-4`, `gpt-4o`). |
+| `AZURE_ENV_MODEL_NAME` | string | `gpt-4.1` | Specifies the GPT model name (allowed: `gpt-4`, `gpt-4o`). |
 | `AZURE_ENV_MODEL_VERSION` | string | `2024-05-13` | Set the Azure model version (allowed values: `2024-08-06`). |
 | `AZURE_ENV_OPENAI_API_VERSION` | string | `2024-05-01-preview` | Specifies the API version for Azure OpenAI. |
 | `AZURE_ENV_MODEL_CAPACITY` | integer | `30` | Sets the GPT model capacity (based on what's available in your subscription). |

@@ -23,8 +23,10 @@ By default this template will use the environment name as the prefix to prevent
 | `AZURE_ENV_LOG_ANALYTICS_WORKSPACE_ID` | string | `<Existing Workspace Id>` | Reuses an existing Log Analytics Workspace instead of creating a new one. |


+
 ## How to Set a Parameter

+
 To customize any of the above values, run the following command **before** `azd up`:

 ```bash
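
As the doc notes, these values are set with `azd env set` before `azd up`; a minimal sketch using the new defaults from this change (the values shown are only illustrative):

```bash
# Override the GPT model settings via azd environment variables (example values)
azd env set AZURE_ENV_MODEL_NAME "gpt-4.1"
azd env set AZURE_ENV_MODEL_VERSION "2025-04-14"
azd env set AZURE_ENV_MODEL_CAPACITY "30"
azd up
```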

docs/DeploymentGuide.md

Lines changed: 5 additions & 4 deletions
@@ -105,8 +105,8 @@ When you start the deployment, most parameters will have **default values**, but
 | **Environment Name** | A **3–20 character alphanumeric** value used to generate a unique ID to prefix the resources. | `byctemplate` |
 | **Secondary Location** | A **less busy** region for **CosmosDB**, useful in case of availability constraints. | `eastus2` |
 | **Deployment Type** | Model deployment type (allowed: `Standard`, `GlobalStandard`). | `GlobalStandard` |
-| **GPT Model** | Choose from **gpt-4**, **gpt-4o**. | `gpt-4o` |
-| **GPT Model Version** | Version of the GPT model to use (e.g., `2024-08-06`). | `2024-05-13` |
+| **GPT Model** | The GPT model used by the app | `gpt-4.1` |
+| **GPT Model Version** | The GPT Version used by the app | `2024-05-13` |
 | **OpenAI API Version** | Azure OpenAI API version used for deployments. | `2024-05-01-preview` |
 | **GPT Model Deployment Capacity** | Configure the capacity for **GPT model deployments** (in thousands). | `30k` |
 | **Embedding Model** | The embedding model used by the app. | `text-embedding-ada-002` |

@@ -115,13 +115,14 @@ When you start the deployment, most parameters will have **default values**, but
 | **Existing Log Analytics Workspace** | If reusing a Log Analytics Workspace, specify the ID. | *(none)* |


+
 </details>

 <details>
 <summary><b>[Optional] Quota Recommendations</b></summary>

-By default, the _Gpt-4o model capacity_ in deployment is set to _30k tokens_, so we recommend:
-- **For Global Standard | GPT-4o** - the capacity to at least 150k tokens post-deployment for optimal performance.
+By default, the _Gpt-4.1 model capacity_ in deployment is set to _30k tokens_, so we recommend:
+- **For Global Standard | GPT-4.1** - the capacity to at least 150k tokens post-deployment for optimal performance.

 - **For Standard | GPT-4** - ensure a minimum of 30k–40k tokens for best results.


docs/QuotaCheck.md

Lines changed: 5 additions & 4 deletions
@@ -1,7 +1,8 @@
 ## Check Quota Availability Before Deployment

 Before deploying the accelerator, **ensure sufficient quota availability** for the required model.
-> **For Global Standard | GPT-4o - the capacity to at least 150k tokens post-deployment for optimal performance.**
+
+> **For Global Standard |GPT-4.1- the capacity to at least 150k tokens post-deployment for optimal performance.**

 > **For Standard | GPT-4 - ensure a minimum of 30k–40k tokens for best results.**

@@ -13,7 +14,7 @@ azd auth login

 ### 📌 Default Models & Capacities:
 ```
-gpt-4o:30, text-embedding-ada-002:80, gpt-4:30
+gpt-4.1:30, text-embedding-ada-002:80, gpt-4:30
 ```
 ### 📌 Default Regions:
 ```

@@ -39,15 +40,15 @@ eastus, uksouth, eastus2, northcentralus, swedencentral, westus, westus2, southc
 ```
 ✔️ Check specific model(s) in default regions:
 ```
-./quota_check_params.sh --models gpt-4o:30,text-embedding-ada-002:80
+./quota_check_params.sh --models gpt-4.1:30,text-embedding-ada-002:80
 ```
 ✔️ Check default models in specific region(s):
 ```
 ./quota_check_params.sh --regions eastus,westus
 ```
 ✔️ Passing Both models and regions:
 ```
-./quota_check_params.sh --models gpt-4o:30 --regions eastus,westus2
+./quota_check_params.sh --models gpt-4.1:30 --regions eastus,westus2
 ```
 ✔️ All parameters combined:
 ```
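
The "All parameters combined" invocation would presumably combine the same flags shown above, for example (model capacities and regions are only illustrative):

```
./quota_check_params.sh --models gpt-4.1:30,text-embedding-ada-002:80 --regions eastus,westus2
```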

infra/main.bicep

Lines changed: 3 additions & 3 deletions
@@ -26,10 +26,10 @@ param secondaryLocation string
 param deploymentType string = 'GlobalStandard'

 @description('Name of the GPT model to deploy:')
-param gptModelName string = 'gpt-4o'
+param gptModelName string = 'gpt-4.1'

 @description('Version of the GPT model to deploy:')
-param gptModelVersion string = '2024-05-13'
+param gptModelVersion string = '2025-04-14'

 param azureOpenaiAPIVersion string = '2024-05-01-preview'


@@ -385,7 +385,7 @@ module appserviceModule 'deploy_app_service.bicep' = {
     aiSearchService: aifoundry.outputs.aiSearchService
     AzureSearchKey: keyVault.getSecret('AZURE-SEARCH-KEY')
     AzureOpenAIEndpoint:aifoundry.outputs.aiServicesTarget
-    AzureOpenAIModel: gptModelName //'gpt-4o-mini'
+    AzureOpenAIModel: gptModelName
     AzureOpenAIKey:keyVault.getSecret('AZURE-OPENAI-KEY')
     azureOpenAIApiVersion: azureOpenaiAPIVersion //'2024-02-15-preview'
     AZURE_OPENAI_RESOURCE:aifoundry.outputs.aiServicesName

infra/main.bicepparam

Lines changed: 4 additions & 2 deletions
@@ -4,9 +4,11 @@ param AZURE_LOCATION = readEnvironmentVariable('AZURE_LOCATION', '')
 param environmentName = readEnvironmentVariable('AZURE_ENV_NAME', 'env_name')
 param secondaryLocation = readEnvironmentVariable('AZURE_ENV_SECONDARY_LOCATION', 'eastus2')
 param deploymentType = readEnvironmentVariable('AZURE_ENV_MODEL_DEPLOYMENT_TYPE', 'GlobalStandard')
-param gptModelName = readEnvironmentVariable('AZURE_ENV_MODEL_NAME', 'gpt-4o')
-param gptModelVersion = readEnvironmentVariable('AZURE_ENV_MODEL_VERSION', '2024-05-13')
+
+param gptModelName = readEnvironmentVariable('AZURE_ENV_MODEL_NAME', 'gpt-4.1')
+param gptModelVersion = readEnvironmentVariable('AZURE_ENV_MODEL_VERSION', '2025-04-14')
 param azureOpenaiAPIVersion = readEnvironmentVariable('AZURE_ENV_OPENAI_API_VERSION', '2024-05-01-preview')
+
 param gptDeploymentCapacity = int(readEnvironmentVariable('AZURE_ENV_MODEL_CAPACITY', '30'))
 param embeddingModel = readEnvironmentVariable('AZURE_ENV_EMBEDDING_MODEL_NAME', 'text-embedding-ada-002')
 param imageTag = readEnvironmentVariable('AZURE_ENV_IMAGETAG', 'latest')

infra/main.json

Lines changed: 16 additions & 16 deletions
@@ -4,8 +4,8 @@
   "metadata": {
     "_generator": {
       "name": "bicep",
-      "version": "0.35.1.17967",
-      "templateHash": "3433053339326968482"
+      "version": "0.36.1.42791",
+      "templateHash": "5449809042324258772"
     }
   },
   "parameters": {

@@ -41,14 +41,14 @@
     },
     "gptModelName": {
       "type": "string",
-      "defaultValue": "gpt-4o",
+      "defaultValue": "gpt-4.1",
       "metadata": {
         "description": "Name of the GPT model to deploy:"
       }
     },
     "gptModelVersion": {
       "type": "string",
-      "defaultValue": "2024-05-13",
+      "defaultValue": "2025-04-14",
       "metadata": {
         "description": "Version of the GPT model to deploy:"
       }

@@ -361,8 +361,8 @@
       "metadata": {
         "_generator": {
           "name": "bicep",
-          "version": "0.35.1.17967",
-          "templateHash": "14416829741819681429"
+          "version": "0.36.1.42791",
+          "templateHash": "8965508470098961595"
         }
       },
       "parameters": {

@@ -456,8 +456,8 @@
       "metadata": {
         "_generator": {
           "name": "bicep",
-          "version": "0.35.1.17967",
-          "templateHash": "14711167186840027914"
+          "version": "0.36.1.42791",
+          "templateHash": "15511025830087119739"
         }
       },
       "parameters": {

@@ -601,8 +601,8 @@
       "metadata": {
         "_generator": {
           "name": "bicep",
-          "version": "0.35.1.17967",
-          "templateHash": "3118038315112495212"
+          "version": "0.36.1.42791",
+          "templateHash": "8750828267619251070"
         }
       },
       "parameters": {

@@ -1448,8 +1448,8 @@
       "metadata": {
         "_generator": {
           "name": "bicep",
-          "version": "0.35.1.17967",
-          "templateHash": "12684246002053954621"
+          "version": "0.36.1.42791",
+          "templateHash": "11115444345720629816"
         }
       },
       "parameters": {

@@ -1688,8 +1688,8 @@
       "metadata": {
         "_generator": {
           "name": "bicep",
-          "version": "0.35.1.17967",
-          "templateHash": "16988932665267526316"
+          "version": "0.36.1.42791",
+          "templateHash": "9597436405986955034"
         }
       },
       "parameters": {

@@ -2191,8 +2191,8 @@
       "metadata": {
         "_generator": {
           "name": "bicep",
-          "version": "0.35.1.17967",
-          "templateHash": "12799194170352887919"
+          "version": "0.36.1.42791",
+          "templateHash": "14768176812719476461"
         }
       },
       "parameters": {

infra/scripts/index_scripts/02_process_data.py

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@ def get_secrets_from_kv(kv_name, secret_name):
 openai_api_key = get_secrets_from_kv(key_vault_name, "AZURE-OPENAI-KEY")
 openai_api_base = get_secrets_from_kv(key_vault_name, "AZURE-OPENAI-ENDPOINT")
 openai_api_version = get_secrets_from_kv(key_vault_name, "AZURE-OPENAI-PREVIEW-API-VERSION")
-deployment = get_secrets_from_kv(key_vault_name, "AZURE-OPEN-AI-DEPLOYMENT-MODEL") # "gpt-4o-mini"
+deployment = get_secrets_from_kv(key_vault_name, "AZURE-OPEN-AI-DEPLOYMENT-MODEL")


 # Function: Get Embeddings
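
For a quick sanity check of the value that ends up in `deployment`, the same Key Vault secret can be read with the Azure CLI; a minimal sketch, with a placeholder vault name:

```bash
# Read the deployment-model secret used above (replace <key-vault-name> with your vault)
az keyvault secret show \
  --vault-name "<key-vault-name>" \
  --name "AZURE-OPEN-AI-DEPLOYMENT-MODEL" \
  --query value -o tsv
```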
