@NirajC-Microsoft
Owner

Purpose

This pull request raises the minimum GPT model capacity used by the deployment workflow from 1 to 150 and updates the quota check in the infrastructure script so that missing quota information is flagged as insufficient quota. The two goals are to scale up resource allocation for GPT models and to make quota checks more robust.

Resource allocation changes

  • Increased the GPT_MIN_CAPACITY environment variable from 1 to 150 in .github/workflows/deploy.yml to support higher model capacity during deployment.
  • Updated all related references to GPT_MIN_CAPACITY in the workflow steps and Azure deployment command to use the new value (150 instead of 1).
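
For reviewers, here is a minimal sketch of how a workflow step might consume the new minimum. It is not taken from the repository: the has_capacity helper and the sample figure of 120 are hypothetical, and the only values from this PR are the GPT_MIN_CAPACITY name and the new value of 150.

```shell
# Hypothetical sketch, not code from this repository.
# Fall back to the new minimum when the workflow has not exported a value.
GPT_MIN_CAPACITY="${GPT_MIN_CAPACITY:-150}"

# has_capacity AVAILABLE REQUIRED
# Succeeds when AVAILABLE is at least REQUIRED (both whole numbers).
has_capacity() {
  local available="$1" required="$2"
  [ "$available" -ge "$required" ]
}

# 120 is an illustrative figure, not a real quota reading.
if has_capacity 120 "$GPT_MIN_CAPACITY"; then
  echo "Capacity check passed"
else
  echo "Need at least $GPT_MIN_CAPACITY units, but only 120 are available"
fi
```

With the previous minimum of 1 nearly any region would pass such a check, so raising it to 150 makes the available quota the deciding factor.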

Quota checking improvements

  • infra/scripts/checkquota.sh now sets INSUFFICIENT_QUOTA=true when no quota information is found for a model in a region, so a missing quota entry is reported as insufficient quota.
  • ...
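
The behavior described above could look roughly like the sketch below. This is an illustration under stated assumptions, not the contents of infra/scripts/checkquota.sh: the check_model_quota function, its arguments, the "<limit> <used>" input format, and the example model and region names are all hypothetical. Only the INSUFFICIENT_QUOTA flag comes from this PR.

```shell
# Hypothetical sketch, not the actual checkquota.sh.
INSUFFICIENT_QUOTA=false

# check_model_quota MODEL REGION QUOTA_INFO REQUIRED
# QUOTA_INFO is assumed to be "<limit> <used>", or empty when the
# quota lookup returned nothing for this model and region.
check_model_quota() {
  local model="$1" region="$2" quota_info="$3" required="$4"
  local limit used available

  # Missing quota information is flagged as insufficient quota.
  if [ -z "$quota_info" ]; then
    echo "No quota information found for $model in $region" >&2
    INSUFFICIENT_QUOTA=true
    return 0
  fi

  limit="${quota_info%% *}"   # text before the first space
  used="${quota_info##* }"    # text after the last space
  available=$(( limit - used ))

  if [ "$available" -lt "$required" ]; then
    echo "Insufficient quota for $model in $region: $available available, $required required" >&2
    INSUFFICIENT_QUOTA=true
  fi
}

# Illustrative call: an empty lookup result sets the flag.
check_model_quota "example-model" "example-region" "" 150
echo "INSUFFICIENT_QUOTA=$INSUFFICIENT_QUOTA"
```

Recording the result in a flag instead of exiting immediately lets a caller check every model and region before deciding whether to stop.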

Does this introduce a breaking change?

  • Yes
  • No

How to Test

  • Get the code
git clone [repo-address]
cd [repo-name]
git checkout [branch-name]
npm install
  • Test the code

What to Check

Verify that the following are valid

  • ...

Other Information

@NirajC-Microsoft NirajC-Microsoft merged commit a33917c into dev-v3 Oct 8, 2025
3 of 5 checks passed