@Priyanka-Microsoft
Contributor

Purpose

  • ...
    This pull request updates the default model capacities and the related workflow environment variables to match new resource-allocation requirements. The most significant change raises the default capacity for gpt-4o-mini; the related documentation and scripts are adjusted to match.

Updates to model capacities:

  • .github/workflows/CAdeploy.yml: Reduced GPT_MIN_CAPACITY from 250 to 200 and increased TEXT_EMBEDDING_MIN_CAPACITY from 40 to 80 in the workflow environment variables.
  • infra/scripts/quota_check_params.sh: Updated DEFAULT_MODEL_CAPACITY to set gpt-4o-mini capacity to 200 (previously 30) while keeping text-embedding-ada-002 at 80.

Documentation updates:

  • docs/QuotaCheck.md: Updated examples and default values to reflect the new gpt-4o-mini capacity of 200 in various usage scenarios. [1] [2]

Does this introduce a breaking change?

  • Yes
  • No

Golden Path Validation

  • I have tested the primary workflows (the "golden path") to ensure they function correctly without errors.

Deployment Validation

  • I have validated the deployment process successfully and all services are running as expected with this change.

What to Check

Verify that the following are valid

  • ...

Other Information

Contributor

Copilot AI left a comment

Pull Request Overview

This PR updates the minimum resource allocations for models to meet new capacity requirements.

  • Bump gpt-4o-mini default capacity from 30 to 200 and text-embedding-ada-002 from 40 to 80 in the quota check script.
  • Align documentation examples in docs/QuotaCheck.md with the new defaults.
  • Adjust CI workflow environment variables (GPT_MIN_CAPACITY, TEXT_EMBEDDING_MIN_CAPACITY) in .github/workflows/CAdeploy.yml.

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| infra/scripts/quota_check_params.sh | Updated DEFAULT_MODEL_CAPACITY string to new thresholds |
| docs/QuotaCheck.md | Revised example commands and default values for capacities |
| .github/workflows/CAdeploy.yml | Lowered GPT_MIN_CAPACITY and increased TEXT_EMBEDDING_MIN_CAPACITY |

Comments suppressed due to low confidence (2)

docs/QuotaCheck.md:55

  • The sample output section still shows the previous capacity values. Please update any example output to reflect the new default gpt-4o-mini capacity of 200.
### **Sample Output**

infra/scripts/quota_check_params.sh:50

  • Consider adding or updating tests for quota_check_params.sh to verify that the script correctly parses and applies the new default capacities.
DEFAULT_MODEL_CAPACITY="gpt-4o-mini:200,text-embedding-ada-002:80"
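
The suggested test could look roughly like the sketch below. It assumes only the `model:capacity,model:capacity` format visible in the line above; `capacity_for` is a hypothetical helper written for illustration and is not taken from quota_check_params.sh, whose actual parsing logic is not shown in this conversation.

```shell
#!/bin/sh
# Sketch only: capacity_for is a hypothetical helper, not part of
# infra/scripts/quota_check_params.sh.

# Default string as quoted in the review comment above.
DEFAULT_MODEL_CAPACITY="gpt-4o-mini:200,text-embedding-ada-002:80"

# Print the capacity configured for one model, or return 1 if it is absent.
#   $1 = model name, $2 = comma-separated "model:capacity" pairs
capacity_for() {
    model="$1"
    pairs="$2"
    old_ifs="$IFS"
    IFS=','
    # Left unquoted on purpose so the string is split on commas.
    set -- $pairs
    IFS="$old_ifs"
    for pair in "$@"; do
        name="${pair%%:*}"    # text before the first colon
        value="${pair##*:}"   # text after the last colon
        if [ "$name" = "$model" ]; then
            printf '%s\n' "$value"
            return 0
        fi
    done
    return 1
}

# Show what the default string resolves to for each model.
for m in gpt-4o-mini text-embedding-ada-002; do
    printf '%s=%s\n' "$m" "$(capacity_for "$m" "$DEFAULT_MODEL_CAPACITY")"
done
```

Run as written, this prints `gpt-4o-mini=200` followed by `text-embedding-ada-002=80`. A real test would source the script and compare the values it actually applies, rather than re-declaring the string as this sketch does.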

@Roopan-Microsoft Roopan-Microsoft merged commit ccbe67f into main Jul 14, 2025
9 checks passed
@Roopan-Microsoft Roopan-Microsoft deleted the update-model-capacity-similar-to-bicep branch July 14, 2025 13:04
@github-actions

🎉 This PR is included in version 1.6.1 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

Roopan-Microsoft added a commit that referenced this pull request Jul 17, 2025
* updated model capacity minimum to 200 (#602)

* docs: Added file and images (#577)

* Create exp.md

* Add files via upload

* Delete docs/images/re_use_log/exp.md

* Add files via upload

* Update CustomizingAzdParameters.md

* Update DeploymentGuide.md

add the section Reusing an Existing Log Analytics Workspace

* Update re-use-log-analytics.md

update the back link url and remove extra space

---------

Co-authored-by: Roopan-Microsoft
Co-authored-by: Thanusree-Microsoft

* docs: Add and update links for CustomizingAzdParameters.md and re-use-log-analytics.md (#606)

* Update CustomizingAzdParameters.md

Added link

* Update re-use-log-analytics.md

Updated link

* fix: Increase retry attempts and improve error messaging for Azure Blob Storage uploads (#607)

---------

Co-authored-by: Priyanka-Microsoft
Co-authored-by: Atulku-Microsoft
Co-authored-by: Roopan-Microsoft
Co-authored-by: Thanusree-Microsoft