chore: down merge main to dev #608
Merged
Purpose
This pull request includes updates to deployment configurations, documentation, and scripts to improve resource allocation, retry mechanisms, and usability. The most important changes involve adjusting model capacities, enhancing retry logic for file uploads, and adding guidance for reusing existing Azure resources.
Deployment Configuration Updates:
.github/workflows/CAdeploy.yml: Adjusted GPT_MIN_CAPACITY to 200 and TEXT_EMBEDDING_MIN_CAPACITY to 80 to better align with resource requirements.

Documentation Enhancements:
docs/CustomizingAzdParameters.md: Updated the description for AZURE_ENV_LOG_ANALYTICS_WORKSPACE_ID to include a link to a guide for obtaining an existing Workspace ID.
docs/DeploymentGuide.md: Added a detailed section on reusing an existing Log Analytics Workspace, including a link to the new guide.
docs/re-use-log-analytics.md: Created a new guide with step-by-step instructions for configuring an existing Log Analytics Workspace in Azure.

Script Improvements:
infra/scripts/copy_kb_files.sh: Enhanced retry logic for Azure Blob Storage uploads by increasing the maximum retries to 5, introducing exponential backoff, and providing clearer error messages.

Resource Allocation Adjustments:
docs/QuotaCheck.md and infra/scripts/quota_check_params.sh: Updated default model capacities for gpt-4o-mini to 200 in documentation and scripts to reflect new resource allocations. [1] [2] [3]

Does this introduce a breaking change?
Golden Path Validation
Deployment Validation
What to Check
Verify that the following are valid
Other Information
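For reviewers, the retry behaviour described for infra/scripts/copy_kb_files.sh (at most 5 attempts, exponential backoff, a clearer final error) could look roughly like the sketch below. This is an illustration, not the code from the diff: the function name retry_with_backoff, the variables MAX_RETRIES and BASE_DELAY, the 2-second starting delay, and the message wording are all assumptions.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; names and defaults are hypothetical,
# not copied from infra/scripts/copy_kb_files.sh.

MAX_RETRIES="${MAX_RETRIES:-5}"   # the PR description states a maximum of 5 retries
BASE_DELAY="${BASE_DELAY:-2}"     # assumed initial delay in seconds

# Run the given command, retrying on failure and doubling the delay each time.
retry_with_backoff() {
  local attempt=1
  local delay="$BASE_DELAY"
  while true; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$MAX_RETRIES" ]; then
      echo "Error: command failed after ${MAX_RETRIES} attempts: $*" >&2
      return 1
    fi
    echo "Attempt ${attempt} failed; retrying in ${delay}s..." >&2
    sleep "$delay"
    attempt=$((attempt + 1))
    delay=$((delay * 2))   # exponential backoff: 2s, 4s, 8s, ...
  done
}
```

A call such as retry_with_backoff az storage blob upload-batch --account-name "$ACCOUNT" --destination "$CONTAINER" --source "$DIR" would then wrap the upload, where ACCOUNT, CONTAINER, and DIR are placeholder variables. Doubling the delay gives a throttled or briefly unavailable storage endpoint time to recover instead of retrying immediately.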