fix: Refactor data processing with manual scripts to resolve deployment script SFI issue #687
Merged
- Introduced `04_cu_process_custom_data.py` for processing custom data and integrating with Azure services.
- Removed obsolete `azure_credential_utils.py` as its functionality is now integrated elsewhere.
- Updated `content_understanding_client.py` to improve error handling.
- Created `process_custom_data_scripts.sh` for streamlined script execution and dependency management.
- Enhanced `process_data_scripts.sh` to include additional parameters and improved error handling.
- Refactored `run_create_index_scripts.sh` to support Azure authentication and role assignment.
- Deleted `run_create_index_scripts_manual.sh` as its functionality is now covered in the updated script.
- Adjusted `run_process_data_scripts.sh` to reference the new Bicep file for custom data processing.
…ole assignments, and error handling; remove run_process_data_scripts.sh
…ove obsolete PowerShell script
…ss-platform support
…improve virtual environment handling
…mands in Azure YAML and update SQL output directory path in Python script
…rove error handling in bash script for enabling public access
…lt dependencies and streamline parameter handling
- Removed Key Vault related parameters and configurations from Bicep templates.
- Updated Python scripts to accept command line arguments for necessary endpoints and models instead of retrieving them from Key Vault.
- Modified shell scripts to pass new parameters to Python scripts for improved flexibility and clarity.
- Cleaned up unused variables and consolidated logic for better maintainability.
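The move from Key Vault lookups to command line arguments described in this commit could look roughly like the sketch below. The flag names and the `parse_args` helper are hypothetical, chosen for illustration only; the actual scripts in this PR may use different names and additional parameters.

```python
import argparse


def parse_args(argv=None):
    """Read service settings from the command line rather than from Key Vault.

    All flag names here are assumptions made for this sketch.
    """
    parser = argparse.ArgumentParser(
        description="Sketch: pass endpoints and model names as arguments."
    )
    parser.add_argument("--search-endpoint", required=True,
                        help="Endpoint of the search service (hypothetical flag)")
    parser.add_argument("--openai-endpoint", required=True,
                        help="Endpoint of the model deployment (hypothetical flag)")
    parser.add_argument("--embedding-model", required=True,
                        help="Name of the embedding model (hypothetical flag)")
    return parser.parse_args(argv)


# Example invocation with placeholder values, as a shell script might pass them.
example = parse_args([
    "--search-endpoint", "https://example-search.invalid",
    "--openai-endpoint", "https://example-openai.invalid",
    "--embedding-model", "example-embedding-model",
])
print(example.search_endpoint, example.embedding_model)
```

Making every argument required means a missing value fails at startup with a usage message instead of surfacing later as a failed service call.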
…essing data
- Introduced a new script `process_custom_data.sh` to manage public network access for Azure resources and execute data processing.
- Implemented functions to enable and restore public access for Storage Account, AI Foundry, CU Foundry, and SQL Server.
- Added error handling and logging for network access changes.
- Refactored existing `process_sample_data.sh` to remove deployment output retrieval logic, now handled in `process_custom_data.sh`.
- Removed SQL table creation logic from `run_create_index_scripts.sh` to streamline the process.
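The enable-then-restore behaviour this commit describes is implemented in Bash in the PR; the Python sketch below only illustrates the general pattern, under the assumption that each resource has a readable and writable access state. `temporary_public_access`, `get_state`, and `set_state` are hypothetical names, and an in-memory dictionary stands in for the real Azure resources.

```python
from contextlib import contextmanager


@contextmanager
def temporary_public_access(resources, get_state, set_state, log=print):
    """Enable public access on each resource, then put the original state back.

    get_state/set_state are hypothetical callables; in a real script they
    would wrap whatever tooling reads and changes the network setting.
    """
    original = {}
    try:
        for name in resources:
            original[name] = get_state(name)
            if original[name] != "Enabled":
                set_state(name, "Enabled")
                log(f"enabled public access on {name}")
        yield
    finally:
        # Restore only what was changed, and keep going if one restore fails.
        for name, state in original.items():
            if state != "Enabled":
                try:
                    set_state(name, state)
                    log(f"restored {name} to {state}")
                except Exception as exc:
                    log(f"warning: could not restore {name}: {exc}")


# In-memory stand-in for resource settings (illustrative only).
states = {"storage": "Disabled", "sql": "Enabled"}
with temporary_public_access(list(states), states.get, states.__setitem__):
    print("processing with", states)
print("after processing", states)
```

Recording the original state before changing it, and restoring in a `finally` block, means resources return to their prior setting even if data processing fails partway through.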
…in data processing script
…t processing script
…le data with new parameters
…Azure services and Content Understanding API
…SSQL ODBC driver and correct script permissions
…ion to process_custom_data.sh
… improve SQL Server public access feedback in scripts
…rivate endpoint management and remove secrets export configuration
…edge-Mining-Solution-Accelerator into pk-km-sampledata-manual
…Debian/Ubuntu" This reverts commit 7e51dd3.
Avijit-Microsoft approved these changes on Dec 15, 2025
Contributor
🎉 This PR is included in version 3.17.0 🎉
The release is available on GitHub release.
Your semantic-release bot 📦🚀
Purpose
This pull request introduces several infrastructure and documentation updates to improve deployment clarity, environment setup, and configuration flexibility. The most significant changes include the removal of the Key Vault module and associated deployment scripts from the Bicep template, expanded documentation for data processing scripts, and updates to Python version requirements. Additionally, new outputs and parameters have been added to the Bicep template to support integration with AI and storage services.
Infrastructure changes:
- … `infra/main.bicep`, simplifying the deployment and secret management approach.
- … `infra/main.bicep`, delegating these tasks to manual or external script execution.
- … `infra/main.bicep`.
Documentation and setup improvements:
- … `DeploymentGuide.md` and `CustomizeData.md` to instruct users to run the new `process_sample_data.sh` and `process_custom_data.sh` scripts for data processing, including detailed parameter instructions.
- … `azure.yaml` to guide users on processing sample data via Bash scripts.
Dev environment updates:
- … `mssql-odbc-driver` feature (version 17) to the devcontainer configuration for improved SQL Server connectivity.
- … `setup_env.sh` to reference the new data processing scripts and ensure correct permissions.
Does this introduce a breaking change?
Golden Path Validation
Deployment Validation