
Conversation

@Vamshi-Microsoft
Contributor

Purpose

This pull request introduces multiple changes across documentation, infrastructure, scripts, and backend code to enhance functionality, improve clarity, and update configurations. Key updates include adjustments to AI model capacities, documentation improvements for reusing existing resources, and the addition of a SQL query generator agent. Below is a categorized summary of the most important changes:

AI Model Capacity Updates:

  • Reduced GPT_MIN_CAPACITY from 250 to 200 and increased TEXT_EMBEDDING_MIN_CAPACITY from 40 to 80 in .github/workflows/CAdeploy.yml.
  • Updated default capacities for gpt-4o-mini to 200 and text-embedding-ada-002 to 80 in docs/QuotaCheck.md and infra/scripts/quota_check_params.sh. [1] [2] [3]
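The new minimums can be sketched as shell defaults (the variable names mirror `.github/workflows/CAdeploy.yml`; the `${VAR:-default}` override pattern here is illustrative, not the workflow's exact syntax):

```shell
# Sketch of the updated minimum-capacity defaults. Variable names come from
# the workflow file; the environment-variable override pattern is an
# illustration only.
GPT_MIN_CAPACITY="${GPT_MIN_CAPACITY:-200}"                        # previously 250
TEXT_EMBEDDING_MIN_CAPACITY="${TEXT_EMBEDDING_MIN_CAPACITY:-80}"   # previously 40

echo "gpt-4o-mini minimum capacity: $GPT_MIN_CAPACITY"
echo "text-embedding-ada-002 minimum capacity: $TEXT_EMBEDDING_MIN_CAPACITY"
```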

Documentation Enhancements:

  • Added a detailed guide for reusing an existing Log Analytics Workspace in docs/re-use-log-analytics.md and linked it in other documentation files. [1] [2] [3]

Infrastructure Updates:

  • Changed the SKU for the embedding model from Standard to GlobalStandard in infra/deploy_ai_foundry.bicep and corresponding JSON files. [1] [2] [3]
  • Updated retry mechanism for file uploads in infra/scripts/copy_kb_files.sh to increase retries from 3 to 5 and added exponential backoff with detailed logging.

Backend Enhancements:

  • Added a new SQL query generator agent in src/App/backend/agents/agent_factory.py to generate T-SQL queries based on user input and predefined rules.
  • Removed the greeting function from src/App/backend/plugins/chat_with_data_plugin.py as conversational responses are now handled by updated agent instructions.

These changes collectively improve deployment configurations, enhance user guidance, optimize infrastructure, and expand backend capabilities.

Does this introduce a breaking change?

  • Yes
  • No

Golden Path Validation

  • I have tested the primary workflows (the "golden path") to ensure they function correctly without errors.

Deployment Validation

  • I have validated the deployment process successfully and all services are running as expected with this change.

What to Check

Verify that the following are valid:

  • ...

Other Information

Vamshi-Microsoft and others added 19 commits July 15, 2025 10:24
* updated model capacity minimume to 200 (#602)

* docs: Added file and images (#577)

* Create exp.md

* Add files via upload

* Delete docs/images/re_use_log/exp.md

* Add files via upload

* Update CustomizingAzdParameters.md

* Update DeploymentGuide.md

add the section Reusing an Existing Log Analytics Workspace

* Update re-use-log-analytics.md

update the back link url and remove extra space

---------

Co-authored-by: Roopan-Microsoft <[email protected]>
Co-authored-by: Thanusree-Microsoft <[email protected]>

* docs:  Add and update links for CustomizingAzdParameters.md and re-use-log-analytics.md (#606)

* Update CustomizingAzdParameters.md

Added link

* Update re-use-log-analytics.md

Updated link

* fix: Increase retry attempts and improve error messaging for Azure Blob Storage uploads (#607)

---------

Co-authored-by: Priyanka-Microsoft <[email protected]>
Co-authored-by: Atulku-Microsoft <[email protected]>
Co-authored-by: Roopan-Microsoft <[email protected]>
Co-authored-by: Thanusree-Microsoft <[email protected]>
fix: Update agent instructions to restrict web search conversational inputs
feat: migrate model types to GlobalStandard
Contributor

Copilot AI left a comment


Pull Request Overview

This pull request introduces a major architectural refactoring that integrates the Agent SDK into the kernel, migrates to GlobalStandard model types for Azure OpenAI deployments, removes the greeting function from the ChatWithDataPlugin, and updates the orchestrator to handle conversational interactions through enhanced agent instructions.

Key changes include:

  • Migration from direct OpenAI calls to Agent SDK for SQL query generation
  • Infrastructure updates to use GlobalStandard SKU for text-embedding models
  • Removal of dedicated greeting functionality in favor of enhanced agent instructions

Reviewed Changes

Copilot reviewed 15 out of 18 changed files in this pull request and generated 3 comments.

Summary per file:

  • src/App/backend/plugins/chat_with_data_plugin.py: Removed the greeting function and migrated SQL query generation to the Agent SDK
  • src/App/backend/agents/agent_factory.py: Added a new SQL agent factory method with enhanced instructions for conversational handling
  • src/App/tests/backend/plugins/test_chat_with_data_plugin.py: Updated tests to reflect the Agent SDK migration and removed greeting tests
  • infra/deploy_ai_foundry.bicep: Changed the embedding model SKU from Standard to GlobalStandard
  • docs/re-use-log-analytics.md: Added new documentation for reusing existing Log Analytics workspaces
Comments suppressed due to low confidence (1)

src/App/tests/backend/plugins/test_chat_with_data_plugin.py:153

  • The test should verify that mock_get_connection was called to ensure the error handling path is properly tested when the agent succeeds but database connection fails.
        assert "Error retrieving SQL data" in result

Comment on lines +119 to +137
agent_instructions = config.SQL_SYSTEM_PROMPT or """
You are an expert assistant in generating T-SQL queries based on user questions.
Always use the following schema:
1. Table: Clients (ClientId, Client, Email, Occupation, MaritalStatus, Dependents)
2. Table: InvestmentGoals (ClientId, InvestmentGoal)
3. Table: Assets (ClientId, AssetDate, Investment, ROI, Revenue, AssetType)
4. Table: ClientSummaries (ClientId, ClientSummary)
5. Table: InvestmentGoalsDetails (ClientId, InvestmentGoal, TargetAmount, Contribution)
6. Table: Retirement (ClientId, StatusDate, RetirementGoalProgress, EducationGoalProgress)
7. Table: ClientMeetings (ClientId, ConversationId, Title, StartTime, EndTime, Advisor, ClientEmail)
Rules:
- Always filter by ClientId = <provided>
- Do not use client name for filtering
- Assets table contains snapshots by date; do not sum values across dates
- Use StartTime for time-based filtering (meetings)
- Only return the raw T-SQL query. No explanations or comments.
"""

Copilot AI Jul 18, 2025

[nitpick] The fallback instructions should be stored as a constant rather than embedded as a multi-line string. This improves maintainability and makes the instructions easier to update.

Suggested change
agent_instructions = config.SQL_SYSTEM_PROMPT or """
You are an expert assistant in generating T-SQL queries based on user questions.
Always use the following schema:
1. Table: Clients (ClientId, Client, Email, Occupation, MaritalStatus, Dependents)
2. Table: InvestmentGoals (ClientId, InvestmentGoal)
3. Table: Assets (ClientId, AssetDate, Investment, ROI, Revenue, AssetType)
4. Table: ClientSummaries (ClientId, ClientSummary)
5. Table: InvestmentGoalsDetails (ClientId, InvestmentGoal, TargetAmount, Contribution)
6. Table: Retirement (ClientId, StatusDate, RetirementGoalProgress, EducationGoalProgress)
7. Table: ClientMeetings (ClientId, ConversationId, Title, StartTime, EndTime, Advisor, ClientEmail)
Rules:
- Always filter by ClientId = <provided>
- Do not use client name for filtering
- Assets table contains snapshots by date; do not sum values across dates
- Use StartTime for time-based filtering (meetings)
- Only return the raw T-SQL query. No explanations or comments.
"""
agent_instructions = config.SQL_SYSTEM_PROMPT or SQL_AGENT_FALLBACK_PROMPT

name="SQLQueryGeneratorAgent",
)

cls._sql_agent = {"agent": agent, "client": project_client}
Copilot AI Jul 18, 2025

The singleton pattern implementation lacks proper cleanup. Consider adding a cleanup method similar to delete_all_agent_instance for the SQL agent to prevent resource leaks.

Comment on lines 124 to 176
az storage blob upload-batch --account-name "$storageAccount" --destination data/"$extractedFolder1" --source "$extractionPath1" --auth-mode login --pattern '*' --overwrite --output none
if [ $? -ne 0 ]; then
    maxRetries=5
    retries=$maxRetries
    attempt=1
    sleepTime=10
    while [ $retries -gt 0 ]; do
        echo "Error: Failed to upload files to Azure Blob Storage. Retrying upload...$attempt of $maxRetries in $sleepTime seconds"
        sleep $sleepTime
        az storage blob upload-batch --account-name "$storageAccount" --destination data/"$extractedFolder1" --source "$extractionPath1" --auth-mode login --pattern '*' --overwrite --output none
        if [ $? -eq 0 ]; then
            echo "Files uploaded successfully to Azure Blob Storage."
            break
        else
            ((retries--))
            ((attempt++))
            sleepTime=$((sleepTime * 2))
        fi
    done
    if [ $retries -eq 0 ]; then
        echo "Error: Failed to upload files after all retry attempts."
        exit 1
    fi
else
    echo "Files uploaded successfully to Azure Blob Storage."
fi

az storage blob upload-batch --account-name "$storageAccount" --destination data/"$extractedFolder2" --source "$extractionPath2" --auth-mode login --pattern '*' --overwrite --output none
if [ $? -ne 0 ]; then
    maxRetries=5
    retries=$maxRetries
    attempt=1
    sleepTime=10
    while [ $retries -gt 0 ]; do
        echo "Error: Failed to upload files to Azure Blob Storage. Retrying upload...$attempt of $maxRetries in $sleepTime seconds"
        sleep $sleepTime
        az storage blob upload-batch --account-name "$storageAccount" --destination data/"$extractedFolder2" --source "$extractionPath2" --auth-mode login --pattern '*' --overwrite --output none
        if [ $? -eq 0 ]; then
            echo "Files uploaded successfully to Azure Blob Storage."
            break
        else
            ((retries--))
            ((attempt++))
            sleepTime=$((sleepTime * 2))
        fi
    done
    if [ $retries -eq 0 ]; then
        echo "Error: Failed to upload files after all retry attempts."
        exit 1
    fi
else
    echo "Files uploaded successfully to Azure Blob Storage."
fi
Copilot AI Jul 18, 2025

The retry logic duplicates the same code block twice with only minor variations. Consider extracting this into a reusable function to reduce code duplication.

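One way to realize the reviewer's deduplication suggestion is to wrap the retry loop in a single function and pass the upload command as arguments. This is a sketch only: `upload_with_retry` is a hypothetical name, not code in the repository, and it assumes the az CLI is on the PATH.

```shell
# Hypothetical refactor of the duplicated retry logic into one reusable
# function. "$@" carries the full upload command, so both folder uploads
# can share it. Not part of the repository; illustrative only.
upload_with_retry() {
    maxRetries=5
    sleepTime=10
    attempt=1
    while [ "$attempt" -le "$maxRetries" ]; do
        if "$@"; then
            echo "Files uploaded successfully to Azure Blob Storage."
            return 0
        fi
        echo "Upload failed. Retry $attempt of $maxRetries in $sleepTime seconds..."
        sleep "$sleepTime"
        attempt=$((attempt + 1))
        sleepTime=$((sleepTime * 2))   # exponential backoff
    done
    echo "Error: Failed to upload files after all retry attempts."
    return 1
}

# Usage (account, destination, and source come from the surrounding script):
# upload_with_retry az storage blob upload-batch --account-name "$storageAccount" \
#     --destination data/"$extractedFolder1" --source "$extractionPath1" \
#     --auth-mode login --pattern '*' --overwrite --output none || exit 1
```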
@Avijit-Microsoft Avijit-Microsoft merged commit b684d25 into main Jul 22, 2025
15 checks passed
@github-actions

🎉 This PR is included in version 1.7.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀



5 participants