Commit 819ead9

Merge branch 'Azure:main' into main
2 parents e7f1be7 + 8d61c2f commit 819ead9

1,096 files changed: +347475 -243139 lines changed

.github/CODEOWNERS

Lines changed: 2 additions & 0 deletions
@@ -337,3 +337,5 @@
 /src/storage-discovery/ @shanefujs @calvinhzy
 
 /src/aks-agent/ @feiskyer @mainred @nilo19
+
+/src/migrate/ @saifaldin14

.github/workflows/TestTriggerExtensionRelease.yml

Lines changed: 5 additions & 5 deletions
@@ -13,7 +13,7 @@ permissions:
 
 jobs:
   build:
-    name: Trigger Extension Release Pipeline
+    name: Test Trigger Extension Release Pipeline
     runs-on: ubuntu-latest
     steps:
       - name: Harden Runner
@@ -26,7 +26,7 @@ jobs:
           client-id: ${{ secrets.ADO_SP_ClientID }}
           tenant-id: ${{ secrets.ADO_SP_TenantID }}
           allow-no-subscriptions: true
-      - name: Trigger ADO Pipeline and Wait for Completion
+      - name: Test Trigger ADO Pipeline and Wait for Completion
         uses: azure/cli@v2
         env:
           ado-org: ${{secrets.ADO_ORGANIZATION}}
@@ -89,7 +89,7 @@ jobs:
                 exit 1
               fi
 
-              # Wait 30 seconds before checking again
-              echo "Build still running... waiting 30 seconds"
-              sleep 30
+              # Wait 60 seconds before checking again
+              echo "Build still running... waiting 60 seconds"
+              sleep 60
             done

.github/workflows/TriggerExtensionRelease.yml

Lines changed: 59 additions & 2 deletions
@@ -26,7 +26,7 @@ jobs:
           client-id: ${{ secrets.ADO_SP_ClientID }}
           tenant-id: ${{ secrets.ADO_SP_TenantID }}
           allow-no-subscriptions: true
-      - name: Azure CLI
+      - name: Trigger ADO Pipeline and Wait for Completion
         uses: azure/cli@v2
         env:
           ado-org: ${{secrets.ADO_ORGANIZATION}}
@@ -35,4 +35,61 @@ jobs:
           commit-id: ${{ github.sha }}
         with:
           inlineScript: |
-            az pipelines build queue --definition-id ${{ env.ado-pipeline-id }} --organization ${{ env.ado-org }} --project ${{ env.ado-project }} --variables commit_id=${{ env.commit-id }}
+            # Trigger the pipeline and capture the build ID
+            echo "Triggering ADO pipeline..."
+            BUILD_RESULT=$(az pipelines build queue \
+              --definition-id ${{ env.ado-pipeline-id }} \
+              --organization ${{ env.ado-org }} \
+              --project ${{ env.ado-project }} \
+              --variables commit_id=${{ env.commit-id }} \
+              --output json)
+
+            BUILD_ID=$(echo $BUILD_RESULT | jq -r '.id')
+            echo "Pipeline triggered with Build ID: $BUILD_ID"
+
+            if [ "$BUILD_ID" = "null" ] || [ -z "$BUILD_ID" ]; then
+              echo "Failed to get build ID from pipeline trigger"
+              exit 1
+            fi
+
+            # Wait for the build to complete
+            echo "Waiting for build $BUILD_ID to complete..."
+            while true; do
+              BUILD_JSON=$(az pipelines build show \
+                --id $BUILD_ID \
+                --organization ${{ env.ado-org }} \
+                --project ${{ env.ado-project }} \
+                --output json)
+
+              BUILD_STATUS=$(echo "$BUILD_JSON" | jq -r '.status')
+              BUILD_RESULT_STATUS=$(echo "$BUILD_JSON" | jq -r '.result // "none"')
+
+              echo "Current status: $BUILD_STATUS, Result: $BUILD_RESULT_STATUS"
+
+              # Check if build is completed
+              if [ "$BUILD_STATUS" = "completed" ]; then
+                echo "Build completed with result: $BUILD_RESULT_STATUS"
+
+                # Check if the build was successful
+                if [ "$BUILD_RESULT_STATUS" = "succeeded" ]; then
+                  echo "✅ ADO pipeline build succeeded!"
+                  exit 0
+                elif [ "$BUILD_RESULT_STATUS" = "partiallySucceeded" ]; then
+                  echo "⚠️ ADO pipeline build partially succeeded"
+                  exit 1
+                else
+                  echo "❌ ADO pipeline build failed with result: $BUILD_RESULT_STATUS"
+                  exit 1
+                fi
+              fi
+
+              # Check for other terminal states
+              if [ "$BUILD_STATUS" = "cancelling" ] || [ "$BUILD_STATUS" = "cancelled" ]; then
+                echo "❌ ADO pipeline build was cancelled"
+                exit 1
+              fi
+
+              # Wait 60 seconds before checking again
+              echo "Build still running... waiting 60 seconds"
+              sleep 60
+            done
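
Review note (a sketch, not part of this commit): the status handling in the polling loop of TriggerExtensionRelease.yml could be factored into a small function that can be exercised without `az`, and the `while true` loop could be bounded so that a stuck build does not hold the runner indefinitely. The names `classify_build` and `poll_until_done`, the attempt limit, and the "status result" line format are all hypothetical; the fetch command is assumed to stand in for the `az pipelines build show` call and the two `jq` extractions in the script.

```shell
#!/bin/sh
# Sketch only: classify_build and poll_until_done are hypothetical helpers,
# not part of the committed workflow.

# Map a build's status/result pair to "success", "failure", or "wait",
# mirroring the branches of the inline script.
classify_build() {
  cb_status="$1"
  cb_result="$2"
  case "$cb_status" in
    completed)
      if [ "$cb_result" = "succeeded" ]; then
        echo "success"
      else
        # partiallySucceeded and every other result count as failure
        echo "failure"
      fi
      ;;
    cancelling|cancelled)
      echo "failure"
      ;;
    *)
      echo "wait"
      ;;
  esac
}

# poll_until_done FETCH_CMD MAX_ATTEMPTS INTERVAL_SECONDS
# FETCH_CMD (an assumed wrapper) must print "<status> <result>" on one line.
# Returns 0 on success, 1 on failure, 2 if the attempts run out.
poll_until_done() {
  fetch_cmd="$1"
  max_attempts="$2"
  interval="$3"
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    line=$("$fetch_cmd")
    status=${line%% *}   # text before the first space
    result=${line#* }    # text after the first space
    case "$(classify_build "$status" "$result")" in
      success) return 0 ;;
      failure) return 1 ;;
    esac
    sleep "$interval"
    attempt=$((attempt + 1))
  done
  return 2
}
```

With a 60-second interval as in the commit, a limit of 120 attempts would cap the wait at roughly two hours; the right limit depends on how long the ADO pipeline normally runs, which this diff does not state.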

scripts/ci/find_extension_upgraded.py

Lines changed: 0 additions & 141 deletions
This file was deleted.

src/aks-agent/HISTORY.rst

Lines changed: 26 additions & 0 deletions
@@ -11,6 +11,32 @@ To release a new version, please select a new version number (usually plus 1 to
 
 Pending
 +++++++
+* Fix stdin reading hang in CI/CD pipelines by using select with timeout for non-interactive mode.
+* Update pytest marker registration and fix datetime.utcnow() deprecation warning in tests.
+* Improve test framework with real-time stderr output visibility and subprocess timeout.
+
+1.0.0b7
++++++++
+* Bump aks-mcp to v0.0.10 - here are the notable changes:
+* Fix: Improved server health check endpoints /health for both HTTP and SSE connections for http, sse
+* Fix: enforce json output for az monitor metrics and aks tools
+* Fix: Build the resource URL with correct MCP endpoint path based on transport
+* Fix feedback slash command
+
+1.0.0b6
++++++++
+* Introduce the new `az aks agent-init` command for better cli interaction.
+* Separate llm configuration from main agent command for improved clarity and extensibility.
+
+1.0.0b5
++++++++
+* Bump holmesgpt to 0.15.0 - Enhanced AI debugging experience and bug fixes
+* Added TODO list feature to allows holmes to reliably answers questions it wasn't able to answer before due to early-stopping
+* Fixed mcp server http connection fails when using socks proxy by adding the missing socks dependency
+* Fixed gpt-5 temperature bug by upgrading litellm and dropping non-1 values for temperature
+* Improved the installation time by removing unnecessary dependencies and move test dependencies to dev dependency group
+* Added Feedback slash command Feature to allow users to provide feedback on their experience with the agent performance
+* Disable prometheus toolset loading by default to workaround the libbz2-dev missing issue in Azure CLI python environment.
 
 1.0.0b4
 +++++++

src/aks-agent/README.rst

Lines changed: 59 additions & 17 deletions
@@ -4,32 +4,41 @@ Azure CLI AKS Agent Extension
 Introduction
 ============
 
-The AKS Agent extension provides the "az aks agent" command, an AI-powered assistant that
-helps analyze and troubleshoot Azure Kubernetes Service (AKS) clusters using Large Language
-Models (LLMs). The agent combines cluster context, configurable toolsets, and LLMs to answer
-natural-language questions about your cluster (for example, "Why are my pods not starting?")
-and can investigate issues in both interactive and non-interactive (batch) modes.
+
+The AKS Agent extension provides the "az aks agent" command, an AI-powered assistant that helps analyze and troubleshoot Azure Kubernetes Service (AKS) clusters using Large Language Models (LLMs). The agent combines cluster context, configurable toolsets, and LLMs to answer natural-language questions about your cluster (for example, "Why are my pods not starting?") and can investigate issues in both interactive and non-interactive (batch) modes.
+
+New in this version: **az aks agent-init** command for easy LLM model configuration!
+
+You can now use `az aks agent-init` to interactively add and configure LLM models before asking questions. This command guides you through the setup process, allowing you to add multiple models as needed. When asking questions with `az aks agent`, you can:
+
+- Use `--config-file` to specify your own model configuration file
+- Use `--model` to select a previously configured model
+- If neither is provided, the last configured LLM will be used by default
+
+This makes it much easier to manage and switch between multiple models for your AKS troubleshooting workflows.
 
 Key capabilities
 ----------------
 
+
 - Interactive and non-interactive modes (use --no-interactive for batch runs).
-- Support for multiple LLM providers (Azure OpenAI, OpenAI, etc.) via environment variables.
-- Configurable via a JSON/YAML config file provided with --config-file.
+- Support for multiple LLM providers (Azure OpenAI, OpenAI, etc.) via interactive configuration.
+- **Easy model setup with `az aks agent-init`**: interactively add and configure LLM models, run multiple times to add more models.
+- Configurable via a JSON/YAML config file provided with --config-file, or select a model with --model.
+- If no config or model is specified, the last configured LLM is used automatically.
 - Control echo and tool output visibility with --no-echo-request and --show-tool-output.
 - Refresh the available toolsets with --refresh-toolsets.
 - Stay in traditional toolset mode by default, or opt in to aks-mcp integration with ``--aks-mcp`` when you need the enhanced capabilities.
 
 Prerequisites
 -------------
-
-Before using the agent, make sure provider-specific environment variables are set. For
-example, Azure OpenAI typically requires AZURE_API_BASE, AZURE_API_VERSION, and AZURE_API_KEY,
-while OpenAI requires OPENAI_API_KEY. For more details about supported providers and required
+No need to manually set environment variables! All model and credential information can be configured interactively using `az aks agent-init`.
+For more details about supported model providers and required
 variables, see: https://docs.litellm.ai/docs/providers
 
+
 Quick start and examples
-========================
+=========================
 
 Install the extension
 ---------------------
@@ -38,25 +47,58 @@ Install the extension
 
     az extension add --name aks-agent
 
-Run the agent (Azure OpenAI example)
+Configure LLM models interactively
+----------------------------------
+
+.. code-block:: bash
+
+    az aks agent-init
+
+This command will guide you through adding a new LLM model. You can run it multiple times to add more models or update existing models. All configured models are saved locally and can be selected when asking questions.
+
+Run the agent (Azure OpenAI example) :
 -----------------------------------
 
+**1. Use the last configured model (no extra parameters needed):**
+
 .. code-block:: bash
 
-    export AZURE_API_BASE="https://my-azureopenai-service.openai.azure.com/"
-    export AZURE_API_VERSION="2025-01-01-preview"
-    export AZURE_API_KEY="sk-xxx"
+    az aks agent "Why are my pods not starting?" --name MyManagedCluster --resource-group MyResourceGroup
+
+**2. Specify a particular model you have configured:**
+
+.. code-block:: bash
 
     az aks agent "Why are my pods not starting?" --name MyManagedCluster --resource-group MyResourceGroup --model azure/my-gpt4.1-deployment
 
+**3. Use a custom config file:**
+
+.. code-block:: bash
+
+    az aks agent "Why are my pods not starting?" --config-file /path/to/your/model_config.yaml
+
+
 Run the agent (OpenAI example)
 ------------------------------
 
+**1. Use the last configured model (no extra parameters needed):**
+
 .. code-block:: bash
 
-    export OPENAI_API_KEY="sk-xxx"
+    az aks agent "Why are my pods not starting?" --name MyManagedCluster --resource-group MyResourceGroup
+
+**2. Specify a particular model you have configured:**
+
+.. code-block:: bash
+
     az aks agent "Why are my pods not starting?" --name MyManagedCluster --resource-group MyResourceGroup --model gpt-4o
 
+**3. Use a custom config file:**
+
+.. code-block:: bash
+
+    az aks agent "Why are my pods not starting?" --config-file /path/to/your/model_config.yaml
+
 Run in non-interactive batch mode
 ---------------------------------
 