Commit 0f0fbd4

enable aks agent command
1 parent fb7fc37 commit 0f0fbd4

5 files changed, +171 -175 lines changed

src/aks-preview/HISTORY.rst

Lines changed: 5 additions & 0 deletions
@@ -13,6 +13,11 @@ Pending
 +++++++
 * Add framework for interactive AI-powered debugging tool.
 
+18.0.0b27
++++++++
+* Add interactive AI-powered debugging tool `az aks agent`.
+* Add framework for interactive AI-powered debugging tool.
+
 18.0.0b26
 +++++++
 * Add `az aks identity-binding` command group for identity binding feature.

src/aks-preview/azext_aks_preview/_help.py

Lines changed: 96 additions & 97 deletions
@@ -3944,100 +3944,99 @@
           short-summary: Name of the identity binding to show.
 """
 
-# pylint: disable=line-too-long
-# helps[
-#     "aks agent"
-# ] = """
-#     type: command
-#     short-summary: Run AI assistant to analyze and troubleshoot Kubernetes clusters.
-#     long-summary: |-
-#       This command allows you to ask questions about your Azure Kubernetes cluster and get answers using AI models.
-#       Environment variables must be set to use the AI model, please refer to https://docs.litellm.ai/docs/providers to learn more about supported AI providers and models and required environment variables.
-#     parameters:
-#         - name: --name -n
-#           type: string
-#           short-summary: Name of the managed cluster.
-#         - name: --resource-group -g
-#           type: string
-#           short-summary: Name of the resource group.
-#         - name: --model
-#           type: string
-#           short-summary: Model to use for the LLM.
-#         - name: --api-key
-#           type: string
-#           short-summary: API key to use for the LLM (if not given, uses environment variables AZURE_API_KEY, OPENAI_API_KEY).
-#         - name: --config-file
-#           type: string
-#           short-summary: Path to configuration file.
-#         - name: --max-steps
-#           type: int
-#           short-summary: Maximum number of steps the LLM can take to investigate the issue.
-#         - name: --no-interactive
-#           type: bool
-#           short-summary: Disable interactive mode. When set, the agent will not prompt for input and will run in batch mode.
-#         - name: --no-echo-request
-#           type: bool
-#           short-summary: Disable echoing back the question provided to AKS Agent in the output.
-#         - name: --show-tool-output
-#           type: bool
-#           short-summary: Show the output of each tool that was called during the analysis.
-#         - name: --refresh-toolsets
-#           type: bool
-#           short-summary: Refresh the toolsets status.
-#
-#     examples:
-#         - name: Ask about pod issues in the cluster with Azure OpenAI
-#           text: |-
-#             export AZURE_API_BASE="https://my-azureopenai-service.openai.azure.com/"
-#             export AZURE_API_VERSION="2025-01-01-preview"
-#             export AZURE_API_KEY="sk-xxx"
-#             az aks agent "Why are my pods not starting?" --name MyManagedCluster --resource-group MyResourceGroup --model azure/my-gpt4.1-deployment
-#         - name: Ask about pod issues in the cluster with OpenAI
-#           text: |-
-#             export OPENAI_API_KEY="sk-xxx"
-#             az aks agent "Why are my pods not starting?" --name MyManagedCluster --resource-group MyResourceGroup --model gpt-4o
-#           text: az aks agent "Why are my pods not starting?"
-#         - name: Run in interactive mode without a question
-#           text: az aks agent "Check the pod status in my cluster" --name MyManagedCluster --resource-group MyResourceGroup --model azure/my-gpt4.1-deployment --api-key "sk-xxx"
-#         - name: Run in non-interactive batch mode
-#           text: az aks agent "Diagnose networking issues" --no-interactive --max-steps 15 --model azure/my-gpt4.1-deployment
-#         - name: Show detailed tool output during analysis
-#           text: az aks agent "Why is my service workload unavailable in namespace workload-ns?" --show-tool-output --model azure/my-gpt4.1-deployment
-#         - name: Use custom configuration file
-#           text: az aks agent "Check kubernetes pod resource usage" --config-file /path/to/custom.config --model azure/my-gpt4.1-deployment
-#         - name: Run agent with no echo of the original question
-#           text: az aks agent "What is the status of my cluster?" --no-echo-request --model azure/my-gpt4.1-deployment
-#         - name: Refresh toolsets to get the latest available tools
-#           text: az aks agent "What is the status of my cluster?" --refresh-toolsets --model azure/my-gpt4.1-deploymen
-#         - name: Run agent with config file
-#           text: |
-#             az aks agent "Check kubernetes pod resource usage" --config-file /path/to/custom.config
-#             Here is an example of config file:
-#             ```json
-#             model: "gpt-4o"
-#             api_key: "..."
-#             # define a list of mcp servers, mcp server can be defined
-#             mcp_servers:
-#               aks_mcp:
-#                 description: "The AKS-MCP is a Model Context Protocol (MCP) server that enables AI assistants to interact with Azure Kubernetes Service (AKS) clusters"
-#                 url: "http://localhost:8003/sse"
-#
-#             # try adding your own tools or toggle the built-in toolsets here
-#             # e.g. query company-specific data, fetch logs from your existing observability tools, etc
-#             # To check how to add a customized toolset, please refer to https://docs.robusta.dev/master/configuration/holmesgpt/custom_toolsets.html#custom-toolsets
-#             # To find all built-in toolsets, please refer to https://docs.robusta.dev/master/configuration/holmesgpt/builtin_toolsets.html
-#             toolsets:
-#               # add a new json processor toolset
-#               json_processor:
-#                 description: "A toolset for processing JSON data using jq"
-#                 prerequisites:
-#                   - command: "jq --version"  # Ensure jq is installed
-#                 tools:
-#                   - name: "process_json"
-#                     description: "A tool that uses jq to process JSON input"
-#                     command: "echo '{{ json_input }}' | jq '.'"  # Example jq command to format JSON
-#               # disable a built-in toolsets
-#               aks/core:
-#                 enabled: false
-#             ```
-# """
+helps[
+    "aks agent"
+] = """
+    type: command
+    short-summary: Run AI assistant to analyze and troubleshoot Kubernetes clusters.
+    long-summary: |-
+      This command allows you to ask questions about your Azure Kubernetes cluster and get answers using AI models.
+      Environment variables must be set to use the AI model, please refer to https://docs.litellm.ai/docs/providers to learn more about supported AI providers and models and required environment variables.
+    parameters:
+        - name: --name -n
+          type: string
+          short-summary: Name of the managed cluster.
+        - name: --resource-group -g
+          type: string
+          short-summary: Name of the resource group.
+        - name: --model
+          type: string
+          short-summary: Model to use for the LLM.
+        - name: --api-key
+          type: string
+          short-summary: API key to use for the LLM (if not given, uses environment variables AZURE_API_KEY, OPENAI_API_KEY).
+        - name: --config-file
+          type: string
+          short-summary: Path to configuration file.
+        - name: --max-steps
+          type: int
+          short-summary: Maximum number of steps the LLM can take to investigate the issue.
+        - name: --no-interactive
+          type: bool
+          short-summary: Disable interactive mode. When set, the agent will not prompt for input and will run in batch mode.
+        - name: --no-echo-request
+          type: bool
+          short-summary: Disable echoing back the question provided to AKS Agent in the output.
+        - name: --show-tool-output
+          type: bool
+          short-summary: Show the output of each tool that was called during the analysis.
+        - name: --refresh-toolsets
+          type: bool
+          short-summary: Refresh the toolsets status.
+
+    examples:
+        - name: Ask about pod issues in the cluster with Azure OpenAI
+          text: |-
+            export AZURE_API_BASE="https://my-azureopenai-service.openai.azure.com/"
+            export AZURE_API_VERSION="2025-01-01-preview"
+            export AZURE_API_KEY="sk-xxx"
+            az aks agent "Why are my pods not starting?" --name MyManagedCluster --resource-group MyResourceGroup --model azure/my-gpt4.1-deployment
+        - name: Ask about pod issues in the cluster with OpenAI
+          text: |-
+            export OPENAI_API_KEY="sk-xxx"
+            az aks agent "Why are my pods not starting?" --name MyManagedCluster --resource-group MyResourceGroup --model gpt-4o
+          text: az aks agent "Why are my pods not starting?"
+        - name: Run in interactive mode without a question
+          text: az aks agent "Check the pod status in my cluster" --name MyManagedCluster --resource-group MyResourceGroup --model azure/my-gpt4.1-deployment --api-key "sk-xxx"
+        - name: Run in non-interactive batch mode
+          text: az aks agent "Diagnose networking issues" --no-interactive --max-steps 15 --model azure/my-gpt4.1-deployment
+        - name: Show detailed tool output during analysis
+          text: az aks agent "Why is my service workload unavailable in namespace workload-ns?" --show-tool-output --model azure/my-gpt4.1-deployment
+        - name: Use custom configuration file
+          text: az aks agent "Check kubernetes pod resource usage" --config-file /path/to/custom.config --model azure/my-gpt4.1-deployment
+        - name: Run agent with no echo of the original question
+          text: az aks agent "What is the status of my cluster?" --no-echo-request --model azure/my-gpt4.1-deployment
+        - name: Refresh toolsets to get the latest available tools
+          text: az aks agent "What is the status of my cluster?" --refresh-toolsets --model azure/my-gpt4.1-deploymen
+        - name: Run agent with config file
+          text: |
+            az aks agent "Check kubernetes pod resource usage" --config-file /path/to/custom.config
+            Here is an example of config file:
+            ```json
+            model: "gpt-4o"
+            api_key: "..."
+            # define a list of mcp servers, mcp server can be defined
+            mcp_servers:
+              aks_mcp:
+                description: "The AKS-MCP is a Model Context Protocol (MCP) server that enables AI assistants to interact with Azure Kubernetes Service (AKS) clusters"
+                url: "http://localhost:8003/sse"
+
+            # try adding your own tools or toggle the built-in toolsets here
+            # e.g. query company-specific data, fetch logs from your existing observability tools, etc
+            # To check how to add a customized toolset, please refer to https://docs.robusta.dev/master/configuration/holmesgpt/custom_toolsets.html#custom-toolsets
+            # To find all built-in toolsets, please refer to https://docs.robusta.dev/master/configuration/holmesgpt/builtin_toolsets.html
+            toolsets:
+              # add a new json processor toolset
+              json_processor:
+                description: "A toolset for processing JSON data using jq"
+                prerequisites:
+                  - command: "jq --version"  # Ensure jq is installed
+                tools:
+                  - name: "process_json"
+                    description: "A tool that uses jq to process JSON input"
+                    command: "echo '{{ json_input }}' | jq '.'"  # Example jq command to format JSON
+              # disable a built-in toolsets
+              aks/core:
+                enabled: false
+            ```
+"""
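
The `--api-key` help text above says that, when the flag is omitted, the key is taken from the `AZURE_API_KEY` or `OPENAI_API_KEY` environment variables. The commit does not show that lookup, so the following is only a minimal sketch of such a fallback. The helper name `resolve_api_key` and the order in which the two variables are checked are assumptions for illustration; in practice the lookup may be delegated to the underlying LLM library and depend on the selected provider.

```python
import os


def resolve_api_key(cli_value=None, environ=None):
    """Return the API key from the flag, else from the environment (hypothetical helper)."""
    env = os.environ if environ is None else environ
    # An explicit --api-key value wins over anything in the environment.
    if cli_value:
        return cli_value
    # Assumed order: the two variables named in the help text, as listed there.
    for var in ("AZURE_API_KEY", "OPENAI_API_KEY"):
        value = env.get(var)
        if value:
            return value
    return None


print(resolve_api_key(None, {"OPENAI_API_KEY": "sk-xxx"}))
```

Passing the environment as a plain mapping keeps the sketch easy to exercise without touching the real process environment.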

src/aks-preview/azext_aks_preview/_params.py

Lines changed: 65 additions & 66 deletions
@@ -23,7 +23,7 @@
     validate_nat_gateway_idle_timeout,
     validate_nat_gateway_managed_outbound_ip_count,
 )
-# from azure.cli.core.api import get_config_dir
+from azure.cli.core.api import get_config_dir
 from azure.cli.core.commands.parameters import (
     edge_zone_type,
     file_type,
@@ -224,7 +224,7 @@
     validate_max_blocked_nodes,
     validate_resource_group_parameter,
     validate_location_resource_group_cluster_parameters,
-    # validate_agent_config_file,
+    validate_agent_config_file,
 )
 from azext_aks_preview.azurecontainerstorage._consts import (
     CONST_ACSTOR_ALL,
@@ -2777,70 +2777,69 @@ def load_arguments(self, _):
                 action="store_true",
             )
 
-        # pylint: disable=line-too-long
-        # with self.argument_context("aks agent") as c:
-        #     c.positional(
-        #         "prompt",
-        #         help="Ask any question and answer using available tools.",
-        #     )
-        #     c.argument(
-        #         "resource_group_name",
-        #         options_list=["--resource-group", "-g"],
-        #         help="Name of resource group.",
-        #         required=False,
-        #     )
-        #     c.argument(
-        #         "name",
-        #         options_list=["--name", "-n"],
-        #         help="Name of the managed cluster.",
-        #         required=False,
-        #     )
-        #     c.argument(
-        #         "max_steps",
-        #         type=int,
-        #         default=10,
-        #         required=False,
-        #         help="Maximum number of steps the LLM can take to investigate the issue.",
-        #     )
-        #     c.argument(
-        #         "config_file",
-        #         default=os.path.join(get_config_dir(), "aksAgent.config"),
-        #         validator=validate_agent_config_file,
-        #         required=False,
-        #         help="Path to the config file.",
-        #     )
-        #     c.argument(
-        #         "model",
-        #         help="The model to use for the LLM.",
-        #         required=False,
-        #         type=str,
-        #     )
-        #     c.argument(
-        #         "api-key",
-        #         help="API key to use for the LLM (if not given, uses environment variables AZURE_API_KEY, OPENAI_API_KEY)",
-        #         required=False,
-        #         type=str,
-        #     )
-        #     c.argument(
-        #         "no_interactive",
-        #         help="Disable interactive mode. When set, the agent will not prompt for input and will run in batch mode.",
-        #         action="store_true",
-        #     )
-        #     c.argument(
-        #         "no_echo_request",
-        #         help="Disable echoing back the question provided to AKS Agent in the output.",
-        #         action="store_true",
-        #     )
-        #     c.argument(
-        #         "show_tool_output",
-        #         help="Show the output of each tool that was called.",
-        #         action="store_true",
-        #     )
-        #     c.argument(
-        #         "refresh_toolsets",
-        #         help="Refresh the toolsets status.",
-        #         action="store_true",
-        #     )
+        with self.argument_context("aks agent") as c:
+            c.positional(
+                "prompt",
+                help="Ask any question and answer using available tools.",
+            )
+            c.argument(
+                "resource_group_name",
+                options_list=["--resource-group", "-g"],
+                help="Name of resource group.",
+                required=False,
+            )
+            c.argument(
+                "name",
+                options_list=["--name", "-n"],
+                help="Name of the managed cluster.",
+                required=False,
+            )
+            c.argument(
+                "max_steps",
+                type=int,
+                default=10,
+                required=False,
+                help="Maximum number of steps the LLM can take to investigate the issue.",
+            )
+            c.argument(
+                "config_file",
+                default=os.path.join(get_config_dir(), "aksAgent.config"),
+                validator=validate_agent_config_file,
+                required=False,
+                help="Path to the config file.",
+            )
+            c.argument(
+                "model",
+                help="The model to use for the LLM.",
+                required=False,
+                type=str,
+            )
+            c.argument(
+                "api-key",
+                help="API key to use for the LLM (if not given, uses environment variables AZURE_API_KEY, OPENAI_API_KEY)",
+                required=False,
+                type=str,
+            )
+            c.argument(
+                "no_interactive",
+                help="Disable interactive mode. When set, the agent will not prompt for input and will run in batch mode.",
+                action="store_true",
+            )
+            c.argument(
+                "no_echo_request",
+                help="Disable echoing back the question provided to AKS Agent in the output.",
+                action="store_true",
+            )
+            c.argument(
+                "show_tool_output",
+                help="Show the output of each tool that was called.",
+                action="store_true",
+            )
+            c.argument(
+                "refresh_toolsets",
+                help="Refresh the toolsets status.",
+                action="store_true",
+            )
 
 
 def _get_default_install_location(exe_name):
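
To make the effect of the re-enabled `aks agent` argument block easier to see, here is a rough stand-in built on the standard library's `argparse` rather than the Azure CLI argument loader. It mirrors the flags, the `max_steps` default of 10, and the `aksAgent.config` default file name from the hunk above. The function name `build_agent_parser`, the `config_dir` parameter (standing in for `get_config_dir()`), and treating `prompt` as optional are assumptions for illustration only.

```python
import argparse
import os


def build_agent_parser(config_dir):
    """Approximate the `aks agent` arguments with argparse (illustrative only)."""
    parser = argparse.ArgumentParser(prog="az aks agent")
    # Assumption: the positional prompt may be omitted to start an interactive session.
    parser.add_argument("prompt", nargs="?", help="Question to ask the agent.")
    parser.add_argument("--resource-group", "-g", dest="resource_group_name")
    parser.add_argument("--name", "-n")
    parser.add_argument("--max-steps", type=int, default=10)
    parser.add_argument(
        "--config-file", default=os.path.join(config_dir, "aksAgent.config")
    )
    parser.add_argument("--model")
    parser.add_argument("--api-key")
    # The four boolean switches are plain store_true flags, as in the diff.
    for flag in (
        "--no-interactive",
        "--no-echo-request",
        "--show-tool-output",
        "--refresh-toolsets",
    ):
        parser.add_argument(flag, action="store_true")
    return parser


args = build_agent_parser("/tmp/azcfg").parse_args(
    ["Diagnose networking issues", "--no-interactive", "--max-steps", "15"]
)
print(args.prompt, args.max_steps, args.no_interactive, args.config_file)
```

The sketch leaves out the `validate_agent_config_file` validator, whose behavior is not shown in this commit.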

src/aks-preview/azext_aks_preview/agent/prompt.py

Lines changed: 4 additions & 11 deletions
@@ -37,17 +37,10 @@
 1. **IMMEDIATELY STOP ALL OPERATIONS** - Do not proceed with any investigation
 2. **DO NOT ATTEMPT ANY TROUBLESHOOTING** - No kubectl commands, no Azure commands, nothing
 3. **DO NOT INFER THE RESOURCE NAME** - Do not assume any resource name, resource group, or subscription ID
-4. **ONLY display the context failure message** on separate lines:
-   ```
-   Cluster name: <detected_or_not_found>
-   Resource group: <detected_or_not_found>
-   Subscription ID: <detected_or_not_found>
-
-   Please provide the correct cluster context. You can either:
-   1. Specify the context in this session: "Please use cluster 'my-cluster' in resource group 'my-rg' under subscription 'my-subscription'"
-   2. Or restart with context: `az aks agent --name <cluster-name> --resource-group <rg-name> --subscription <subscription-id>`
-   ```
-   **IMPORTANT**: When displaying the CLI command example above, use it EXACTLY as written with the placeholder format `<cluster-name>`, `<rg-name>`, `<subscription-id>`.
+4. **ONLY display the context failure message** exactly as follows with no extra blank lines (replace the first three placeholders with actual detected values or None):
+   - list "Cluster name", "Resource group", "Subscription ID" with detected value or None
+   - prompt the user to either provide the cluster context in the prompt, including "Cluster name", "Resource group" and "Subscription ID", or
+   - restart the command specifying the cluster info in flags with examples (e.g., --name <cluster_name> --resource-group <resource_group> --subscription <subscription_id>)
 
 {% endif %}
 
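
The revised prompt describes the context failure message in words instead of giving a literal template. In the commit the message is produced by the model following those instructions, not by code, so the following is only a sketch of the output shape they describe. The function name `format_context_failure` and the exact sentence wording are assumptions.

```python
def format_context_failure(cluster_name=None, resource_group=None, subscription_id=None):
    """Build a context failure message with no blank lines (hypothetical helper)."""
    lines = [
        # Each field shows the detected value, or None when nothing was detected.
        f"Cluster name: {cluster_name}",
        f"Resource group: {resource_group}",
        f"Subscription ID: {subscription_id}",
        "Provide the cluster name, resource group and subscription ID in your prompt, or",
        "restart the command with flags, e.g. --name <cluster_name> "
        "--resource-group <resource_group> --subscription <subscription_id>",
    ]
    return "\n".join(lines)


print(format_context_failure(cluster_name="my-cluster"))
```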

src/aks-preview/azext_aks_preview/commands.py

Lines changed: 1 addition & 1 deletion
@@ -188,7 +188,7 @@ def load_command_table(self, _):
                 "operation-abort", "aks_operation_abort", supports_no_wait=True
             )
             g.custom_command("bastion", "aks_bastion")
-            # g.custom_command("agent", "aks_agent")
+            g.custom_command("agent", "aks_agent")
 
         # AKS maintenance configuration commands
         with self.command_group(
