
Commit 8301a6a

Merge branch 'main' into arm-template-update
2 parents 2905bb9 + 2a9ff9e commit 8301a6a

113 files changed: +1317 −513 lines changed


AKS-Arc/TOC.yml

Lines changed: 15 additions & 8 deletions
```diff
@@ -151,24 +151,31 @@
   href: aks-troubleshoot.md
 - name: Control plane configuration validation errors
   href: control-plane-validation-errors.md
-- name: Connectivity issues with MetalLB
-  href: load-balancer-issues.md
 - name: K8sVersionValidation error
   href: cluster-k8s-version.md
 - name: Use diagnostic checker
   href: aks-arc-diagnostic-checker.md
 - name: KubeAPIServer unreachable error
   href: kube-api-server-unreachable.md
-- name: Can't see VM SKUs on Azure portal
-  href: check-vm-sku.md
-- name: Deleted AKS Arc cluster still visible on Azure portal
-  href: deleted-cluster-visible.md
+- name: Can't create/scale AKS cluster due to image issues
+  href: gallery-image-not-usable.md
+- name: Disk space exhaustion on control plane VMs
+  href: kube-apiserver-log-overflow.md
+- name: Telemetry pod consumes too much memory and CPU
+  href: telemetry-pod-resources.md
+- name: Issues after deleting storage volumes
+  href: delete-storage-volume.md
 - name: Can't fully delete AKS Arc cluster with PodDisruptionBudget (PDB) resources
   href: delete-cluster-pdb.md
 - name: Azure Advisor upgrade recommendation
   href: azure-advisor-upgrade.md
-- name: Issues after deleting storage volumes
-  href: delete-storage-volume.md
+- name: Deleted AKS Arc cluster still visible on Azure portal
+  href: deleted-cluster-visible.md
+- name: Can't see VM SKUs on Azure portal
+  href: check-vm-sku.md
+- name: Connectivity issues with MetalLB
+  href: load-balancer-issues.md
+
 - name: Reference
   items:
   - name: Azure CLI
```

AKS-Arc/aks-create-clusters-cli.md

Lines changed: 2 additions & 2 deletions
````diff
@@ -4,7 +4,7 @@ description: Learn how to create Kubernetes clusters in Azure Local using Azure
 ms.topic: how-to
 ms.custom: devx-track-azurecli
 author: sethmanheim
-ms.date: 02/18/2025
+ms.date: 03/31/2025
 ms.author: sethm
 ms.lastreviewed: 01/25/2024
 ms.reviewer: guanghu
@@ -44,7 +44,7 @@ az extension add -n connectedk8s --upgrade

 ## Create a Kubernetes cluster

-Use the `az aksarc create` command to create a Kubernetes cluster in AKS Arc. Make sure you sign in to Azure before running this command. If you have multiple Azure subscriptions, select the appropriate subscription ID using the [az account set](/cli/azure/account#az-account-set) command.
+Use the [`az aksarc create`](/cli/azure/aksarc#az-aksarc-create) command to create a Kubernetes cluster in AKS Arc. Make sure you sign in to Azure before you run this command. If you have multiple Azure subscriptions, select the appropriate subscription ID using the [`az account set`](/cli/azure/account#az-account-set) command. With `az aksarc create`, we recommend that you use the `--validate` flag, which validates the input parameters you intend to use. Once the parameters are validated, run `az aksarc create` without the `--validate` flag to create the Kubernetes cluster.

 ```azurecli
 az aksarc create -n $aksclustername -g $resource_group --custom-location $customlocationID --vnet-ids $logicnetId --aad-admin-group-object-ids $aadgroupID --generate-ssh-keys
````
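The validate-then-create flow recommended above can be wrapped in a small helper. This is an editorial sketch, not part of the commit: the `validated_create` function name is illustrative, and the flags mirror the `az aksarc create` example in the diff.

```shell
# Sketch of the recommended flow: run `az aksarc create` with --validate to
# check the input parameters, then run the same command again without the
# flag to actually create the cluster. The function name is illustrative.
validated_create() {
    # First pass: validate only; no cluster is created.
    az aksarc create "$@" --validate || return 1
    # Second pass: validation succeeded, so create the cluster.
    az aksarc create "$@"
}
```

For example: `validated_create -n $aksclustername -g $resource_group --custom-location $customlocationID --vnet-ids $logicnetId --aad-admin-group-object-ids $aadgroupID --generate-ssh-keys`.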

AKS-Arc/aks-edge-howto-scale-out.md

Lines changed: 6 additions & 0 deletions
```diff
@@ -15,6 +15,12 @@ Now that AKS Edge Essentials is installed on your primary machine, this article
 > [!CAUTION]
 > Scaling to additional nodes is an experimental feature.

+> [!IMPORTANT]
+> AKS Edge Essentials multi-machine deployment is currently in PREVIEW.
+> See the [Supplemental Terms of Use for Microsoft Azure Previews](https://azure.microsoft.com/support/legal/preview-supplemental-terms/) for legal terms that apply to Azure features that are in beta, preview, or otherwise not yet released into general availability.
+> Azure Kubernetes Service Edge Essentials previews are partially covered by customer support on a best-effort basis.
+
+
 ## Prerequisites

 - Set up your [scalable Kubernetes](aks-edge-howto-multi-node-deployment.md) cluster.
```

AKS-Arc/aks-networks.md

Lines changed: 2 additions & 2 deletions
```diff
@@ -3,7 +3,7 @@ title: Create logical networks for Kubernetes clusters on Azure Local, version 2
 description: Learn how to create Arc-enabled logical networks for AKS.
 ms.topic: how-to
 author: sethmanheim
-ms.date: 03/21/2025
+ms.date: 04/01/2025
 ms.author: sethm
 ms.lastreviewed: 04/01/2024
 ms.reviewer: abha
@@ -80,7 +80,7 @@ For static IP, the required parameters are as follows:
 | `--ip-allocation-method` | The IP address allocation method. Supported values are `Static`. Usage: `--ip-allocation-method "Static"`. |
 | `--ip-pool-start` | The start IP address of your IP pool. The address must be in range of the address prefix. Usage: `--ip-pool-start "10.220.32.18"`. |
 | `--ip-pool-end` | The end IP address of your IP pool. The address must be in range of the address prefix. Usage: `--ip-pool-end "10.220.32.38"`. |
-| `--vlan` | The VLAN ID. Usage: `--vlan 10`. This parameter is required, otherwise the default value of 0 results in an AKS Arc cluster creation failure. |
+| `--vlan` | The VLAN ID. Usage: `--vlan 10`. This parameter is optional. Specifies the VLAN ID (an int32 value) to use when creating the logical network. |

 # [Azure portal](#tab/azureportal)
```
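Put together, a static-IP logical network create call could look like the following sketch. This is not part of the commit: the `az stack-hci-vm network lnet create` command name and the `--name`, `--resource-group`, `--custom-location`, `--vm-switch-name`, and `--address-prefixes` flags are assumptions from the surrounding Azure Local docs; only the four parameters in the table above are confirmed by this change.

```shell
# Hypothetical helper that assembles a static-IP logical network create call
# from the parameters in the table. The command name and the flags other than
# --ip-allocation-method/--ip-pool-start/--ip-pool-end/--vlan are assumptions.
create_static_lnet() {
    name="$1"; pool_start="$2"; pool_end="$3"; vlan="$4"
    az stack-hci-vm network lnet create \
        --name "$name" \
        --resource-group "$resource_group" \
        --custom-location "$custom_location_id" \
        --vm-switch-name "$vm_switch_name" \
        --address-prefixes "$address_prefix" \
        --ip-allocation-method "Static" \
        --ip-pool-start "$pool_start" \
        --ip-pool-end "$pool_end" \
        --vlan "$vlan"
}
```

Usage (with the environment variables set beforehand): `create_static_lnet mylnet 10.220.32.18 10.220.32.38 10`.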

AKS-Arc/aks-troubleshoot.md

Lines changed: 26 additions & 15 deletions
```diff
@@ -3,10 +3,10 @@ title: Troubleshoot common issues in AKS enabled by Azure Arc
 description: Learn about common issues and workarounds in AKS enabled by Arc.
 ms.topic: how-to
 author: sethmanheim
-ms.date: 02/28/2025
+ms.date: 04/01/2025
 ms.author: sethm
-ms.lastreviewed: 02/27/2024
-ms.reviewer: guanghu
+ms.lastreviewed: 04/01/2025
+ms.reviewer: abha

 ---

@@ -20,18 +20,29 @@ To open a support request, see the [Get support](/azure/aks/hybrid/help-support)

 ## Known issues

-The following sections describe known issues and workarounds for AKS enabled by Azure Arc:
-
-- [Control plane configuration validation errors](control-plane-validation-errors.md)
-- [Connectivity issues with MetalLB](load-balancer-issues.md)
-- [K8sVersionValidation error](cluster-k8s-version.md)
-- [Use diagnostic checker](aks-arc-diagnostic-checker.md)
-- [KubeAPIServer unreachable error](kube-api-server-unreachable.md)
-- [Can't see VM SKUs on Azure portal](check-vm-sku.md)
-- [Deleted AKS Arc cluster still visible on Azure portal](deleted-cluster-visible.md)
-- [Can't fully delete AKS Arc cluster with PodDisruptionBudget (PDB) resources](delete-cluster-pdb.md)
-- [Azure Advisor upgrade recommendation message](azure-advisor-upgrade.md)
-- [Issues after deleting storage volume](delete-storage-volume.md)
+The following sections describe known issues for AKS enabled by Azure Arc:
+
+| AKS Arc CRUD operation | Issue | Fix status |
+|------------------------|-------|------------|
+| AKS cluster create | [Can't create AKS cluster or scale node pool because of issues with AKS Arc images](gallery-image-not-usable.md) | Partially fixed in 2503 release |
+| AKS steady state | [AKS Arc telemetry pod consumes too much memory and CPU](telemetry-pod-resources.md) | Active |
+| AKS steady state | [Disk space exhaustion on control plane VMs due to accumulation of kube-apiserver audit logs](kube-apiserver-log-overflow.md) | Active |
+| AKS cluster delete | [Deleted AKS Arc cluster still visible on Azure portal](deleted-cluster-visible.md) | Active |
+| AKS cluster delete | [Can't fully delete AKS Arc cluster with PodDisruptionBudget (PDB) resources](delete-cluster-pdb.md) | Fixed in 2503 release |
+| Azure portal | [Can't see VM SKUs on Azure portal](check-vm-sku.md) | Fixed in 2411 release |
+| MetalLB Arc extension | [Connectivity issues with MetalLB](load-balancer-issues.md) | Fixed in 2411 release |
+
+## Guides to diagnose and troubleshoot Kubernetes CRUD failures
+
+| AKS Arc operation | Issue |
+|-------------------|-------|
+| Create validation | [Control plane configuration validation errors](control-plane-validation-errors.md) |
+| Create validation | [K8sVersionValidation error](cluster-k8s-version.md) |
+| Create validation | [KubeAPIServer unreachable error](kube-api-server-unreachable.md) |
+| Network configuration issues | [Use diagnostic checker](aks-arc-diagnostic-checker.md) |
+| Kubernetes steady state | [Resolve issues due to out-of-band deletion of storage volumes](delete-storage-volume.md) |
+| Release validation | [Azure Advisor upgrade recommendation message](azure-advisor-upgrade.md) |

 ## Next steps
```

AKS-Arc/aks-whats-new-23h2.md

Lines changed: 19 additions & 2 deletions
```diff
@@ -2,11 +2,11 @@
 title: What's new in AKS on Azure Local, version 23H2
 description: Learn about what's new in AKS on Azure Local, version 23H2.
 ms.topic: overview
-ms.date: 11/19/2024
+ms.date: 04/01/2025
 author: sethmanheim
 ms.author: sethm
 ms.reviewer: guanghu
-ms.lastreviewed: 06/25/2024
+ms.lastreviewed: 04/01/2025

 ---

@@ -42,6 +42,23 @@ By integrating these components, Azure Arc offers a unified and efficient Kubern

 This section lists the new features and improvements in AKS Arc in each release of Azure Local, version 23H2.

+### Release 2503
+
+The following Kubernetes cluster deployment and management capabilities are available:
+
+- **Large VM SKUs for Kubernetes nodepools**: Added two new VM SKUs, `Standard_D32s_v3` (32 vCPU, 128 GiB) and `Standard_D16s_v3` (16 vCPU, 64 GiB), to support larger nodepools on an AKS cluster. For more information about supported VM sizes, see [supported scale options](scale-requirements.md).
+- **Improved log collection experience**: Improved log collection for AKS control plane node VMs and nodepool VMs, with support for passing multiple IP addresses and an SSH key or directory path. For more information, see [on-demand log collection](get-on-demand-logs.md) and the [az aksarc get-logs CLI](/cli/azure/aksarc#az-aksarc-get-logs).
+- **Improved diagnosability**: The [diagnostic checker tool](aks-arc-diagnostic-checker.md) now runs automatically when Kubernetes cluster creation fails, and includes new test cases.
+- **Improved Kubernetes cluster delete**: Fixed deletion issues; for example, failures due to [pod disruption budgets](delete-cluster-pdb.md?tabs=aks-on-azure-local).
+- **Improved AKS Arc image download**: Fixed issues with AKS Arc image downloads.
+- **Improved GPU support**: Improved error handling for Kubernetes cluster creation with GPU-enabled nodepools. Fixed known issues with attaching persistent volumes on GPU-enabled nodepools.
+
+To get started with these features in the 2503 release, make sure to update your [AKSArc CLI extension](/cli/azure/aksarc) to version 1.5.37 or higher.
+
+#### Supported Kubernetes versions for 2503
+
+The Kubernetes versions supported in the 2503 release are 1.28.12, 1.28.14, 1.29.7, 1.29.9, 1.30.3, and 1.30.4.
+
 ### Release 2411

 The following Kubernetes cluster deployment and management capabilities are available:
```

AKS-Arc/delete-cluster-pdb.md

Lines changed: 10 additions & 2 deletions
```diff
@@ -4,7 +4,7 @@ description: Learn how to troubleshoot when deleted workload cluster resources c
 ms.topic: troubleshooting
 author: sethmanheim
 ms.author: sethm
-ms.date: 12/12/2024
+ms.date: 04/01/2025
 ms.reviewer: leslielin

 ---
@@ -15,7 +15,15 @@ ms.reviewer: leslielin

 When you delete an AKS Arc cluster that has [PodDisruptionBudget](https://kubernetes.io/docs/tasks/run-application/configure-pdb/) (PDB) resources, the deletion might fail to remove the PDB resources. By default, PDB is installed in the workload identity-enabled AKS Arc cluster.

-## Workaround
+## Mitigation
+
+This issue was fixed in [AKS on Azure Local, version 2503](aks-whats-new-23h2.md#release-2503).
+
+- **For deleting an AKS cluster** with a PodDisruptionBudget: If you're on an older build, update to Azure Local, version 2503, then retry deleting the AKS cluster. File a support case if you're on the 2503 release and your AKS cluster still isn't deleted after at least one retry.
+- **For deleting a nodepool** with a PodDisruptionBudget: By design, the nodepool isn't deleted if a PodDisruptionBudget exists, to protect applications. Use the following workaround to delete the PDB resources, then retry deleting the nodepool.
+
+## Workaround for AKS Edge Essentials and older versions of AKS on Azure Local

 Before you delete the AKS Arc cluster, access the AKS Arc cluster's **kubeconfig** and delete all PDBs:
```
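The "delete all PDBs" step described above can be done in a single `kubectl` call. A minimal editorial sketch, not part of the commit (the helper name is illustrative; `kubectl delete pdb --all --all-namespaces` is standard kubectl):

```shell
# Delete every PodDisruptionBudget in every namespace of the AKS Arc cluster,
# using the cluster's kubeconfig, before retrying the cluster or nodepool
# delete. The function name is illustrative.
delete_all_pdbs() {
    kubeconfig="$1"
    kubectl --kubeconfig "$kubeconfig" delete pdb --all --all-namespaces
}
```

Usage: `delete_all_pdbs ./kubeconfig`, then retry the delete operation.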

AKS-Arc/gallery-image-not-usable.md

Lines changed: 54 additions & 0 deletions
````diff
@@ -0,0 +1,54 @@
+---
+title: Kubernetes cluster create or nodepool scale failing due to AKS Arc image issues
+description: Learn about a known issue with Kubernetes cluster create or nodepool scale failing due to AKS Arc VHD image download issues.
+ms.topic: troubleshooting
+author: sethmanheim
+ms.author: sethm
+ms.date: 04/01/2025
+ms.reviewer: abha
+
+---
+
+# Can't create AKS cluster or scale node pool because of issues with AKS Arc images
+
+[!INCLUDE [hci-applies-to-23h2](includes/hci-applies-to-23h2.md)]
+
+## Symptoms
+
+You see the following error when you try to create the AKS cluster:
+
+```output
+Kubernetes version 1.29.4 is not ready for use on Linux. Please go to https://aka.ms/aksarccheckk8sversions for details of how to check the readiness of Kubernetes versions.
+```
+
+You might also see the following error when you try to scale a nodepool:
+
+```output
+error with code NodepoolPrecheckFailed occured: AksHci nodepool creation precheck failed. Detailed message: 1 error occurred:\n\t* rpc error: code = Unknown desc = GalleryImage not usable, health state degraded: Degraded
+```
+
+When you run `az aksarc get-versions`, you see errors such as:
+
+```output
+...
+              {
+                "errorMessage": "failed cloud-side provisioning image linux-cblmariner-0.4.1.11203 to cloud gallery: {\n  \"code\": \"ImageProvisionError\",\n  \"message\": \"force failed to deprovision existing gallery image: failed to delete gallery image linux-cblmariner-0.4.1.11203: rpc error: code = Unknown desc = sa659p1012: rpc error: code = Unavailable desc = connection error: desc = \\\"transport: Error while dialing: dial tcp 10.202.244.4:45000: connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.\\\"\",\n  \"additionalInfo\": [\n   {\n    \"type\": \"providerImageProvisionInfo\",\n    \"info\": {\n     \"ProviderDownload\": \"True\"\n    }\n   }\n  ],\n  \"category\": \"\"\n }",
+                "osSku": "CBLMariner",
+                "osType": "Linux",
+                "ready": false
+              },
+...
+```
+
+## Mitigation
+
+This issue was fixed in [AKS on Azure Local, version 2503](aks-whats-new-23h2.md#release-2503).
+
+- Upgrade your Azure Local deployment to the 2503 build.
+- Once updated, confirm that the images downloaded successfully by running the `az aksarc get-versions` command.
+- For new AKS clusters: cluster creation should now succeed.
+- For scaling existing AKS clusters: scaling can still encounter issues. File a support case if it does.
+
+## Next steps
+
+[Known issues in AKS enabled by Azure Arc](aks-known-issues.md)
````
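The post-upgrade check in the mitigation above can be scripted by filtering the `az aksarc get-versions` output for entries that aren't ready. An editorial sketch, not part of the commit (the `images_ready` name is illustrative; the `"ready": false` field comes from the sample output in the article):

```shell
# Read `az aksarc get-versions` JSON output on stdin and report whether any
# image entry is marked "ready": false. Returns nonzero when an image is not
# ready. The function name is illustrative.
images_ready() {
    if grep -q '"ready": false'; then
        echo "some AKS Arc images are not ready"
        return 1
    fi
    echo "all AKS Arc images are ready"
}
```

Usage: pipe the output of `az aksarc get-versions` into `images_ready`.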

AKS-Arc/get-on-demand-logs.md

Lines changed: 2 additions & 2 deletions
````diff
@@ -28,13 +28,13 @@ Before log collection, you must have the SSH key you obtained when you created t
 You can collect logs using IPs or the `kubeconfig` parameter. If an IP is used, it collects the log from a particular node. If `kubeconfig` is used, it collects logs from all cluster nodes. This command generates a .zip file on the local disk. For other parameters, see the [Az CLI reference](/cli/azure/aksarc/logs#az-aksarc-logs-hci).

 ```azurecli
-az aksarc logs hci --ip 192.168.200.25 --credentials-dir ./.ssh --out-dir ./logs
+az aksarc get-logs --ip 192.168.200.25 --credentials-dir ./.ssh --out-dir ./logs
 ```

 Or

 ```azurecli
-az aksarc logs hci --kubeconfig ./.kube/config --credentials-dir ./.ssh --out-dir ./logs
+az aksarc get-logs --kubeconfig ./.kube/config --credentials-dir ./.ssh --out-dir ./logs
 ```

 ## Send logs to Microsoft Support
````
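The release notes elsewhere in this commit mention support for passing multiple IP addresses to log collection. A loop over `az aksarc get-logs` gets a similar effect; an editorial sketch, not part of the commit (the `collect_node_logs` name and the per-IP output directories are illustrative; the flags mirror the examples above):

```shell
# Collect logs from several nodes by invoking `az aksarc get-logs` once per
# node IP, writing each node's logs to its own subdirectory. The function
# name and directory layout are illustrative.
collect_node_logs() {
    out_dir="$1"; shift
    for ip in "$@"; do
        az aksarc get-logs --ip "$ip" --credentials-dir ./.ssh --out-dir "$out_dir/$ip"
    done
}
```

Usage: `collect_node_logs ./logs 192.168.200.25 192.168.200.26`.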
AKS-Arc/kube-apiserver-log-overflow.md

Lines changed: 73 additions & 0 deletions

````diff
@@ -0,0 +1,73 @@
+---
+title: Disk space exhaustion on the control plane VMs due to accumulation of kube-apiserver audit logs
+description: Learn about a known issue with disk space exhaustion on the control plane VMs due to accumulation of kube-apiserver audit logs.
+ms.topic: troubleshooting
+author: sethmanheim
+ms.author: sethm
+ms.date: 04/01/2025
+ms.reviewer: abha
+
+---
+
+# Disk space exhaustion on control plane VMs due to accumulation of kube-apiserver audit logs
+
+[!INCLUDE [hci-applies-to-23h2](includes/hci-applies-to-23h2.md)]
+
+## Symptoms
+
+If `kubectl` commands fail, you might see errors such as:
+
+```output
+kubectl get ns
+Error from server (InternalError): an error on the server ("Internal Server Error: \"/api/v1/namespaces?limit=500\": unknown") has prevented the request from succeeding (get namespaces)
+```
+
+When you SSH into the control plane VM, you might notice that it ran out of disk space, specifically on the **/dev/sda2** partition. This is due to the accumulation of kube-apiserver audit logs in the **/var/log/kube-apiserver** directory, which can consume approximately 90 GB of disk space:
+
+```output
+clouduser@moc-laiwyj6tly6 [ /var/log/kube-apiserver ]$ df -h
+Filesystem      Size  Used Avail Use% Mounted on
+devtmpfs        4.0M     0  4.0M   0% /dev
+tmpfs           3.8G   84K  3.8G   1% /dev/shm
+tmpfs           1.6G  179M  1.4G  12% /run
+tmpfs           4.0M     0  4.0M   0% /sys/fs/cgroup
+/dev/sda2        99G   99G     0 100% /
+tmpfs           3.8G     0  3.8G   0% /tmp
+tmpfs           769M     0  769M   0% /run/user/1002
+clouduser@moc-laiwyj6tly6 [ /var/log/kube-apiserver ]$ sudo ls -l /var/log/kube-apiserver|wc -l
+890
+clouduser@moc-laiwyj6tly6 [ /var/log/kube-apiserver ]$ sudo du -h /var/log/kube-apiserver
+87G     /var/log/kube-apiserver
+```
+
+The issue occurs because the `--audit-log-maxbackup` value is set to 0. This setting allows the audit logs to accumulate without limit, eventually filling up the disk.
+
+## Mitigation
+
+To resolve the issue temporarily, manually clean up the old audit logs. Follow these steps:
+
+- SSH into the control plane virtual machine (VM) of your AKS Arc cluster.
+- Remove the old audit logs from the **/var/log/kube-apiserver** folder.
+- If you have multiple control plane nodes, repeat this process on each control plane VM.
+
+[SSH into the control plane VM](ssh-connect-to-windows-and-linux-worker-nodes.md) and navigate to the kube-apiserver logs directory:
+
+```bash
+cd /var/log/kube-apiserver
+```
+
+Remove the old audit log files:
+
+```bash
+rm audit-*.log
+```
+
+Exit the SSH session:
+
+```bash
+exit
+```
+
+## Next steps
+
+[Known issues in AKS enabled by Azure Arc](aks-known-issues.md)
````
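The manual cleanup steps above can be collected into one helper to run on each control plane VM. An editorial sketch, not part of the commit (the function name is illustrative; the glob mirrors the article's `rm` command):

```shell
# Remove rotated kube-apiserver audit logs from the given directory
# (default: the path named in the article). Run on each control plane VM.
# The function name is illustrative; the glob mirrors the article's rm command.
cleanup_audit_logs() {
    dir="${1:-/var/log/kube-apiserver}"
    # -f keeps rm quiet if the glob matches nothing.
    rm -f "$dir"/audit-*.log
}
```

Usage on the VM: `cleanup_audit_logs` (or pass an explicit directory), then `exit` the SSH session.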
