Commit cfc66a6

Sync release-hotfixes with main
2 parents fe4701f + ab82246 commit cfc66a6

24 files changed (+293, −164 lines)

AKS-Hybrid/TOC.yml

Lines changed: 2 additions & 0 deletions
@@ -98,6 +98,8 @@
     href: cluster-labels.md
 - name: Taints
   href: aks-arc-use-node-taints.md
+- name: Use availability sets
+  href: availability-sets.md
 - name: Scale a Kubernetes cluster
   href: auto-scale-aks-arc.md
 - name: Upgrade Kubernetes clusters

AKS-Hybrid/aks-arc-diagnostic-checker.md

Lines changed: 1 addition & 1 deletion
@@ -98,7 +98,7 @@ $urlArray = @(
     "https://management.azure.com",
     "https://eastus.dp.kubernetesconfiguration.azure.com",
     "https://login.microsoftonline.com",
-    "https://eastus.log in.microsoft.com",
+    "https://eastus.login.microsoft.com",
     "https://login.windows.net",
     "https://mcr.microsoft.com",
     "https://gbl.his.arc.azure.com",

AKS-Hybrid/aks-edge-howto-multi-node-deployment.md

Lines changed: 3 additions & 1 deletion
@@ -27,7 +27,9 @@ You can generate the parameters needed to create a scalable cluster using the fo
 New-AksEdgeConfig -DeploymentType ScalableCluster -outFile .\aksedge-config.json | Out-Null
 ```

-This creates a configuration file called **aksedge-config.json** which includes the configuration needed to create a scalable cluster with a Linux node. The file is created in your current working directory. See the following examples for more options for creating the configuration file. A detailed description of the configuration parameters [is available here](aks-edge-deployment-config-json.md).
+This command creates a configuration file named **aksedge-config.json**, which includes the configuration needed to create a scalable cluster with a Linux node. The file is created in your current working directory. See [Deployment JSON configuration](aks-edge-deployment-config-json.md) for detailed descriptions of the configuration parameters (such as proxy settings).
+
+See the following examples for more options for creating the configuration file.

 The key parameters to note for a scalable Kubernetes deployment are:

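As in the single-machine example elsewhere in this commit, the generated file can be loaded, adjusted, and passed to the deployment cmdlet. A sketch (verify the `Machines`/`LinuxNode` property names against your generated aksedge-config.json before running):

```powershell
# Sketch: load the generated scalable-cluster config, adjust the Linux node
# resources, and deploy. Property names follow the deployment JSON schema;
# confirm them against your generated file first.
$jsonObj = Get-Content .\aksedge-config.json | ConvertFrom-Json
$machine = $jsonObj.Machines[0]
$machine.LinuxNode.CpuCount = 4
$machine.LinuxNode.MemoryInMB = 4096
New-AksEdgeDeployment -JsonConfigString ($jsonObj | ConvertTo-Json -Depth 4)
```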
AKS-Hybrid/aks-edge-howto-single-node-deployment.md

Lines changed: 4 additions & 1 deletion
@@ -26,7 +26,7 @@ New-AksEdgeConfig -DeploymentType SingleMachineCluster -outFile .\aksedge-config

 This command creates a configuration file called **aksedge-config.json** that includes the configuration needed to create a single-machine cluster with a Linux node. The file is created in your current working directory. See the following examples for more options for creating the configuration file.

-A detailed description of the configuration parameters [is available here](aks-edge-deployment-config-json.md).
+See [Deployment JSON configuration](aks-edge-deployment-config-json.md) for a detailed description of the configuration parameters.

 The key parameters for single machine deployment are:

@@ -90,6 +90,9 @@ $machine.LinuxNode.MemoryInMB = 4096
 New-AksEdgeDeployment -JsonConfigString ($jsonObj | ConvertTo-Json -Depth 4)
 ```

+> [!TIP]
+> See [Deployment JSON configuration](aks-edge-deployment-config-json.md) for all available options, including network settings such as proxy settings.
+
 ### Create a simple cluster with NodePort service

 You can create a simple cluster with no service IPs (`ServiceIPRangeSize` set as 0):

AKS-Hybrid/aks-edge-quickstart.md

Lines changed: 6 additions & 0 deletions
@@ -12,6 +12,9 @@ ms.custom: template-how-to

 This quickstart describes how to set up an Azure Kubernetes Service (AKS) Edge Essentials single-machine K3S Linux-only cluster.

+> [!NOTE]
+> The purpose of this quickstart script is to ease the evaluation process. The script requires higher-level credentials for a one-time setup of the Azure subscription. It is not intended for production use.
+
 ## Prerequisites

 - See the [system requirements](aks-edge-system-requirements.md). For this quickstart, ensure that you have a minimum of 4.5 GB RAM free, 4 vCPUs and 20 GB free disk space.

@@ -53,6 +56,9 @@ This script automates the following steps:
 - Installs the Azure Connected Machine Agent and connects the host machine to Arc for Servers.
 - Connects the deployed cluster to Arc for connected Kubernetes.

+> [!TIP]
+> You can modify the `$aksedgeConfig` JSON content in the `AksEdgeQuickStart` script for any additional options. For available options, see [Deployment JSON configuration](aks-edge-deployment-config-json.md).
+
 ## Step 2: Deploy AKS Edge Essentials

 In an elevated PowerShell prompt, run the `AksEdgeQuickStart.ps1` script. This brings up a K3s cluster:
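An invocation sketch follows; the parameter names here are assumptions, so check the script's comment-based help for the actual ones before use:

```powershell
# Illustrative only: parameter names are assumed. Run
# Get-Help .\AksEdgeQuickStart.ps1 (or read the script header) to confirm them.
.\AksEdgeQuickStart.ps1 -SubscriptionId "<subscription-id>" -TenantId "<tenant-id>" `
    -ResourceGroupName "<resource-group>" -Location "eastus"
```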

AKS-Hybrid/aks-whats-new-23h2.md

Lines changed: 22 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@
22
title: What's new in AKS on Azure Stack HCI version 23H2
33
description: Learn about what's new in AKS on Azure Stack HCI version 23H2.
44
ms.topic: overview
5-
ms.date: 06/25/2024
5+
ms.date: 09/05/2024
66
author: sethmanheim
77
ms.author: sethm
88
ms.reviewer: guanghu
@@ -42,21 +42,39 @@ By integrating these components, Azure Arc offers a unified and efficient Kubern
4242

4343
This section lists the new features and improvements in AKS Arc in each release of Azure Stack HCI, version 23H2.
4444

45+
### Release 2408
46+
47+
The following Kubernetes cluster deployment and management capabilities are available:
48+
49+
- **High availability improvements**. You can now deploy nodes with anti-affinity on specific physical hosts on Azure Stack HCI clusters. For more information, see [Availability sets](availability-sets.md).
50+
- **PowerShell**. You can now use PowerShell cmdlets to manage your AKS Arc clusters on Azure Stack HCI 23H2 with CRUD support. For more information, see the [PowerShell reference documentation](/powershell/module/az.aksarc/?view=azps-12.1.0&preserve-view=true).
51+
- **Error report improvements**. You can now get improved error case reporting with prechecks; for example, a check for incorrect Kubernetes versions and available GPU capacity.
52+
- **Support for NVIDIA T4**. You can now create node pools in new VM sizes with GPU NVIDIA T4. For more information, see [Use GPUs](deploy-gpu-node-pool.md).
53+
- **Arc Gateway**. You can now use the Arc Gateway to configure very few of the outbound URLs to use AKS clusters on Azure Stack HCI.
54+
- **Support pod CIDR**. You can now create an AKS Arc cluster with a user-specified pod CIDR IP arrange.
55+
56+
#### Supported component versions for 2408
57+
58+
| Component | Version |
59+
|---------------------|--------------------------------------------------|
60+
| AKS Arc | 1.3.218 |
61+
| Kubernetes versions | 1.27.7, 1.27.9, 1.28.5, 1.28.9, 1.29.2, 1.29.4 |
62+
4563
### Release 2405
4664

4765
The following Kubernetes cluster deployment and management capabilities are available:
4866

4967
- **Azure RBAC support**. You can now enable Azure RBAC for Kubernetes while creating AKS Arc clusters using Azure CLI and Azure Resource Manager templates.
50-
- **Taints and labels update**. You can can now update taints and labels during an AKS Arc node pool update operation using Azure CLI and Azure Resource Manager templates.
68+
- **Taints and labels update**. You can now update taints and labels during an AKS Arc node pool update operation using Azure CLI and Azure Resource Manager templates.
5169
- **AKS Arc cluster platform metrics and alerts**. You can now view AKS Arc cluster metrics (platform metrics) and create metric-based alerts.
5270
- **Auto cert repair**. You can now automatically repair certificates, managed by cert-tattoo, that expired when the cluster was shut down.
5371
- **Download VHDs manually (offline download)**. This feature, now available only in private preview, enables you to download virtual machine images and upload them to the target center using a manual process. This can help in environments in which downloading large files is inconvenient. If you are interested in using it, contact your Microsoft Account representative.
5472

55-
#### Supported component versions
73+
#### Supported component versions for 2405
5674

5775
The following component versions are supported in release 2405:
5876

59-
| Component | 2024-06-11 |
77+
| Component | Version |
6078
|---------------------|--------------------------------------------------|
6179
| AKS Arc | 1.0.23.10605 |
6280
| Kubernetes versions | 1.26.10, 1.26.12, 1.27.7, 1.27.9, 1.28.3, 1.28.5 |
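The pod CIDR capability above can be sketched with Azure CLI. The `--pod-cidr` parameter name is an assumption here; confirm it with `az aksarc create --help` for your CLI extension version:

```azurecli
# Illustrative sketch: create an AKS Arc cluster with a user-specified pod CIDR.
# The --pod-cidr parameter name is assumed; verify with `az aksarc create --help`.
az aksarc create `
  --name mycluster `
  --resource-group myrg `
  --custom-location myCustomLocation `
  --vnet-ids myLogicalNetwork `
  --pod-cidr 10.244.0.0/16
```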

AKS-Hybrid/availability-sets.md

Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
+---
+title: Availability sets in AKS enabled by Azure Arc
+description: Learn how to enable availability sets in AKS enabled by Arc to improve the availability and distribution of your Kubernetes workloads.
+ms.topic: how-to
+author: sethmanheim
+ms.date: 09/06/2024
+ms.author: sethm
+ms.reviewer: rbaziwane
+ms.lastreviewed: 08/15/2024
+
+---
+
+# Availability sets in AKS enabled by Azure Arc
+
+*Availability sets* are logical groups of VMs that have weak anti-affinity relationships with each other, to ensure that they are spread evenly across the available fault domains in a physical cluster. A fault domain in this context is a physical host or a group of physical hosts. By using availability sets, AKS Arc can improve the availability and distribution of your Kubernetes workloads. Availability sets can avoid scenarios in which a single node failure causes multiple VMs to go down or become unbalanced.
+
+## Overview
+
+If you use AKS on Azure Stack HCI and Windows Server to run Kubernetes workloads on-premises, you might encounter some challenges with the current architecture. For example, you might notice that multiple virtual machines (VMs) within the same node pool can exist on the same physical host, which is not ideal for high availability. Or, you might see that VMs do not rebalance across physical hosts when a host recovers from an issue, resulting in uneven distribution of workloads. These issues can affect the performance and reliability of your applications, causing unnecessary disruption in your business operations.
+
+Availability sets offer several benefits for AKS on Azure Stack HCI and Windows Server users, such as:
+
+- Improved availability and resilience of your applications, by avoiding scenarios in which multiple VMs within the same node pool or control plane go down or become unbalanced due to a single node failure.
+- Optimized resource usage and performance of your cluster, by ensuring that VMs are evenly distributed across the available nodes and not concentrated on a single node or a subset of nodes.
+- Alignment with the best practices and expectations of your customers and partners who are looking for a reliable and consistent on-premises Kubernetes experience.
+
+## Enable availability sets
+
+With AKS on Azure Stack HCI 23H2, the availability sets feature is enabled by default when you create a new node pool.
+
+With AKS on Azure Stack HCI 22H2, the availability sets feature is disabled by default. To enable it, add the `-enableAvailabilitySet` parameter when you create a workload cluster. For example:
+
+```powershell
+New-AksHciCluster -Name <name> -controlPlaneNodeCount 3 -osType Linux -kubernetesVersion $kubernetesVersion -enableAvailabilitySet
+```
+
+## How availability sets work in AKS enabled by Azure Arc
+
+When you create a new AKS Arc cluster, AKS Arc automatically creates availability sets: one for the control plane VMs and another one for each of the node pools in the cluster. Each node pool has its own availability set. With this layout, AKS Arc ensures that VMs of the same role (control plane or node pool) are never located on the same physical host and that they're distributed across the available nodes in a cluster.
+
+Once the availability sets are created and the VMs assigned, the system automatically places them on the appropriate physical nodes. If a node fails, the system also automatically fails over the VMs to other nodes and rebalances them when the node recovers. This way, you can achieve high availability and optimal distribution of your Kubernetes workloads without manual intervention.
+
+Consider an AKS on Azure Stack HCI 23H2 cluster with two physical host machines, **Host A** and **Host B**, three control plane VMs, and two worker node VMs, **Nodepool1VM1** and **Nodepool1VM2**. To ensure high availability of your Kubernetes applications, the node pool VMs must never share the same host, unless one of the hosts is temporarily unavailable due to planned maintenance or a capacity issue, which can cause a VM to be temporarily placed on an alternative host.
+
+In the following diagram, each color represents an anti-affinity group:
+
+:::image type="content" source="media/availability-sets/anti-affinity-1.png" alt-text="Diagram showing hosts in anti-affinity group.":::
+
+If **Host B** goes down due to a reboot, **Control Plane VM2**, **Control Plane VM3**, and **Nodepool1VM2** fail over to **Host A**, as shown in the following figure. Assuming your application is running pods in **Nodepool1VM1**, this reboot has no impact on your application:
+
+:::image type="content" source="media/availability-sets/anti-affinity-2.png" alt-text="Diagram showing host B down.":::
+
+In the old architecture, if **Host B** came back online after a reboot, there was no guarantee that the VMs would move back from Host A to Host B (rebalancing), forcing the workloads to stay on the same host and creating a single point of failure, as shown in the following diagram:
+
+:::image type="content" source="media/availability-sets/anti-affinity-3.png" alt-text="Diagram showing no rebalancing.":::
+
+Availability sets for AKS Arc can help rebalance VMs once a host recovers from a temporary outage. In this example, **ControlPlaneVM2**, **ControlPlaneVM3**, and **Nodepool1VM2** automatically move back to **Host B**, as shown here:
+
+:::image type="content" source="media/availability-sets/anti-affinity-1.png" alt-text="Diagram showing hosts in anti-affinity group.":::
+
+> [!IMPORTANT]
+> Availability sets in AKS Arc are a new feature that's still evolving and improving. Manual configuration of fault domains or availability sets isn't yet supported. You can't change the fault domains of an availability set after it's created. VMs are assigned to an availability set at cluster creation, and can't be migrated to a different availability set.
+
+## Add or delete machines
+
+In a host deletion scenario, the host is no longer considered a part of the cluster. This deletion typically occurs when you replace a machine due to hardware issues, or scale down the HCI cluster for other reasons. During a node outage, the node remains part of the HCI cluster but appears as **Down**.
+
+If a physical machine (fault domain) is permanently deleted from the cluster, the availability set configuration isn't modified to reduce the number of fault domains. In this scenario, the availability set enters an unhealthy state. We recommend that you redeploy your workload clusters so that the availability set is updated with the proper number of fault domains.
+
+When a new physical machine (fault domain) is added to the cluster, the availability set configuration is automatically expanded to include the new machine. However, the existing VMs don't rebalance to apply this new configuration, since they are already assigned to availability sets. We recommend that you redeploy your workload clusters so that the availability set is updated with the proper number of fault domains.
+
+## Next steps
+
+[AKS overview](aks-overview.md)
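To observe the placement behavior this new article describes, a quick sketch using the built-in Failover Clustering cmdlets can list which physical host currently owns each VM (run on one of the Azure Stack HCI cluster nodes; this is an editor's illustration, not part of the article):

```powershell
# Sketch: list each clustered VM and the physical host that currently owns it,
# to confirm that VMs in the same node pool land on different hosts.
Get-ClusterGroup |
    Where-Object GroupType -eq 'VirtualMachine' |
    Select-Object Name, OwnerNode, State |
    Sort-Object OwnerNode
```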

AKS-Hybrid/deploy-gpu-node-pool.md

Lines changed: 10 additions & 2 deletions
@@ -29,12 +29,20 @@ The following GPU models are supported by AKS on Azure Stack HCI 23H2:
 |--------------|-----------|-------------------|
 | NVidia | A2 | 2311.2 |
 | NVidia | A16 | 2402.0 |
+| NVidia | T4 | 2408.0 |

 ## Supported VM sizes

 The following VM sizes for each GPU model are supported by AKS on Azure Stack HCI 23H2.

-### The Nvidia A2 is supported by NC2 A2 SKUs
+### Nvidia T4 is supported by NK T4 SKUs
+
+| VM size | GPUs | GPU Memory: GiB | vCPU | Memory: GiB |
+|-----------------|---|----|-----|----|
+| Standard_NK6 | 1 | 8 | 6 | 12 |
+| Standard_NK12 | 2 | 16 | 12 | 24 |
+
+### Nvidia A2 is supported by NC2 A2 SKUs

 | VM size | GPUs | GPU Memory: GiB | vCPU | Memory: GiB |
 |-------------------|---|----|----|----|

@@ -43,7 +51,7 @@ The following VM sizes for each GPU model are supported by AKS on Azure Stack H
 | Standard_NC16_A2 | 2 | 48 | 16 | 64 |
 | Standard_NC32_A2 | 2 | 48 | 32 | 28 |

-### The Nvidia A16 is supported by NC2 A16 SKUs
+### Nvidia A16 is supported by NC2 A16 SKUs

 | VM size | GPUs | GPU Memory: GiB | vCPU | Memory: GiB |
 |--------------------|---|----|----|----|
2 binary image files changed (17 KB, 20.4 KB); previews not rendered.
