Skip to content

Commit fb945ad

Browse files
authored
Sync release-hotfixes with main
Sync release-hotfixes with main
2 parents 0b30a99 + 998ed9b commit fb945ad

File tree

56 files changed

+709
-296
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

56 files changed

+709
-296
lines changed

.openpublishing.redirection.json

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1334,6 +1334,11 @@
13341334
"source_path": "azure-stack/hci/manage/use-gpu-with-clustered-vm.md",
13351335
"redirect_url": "/windows-server/virtualization/hyper-v/deploy/use-gpu-with-clustered-vm?pivots=azure-stack-hci&toc=/azure-stack/hci/toc.json&bc=/azure-stack/breadcrumb/toc.json",
13361336
"redirect_document_id": false
1337+
},
1338+
{
1339+
"source_path": "azure-stack/user/vm-update-management.md",
1340+
"redirect_url": "/azure/azure-monitor/agents/agents-overview",
1341+
"redirect_document_id": false
13371342
}
13381343
]
13391344
}

AKS-Hybrid/TOC.yml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -186,6 +186,8 @@
186186
href: aks-edge-howto-access-tpm.md
187187
- name: Additional configuration
188188
href: aks-edge-howto-more-configs.md
189+
- name: Use GPU acceleration
190+
href: aks-edge-gpu.md
189191
- name: Update AKS Edge Essentials
190192
items:
191193
- name: Update online

AKS-Hybrid/aks-edge-gpu.md

Lines changed: 158 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,158 @@
1+
---
2+
title: GPU acceleration for AKS Edge Essentials
3+
description: Learn how to expose a GPU to the virtual machine's Linux module.
4+
author: sethmanheim
5+
ms.topic: how-to
6+
ms.date: 07/09/2024
7+
ms.author: sethm
8+
ms.lastreviewed: 06/05/2024
9+
ms.reviewer: guanghu
10+
11+
# Intent: As an IT Pro, I want to learn how to expose a GPU to Linux
12+
# Keyword: GPU AKS Edge Essentials
13+
---
14+
15+
# Use GPU acceleration for AKS Edge Essentials
16+
17+
GPUs are a popular choice for artificial intelligence computations, because they offer parallel processing capabilities and can often execute vision-based inferencing faster than CPUs. To better support artificial intelligence and machine learning applications, AKS Edge Essentials can expose a GPU to the virtual machine's Linux module.
18+
19+
AKS Edge Essentials supports *GPU-Paravirtualization* (GPU-PV) as the GPU passthrough technology. In other words, the GPU is shared between the Linux virtual machine and the host.
20+
21+
> [!IMPORTANT]
22+
> These features include components developed and owned by NVIDIA Corporation or its licensors. By using GPU acceleration features, you are accepting and agreeing to the terms of the [NVIDIA End-User License Agreement](https://www.nvidia.com/content/DriverDownload-March2009/licence.php?lang=us).
23+
24+
## Prerequisites
25+
26+
The GPU acceleration of AKS Edge Essentials currently supports a specific set of GPU hardware. Additionally, use of this feature requires specific versions of Windows.
27+
28+
The supported GPUs and required Windows versions are as follows:
29+
30+
| Supported GPUs | GPU passthrough type | Supported Windows versions |
31+
|-----------------------------|--------------------------|-------------------------------------------------|
32+
| NVIDIA GeForce, Quadro, RTX | GPU-PV | Windows 10/11 (Pro, Enterprise, IoT Enterprise) |
33+
34+
> [!IMPORTANT]
35+
> GPU-PV support might be limited to certain generations of processors or GPU architectures, as determined by the GPU vendor. For more information, see the [NVIDIA CUDA for WSL documentation](https://developer.nvidia.com/cuda/wsl).
36+
>
37+
> Windows 10 users must use the November 2021 update build 19044.1620, or later. After installation, you can verify your build version by running `winver` at the command prompt.
38+
>
39+
> GPU passthrough is not supported with nested virtualization, such as when you run AKS Edge Essentials in a Windows virtual machine.
40+
41+
## System setup and installation
42+
43+
The following sections contain setup and installation information.
44+
45+
- For **NVIDIA GeForce/Quadro/RTX GPUs**, download and install the [NVIDIA CUDA-enabled driver for Windows Subsystem for Linux (WSL)](https://developer.nvidia.com/cuda/wsl) to use with your existing CUDA ML workflows. Originally developed for WSL, the CUDA for WSL drivers is also used for AKS Edge Essentials.
46+
- Windows 10 users must also [install WSL](/windows/wsl/install) because some of the libraries are shared between WSL and AKS Edge Essentials.
47+
- Install or upgrade AKS Edge Essentials to the May 2024 release, or later. For more information, see [Update your AKS Edge Essentials clusters](aks-edge-howto-update.md). The GPU-PV is supported on both k8s and k3s Kubernetes distributions.
48+
49+
## Enable GPU-PV in your AKS Edge Essentials deployment
50+
51+
### Step 1: single machine configuration parameters
52+
53+
You can generate the parameters you need to create a single machine cluster and add the necessary GPU-PV configuration parameters using the following commands.
54+
55+
This script only focuses on the GPU-PV configuration; you should also make other necessary general updates according to your own AKS Edge Essentials deployment:
56+
57+
```powershell
58+
$jsonObj = New-AksEdgeConfig -DeploymentType SingleMachineCluster
59+
$jsonObj.User.AcceptGpuWarning = $true
60+
$machine = $jsonObj.Machines[0]
61+
$machine.LinuxNode.GpuPassthrough.Name = "NVIDIA GeForce GTX 1070"
62+
$machine.LinuxNode.GpuPassthrough.Type = "ParaVirtualization"
63+
$machine.LinuxNode.GpuPassthrough.Count = 1
64+
```
65+
66+
The key parameters to enable GPU-PV are:
67+
68+
- `User.AcceptGpuWarning`: Set this parameter to `true` to automatically accept the confirmation message when you enable GPU-PV on AKS Edge Essentials.
69+
- `LinuxNode.GpuPassthrough.Name`: Describes the GPU model that's deployed in this machine, with proper drivers installed.
70+
- `LinuxNode.GpuPassthrough.Type`: Describes the GPU passthrough type. Only `ParaVirtualization` is currently supported.
71+
- `LinuxNode.GpuPassthrough.Count`: Describes how many GPUs are installed on this machine.
72+
73+
### Step 2: create a single machine cluster
74+
75+
1. You can now run the `New-AksEdgeDeployment` cmdlet to deploy a single-machine AKS Edge cluster with a single Linux control plane node. You can use the JSON object generated in [step 1](#step-1-single-machine-configuration-parameters) and pass it as a string:
76+
77+
```powershell
78+
New-AksEdgeDeployment -JsonConfigString (New-AksEdgeConfig | ConvertTo-Json -Depth 4)
79+
```
80+
81+
1. After successful deployment, verify GPU-PV is enabled by running `nvidia-smi`:
82+
83+
:::image type="content" source="media/aks-edge-gpu/gpu-pv-enabled.png" alt-text="Screenshot showing NVIDIA smi output." lightbox="media/aks-edge-gpu/gpu-pv-enabled.png":::
84+
85+
### Step 3: Deploy Nvidia runtimeclass
86+
87+
1. Create a YAML file named **nvidia-runtimeclass.yaml** with the following content:
88+
89+
```yaml
90+
apiVersion: node.k8s.io/v1
91+
kind: RuntimeClass
92+
metadata:
93+
name: nvidia
94+
handler: nvidia
95+
```
96+
97+
1. Deploy the `runtimeclass`:
98+
99+
```powershell
100+
kubectl apply –f nvidia-runtimeclass.yaml
101+
```
102+
103+
### Step 4: Install Nvidia GPU device plugin
104+
105+
1. Download **nvidia-deviceplugin.yaml** from [this GitHub location](https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.14.3/nvidia-device-plugin.yml).
106+
1. Update the container images location in the **nvidia-deviceplugin.yaml** file to the new value, as follows:
107+
108+
```yaml
109+
containers:
110+
- image: registry.gitlab.com/nvidia/kubernetes/device-plugin/staging/k8s-device-plugin:6a31a868
111+
```
112+
113+
1. Install the Nvidia GPU DevicePlugin:
114+
115+
```powershell
116+
kubectl apply –f nvidia-deviceplugin.yaml
117+
```
118+
119+
1. Verify that the plugin is running and the NVIDIA GPU is detected by checking the logs of the **deviceplugin** pod using the `kubectl get pods -A` and `kubectl logs $podname -n kube-system` commands:
120+
121+
:::image type="content" source="media/aks-edge-gpu/kubectl-logs.png" alt-text="Screenshot showing kubectl logs command output." lightbox="media/aks-edge-gpu/kubectl-logs.png":::
122+
123+
## Get started with a sample workload
124+
125+
1. Prepare a workload YAML file named **gpu-workload.yaml**:
126+
127+
```yaml
128+
apiVersion: v1
129+
kind: Pod
130+
metadata:
131+
name: gpu-pod
132+
spec:
133+
restartPolicy: Never
134+
containers:
135+
- name: cuda-container
136+
image: nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda10.2
137+
resources:
138+
limits:
139+
nvidia.com/gpu: 1 # requesting 1 GPU
140+
tolerations:
141+
- key: nvidia.com/gpu
142+
operator: Exists
143+
effect: NoSchedule
144+
```
145+
146+
1. Run the sample workload:
147+
148+
```powershell
149+
kubectl apply -f .\gpu-workload.yaml
150+
```
151+
152+
1. Verify that the workload ran successfully:
153+
154+
:::image type="content" source="media/aks-edge-gpu/workload.png" alt-text="Screenshot showing that the workload ran successfully." lightbox="media/aks-edge-gpu/workload.png":::
155+
156+
## Next steps
157+
158+
[AKS Edge Essentials overview](aks-edge-overview.md)

AKS-Hybrid/aks-edge-howto-setup-nested-environment.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -170,7 +170,7 @@ The following guide is an example of IP address allocation. You can use your own
170170
> If you are using an Azure VM, use the **Windows host OS (L0)** DNS server. Use the `ipconfig /all` command to get the DNS server address. Check that you're able to get internet access using your web browser. If you have no access, check the if the DNS server is correctly configured:
171171
172172
```powershell
173-
New-NetIPAddress –IPAddress "172.20.1.2" -DefaultGateway "172.20.1.1" -PrefixLength $ifIndex
173+
New-NetIPAddress –IPAddress "172.20.1.2" -DefaultGateway "172.20.1.1" -PrefixLength "24" -InterfaceIndex $ifIndex
174174
Set-DNSClientServerAddress –InterfaceIndex $ifIndex –ServerAddresses "172.20.1.1"
175175
```
176176

AKS-Hybrid/aks-edge-software-license-terms.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,10 +1,10 @@
11
---
22
title: Microsoft Software License Terms for AKS Edge Essentials
3-
description: Microsoft Software License Terms for AKS Edge Essentials .
3+
description: Microsoft Software License Terms for AKS Edge Essentials.
44
author: rcheeran
55
ms.author: rcheeran
66
ms.topic: how-to
7-
ms.date: 02/16/2023
7+
ms.date: 07/03/2024
88
ms.custom: template-how-to
99
---
1010

@@ -42,7 +42,7 @@ These license terms are an agreement between you and Microsoft Corporation (or o
4242

4343
**a) Data Collection.** The software may collect information about you and your use of the software, and send that to Microsoft. Microsoft may use this information to provide services and improve our products and services. You may opt-out of many of these scenarios, but not all, as described in the product documentation. There are also some features in the software that may enable you to collect data from users of your applications. If you use these features to enable data collection in your applications, you must comply with applicable law, including providing appropriate notices to users of your applications. You can learn more about data collection and use in the help documentation and the privacy statement at https://go.microsoft.com/fwlink/?LinkId=521839. Your use of the software operates as your consent to these practices.
4444

45-
**b) Processing of Personal Data.** To the extent Microsoft is a processor or subprocessor of personal data in connection with the software, Microsoft makes the commitments in the European Union General Data Protection Regulation Terms of the Online Services Terms to all customers effective May 25, 2018. See [Microsofts GDPR Commitments to Customers of our Generally Available Enterprise Software Products](/legal/gdpr).
45+
**b) Processing of Personal Data.** To the extent Microsoft is a processor or subprocessor of personal data in connection with the software, Microsoft makes the commitments in the European Union General Data Protection Regulation Terms of the Online Services Terms to all customers effective May 25, 2018. See [Microsoft's GDPR Commitments to Customers of our Generally Available Enterprise Software Products](/legal/gdpr).
4646

4747
**4. SCOPE OF LICENSE.** The software is licensed, not sold. Microsoft reserves all other rights. Unless applicable law gives you more rights despite this limitation, you will not (and have no right to):
4848

AKS-Hybrid/aks-edge-system-requirements.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@ description: Requirements and supported versions for AKS Edge Essentials.
44
author: rcheeran
55
ms.author: rcheeran
66
ms.topic: conceptual
7-
ms.date: 04/16/2024
7+
ms.date: 07/08/2024
88
ms.custom: template-concept
99
---
1010

@@ -53,13 +53,13 @@ Install Windows 10/11 IoT Enterprise/Enterprise/Pro on your machine and activate
5353
- **Kubernetes Distribution supported**: Kubernetes (K8S) - Version: 1.24.3 and on Kubernetes (K3S) - Version: 1.24.3
5454
- **Deployment options**: Single-machine clusters and full Kubernetes deployment on single machines only. Full deployment across multiple machines isn't supported in GA.
5555
- **Workloads**: Only Linux worker nodes.
56-
- **Network plugins**: Calico on K8S and Flannel on K3S.
56+
- **Network plugins**: Calico on K8S.
5757

5858
## Experimental or prerelease features
5959

6060
- **Deployment options**: Full Kubernetes deployment on multiple machines.
6161
- **Workloads**: Windows worker nodes.
62-
- **Network plugins**: Flannel on K8S and Calico on K3S.
62+
- **Network plugins**: Calico on K3S.
6363

6464
## Next steps
6565

AKS-Hybrid/backup-workload-cluster.md

Lines changed: 7 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -3,7 +3,7 @@ title: Back up, restore workload clusters using Velero
33
description: Learn how to back up and restore workload clusters to Azure Blob Storage or MinIO using Velero in AKS Arc.
44
author: sethmanheim
55
ms.topic: how-to
6-
ms.date: 11/21/2022
6+
ms.date: 07/03/2024
77
ms.author: sethm
88
ms.lastreviewed: 1/14/2022
99
ms.reviewer: scooley
@@ -46,16 +46,17 @@ The procedures in this section describe how to install Velero and use Azure Blob
4646
```
4747

4848
1. Install the [Velero CLI](https://velero.io/docs/v1.9/basic-install/#install-the-cli) by running the following command:
49-
> [!NOTE]
50-
> The flag --use-restic is no longer supported on version velero 1.10+, to be able to use the flag version [1.9.x](https://github.com/vmware-tanzu/velero/releases/tag/v1.9.5) is required
49+
50+
> [!NOTE]
51+
> The `--use-restic` flag isn't supported on Velero version 1.10 and later. The flag is only supported on version [1.9.x](https://github.com/vmware-tanzu/velero/releases/tag/v1.9.5).
5152
5253
```powershell
5354
choco install velero
5455
```
5556

5657
1. If needed, change to the Azure subscription you want to use for the backups.
5758

58-
By default, Velero stores backups in the same Azure subscription as your VMs and disks and won't allow you to restore backups to a resource group in a different subscription. To enable backups and restores across subscriptions, specify a subscription to use for your backups. You can skip this step if you're already in the subscription you want to use for your backups.
59+
By default, Velero stores backups in the same Azure subscription as your VMs and disks and won't allow you to restore backups to a resource group in a different subscription. To enable backup and restore operations across subscriptions, specify a subscription to use for your backups. You can skip this step if you're already in the subscription you want to use for your backups.
5960

6061
Switch to the subscription you want to use for your backups:
6162

@@ -149,7 +150,7 @@ The procedures in this section describe how to install Velero and use Azure Blob
149150
150151
If you want to enable the minimum resource provider actions, create a custom role, and assign that role to the service principal.
151152
152-
1. Create a file named **azure-role.json** with following contents. Substitute your own custom role name and subscription ID.
153+
1. Create a file named **azure-role.json** with following contents. Substitute your own custom role name and subscription ID:
153154
154155
```json
155156
{
@@ -192,7 +193,7 @@ The procedures in this section describe how to install Velero and use Azure Blob
192193
```
193194

194195
> [!NOTE]
195-
> Service principals expire. To find out when your new service principal will expire, run this command: `az ad sp show --id $AZURE_CLIENT_ID`.
196+
> Service principals expire. To find out when your new service principal expires, run this command: `az ad sp show --id $AZURE_CLIENT_ID`.
196197
197198
1. Create a file that contains the variables the Velero installation requires. The command looks similar to the following one:
198199

AKS-Hybrid/concepts-container-networking.md

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@
22
title: Container networking concepts
33
description: Learn about container networking in AKS enabled by Azure Arc.
44
ms.topic: conceptual
5-
ms.date: 06/24/2024
5+
ms.date: 07/08/2024
66
ms.author: sethm
77
ms.lastreviewed: 05/31/2022
88
ms.reviewer: mikek
@@ -90,6 +90,9 @@ For more information about the Calico Network plug-in and policies, check out [g
9090

9191
#### Flannel
9292

93+
> [!NOTE]
94+
> Flannel CNI was retired in December 2023.
95+
9396
Flannel is a virtual networking layer designed specifically for containers. Flannel creates a flat network that overlays the host network. All containers/pods are assigned one IP address in this overlay network, and communicate directly by connecting to each other's IP address.
9497

9598
#### Calico

0 commit comments

Comments
 (0)