---
title: Pod Sandboxing (preview) with Azure Kubernetes Service (AKS)
description: Learn about and deploy Pod Sandboxing (preview), also referred to as Kernel Isolation, on an Azure Kubernetes Service (AKS) cluster.
services: container-service
ms.topic: article
ms.date: 02/23/2023
---

# Pod Sandboxing (preview) with Azure Kubernetes Service (AKS)

To help secure and protect your container workloads from untrusted or potentially malicious code, AKS now includes a mechanism called Pod Sandboxing (preview). Pod Sandboxing provides an isolation boundary between the container application and the shared kernel and compute resources of the container host, such as CPU, memory, and networking. Pod Sandboxing complements other security measures and data protection controls in your overall architecture to help you meet regulatory, industry, or governance compliance requirements for securing sensitive information.

This article helps you understand this new feature and how to implement it.

## Prerequisites

- The Azure CLI version 2.44.1 or later. Run `az --version` to find the version, and run `az upgrade` to upgrade the version. If you need to install or upgrade, see [Install Azure CLI][install-azure-cli].

- The `aks-preview` Azure CLI extension version 0.5.123 or later, to select the [Mariner operating system][mariner-cluster-config] generation 2 SKU.

- Register the `KataVMIsolationPreview` feature in your Azure subscription.

- AKS supports Pod Sandboxing (preview) on Kubernetes version 1.24.0 and later.

- To manage a Kubernetes cluster, use the Kubernetes command-line client [kubectl][kubectl]. Azure Cloud Shell comes with `kubectl`. You can install kubectl locally using the [az aks install-cli][az-aks-install-cmd] command.

### Install the aks-preview Azure CLI extension

[!INCLUDE [preview features callout](includes/preview/preview-callout.md)]

To install the aks-preview extension, run the following command:

```azurecli
az extension add --name aks-preview
```

Run the following command to update to the latest version of the extension:

```azurecli
az extension update --name aks-preview
```

### Register the KataVMIsolationPreview feature flag

Register the `KataVMIsolationPreview` feature flag by using the [az feature register][az-feature-register] command, as shown in the following example:

```azurecli-interactive
az feature register --namespace "Microsoft.ContainerService" --name "KataVMIsolationPreview"
```

It takes a few minutes for the status to show *Registered*. Verify the registration status by using the [az feature show][az-feature-show] command:

```azurecli-interactive
az feature show --namespace "Microsoft.ContainerService" --name "KataVMIsolationPreview"
```
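
If you prefer a single value rather than the full JSON response, you can filter the output with the Azure CLI's global `--query` parameter (a JMESPath expression). This one-liner is an optional convenience, not a required step:

```azurecli-interactive
az feature show --namespace "Microsoft.ContainerService" --name "KataVMIsolationPreview" --query "properties.state" --output tsv
```

Once registration completes, the command prints `Registered`.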

When the status reflects *Registered*, refresh the registration of the *Microsoft.ContainerService* resource provider by using the [az provider register][az-provider-register] command:

```azurecli-interactive
az provider register --namespace "Microsoft.ContainerService"
```

## Limitations

The following constraints apply to this preview of Pod Sandboxing (preview):

* Kata containers may not reach the IOPS performance limits that traditional containers can reach on Azure Files and high performance local SSD.

* [Microsoft Defender for Containers][defender-for-containers] doesn't support assessing Kata runtime pods.

* [Container Insights][container-insights] doesn't support monitoring of Kata runtime pods in the preview release.

* [Kata][kata-network-limitations] host-network isn't supported.

* AKS doesn't support [Container Storage Interface drivers][csi-storage-driver] or the [Secrets Store CSI driver][csi-secret-store driver] in this preview release.

## How it works

To achieve this functionality on AKS, [Kata Containers][kata-containers-overview] running on the Mariner AKS Container Host (MACH) stack delivers hardware-enforced isolation. Pod Sandboxing extends the benefits of hardware isolation, such as a separate kernel for each Kata pod. Hardware isolation allocates resources for each pod and doesn't share them with other Kata Containers or namespace containers running on the same host.

The solution architecture is based on the following components:

* [Mariner][mariner-overview] AKS Container Host
* Microsoft Hyper-V Hypervisor
* Azure-tuned Dom0 Linux Kernel
* Open-source [Cloud-Hypervisor][cloud-hypervisor] Virtual Machine Monitor (VMM)
* Integration with the [Kata Container][kata-container] framework

Deploying Pod Sandboxing using Kata Containers is similar to the standard containerd workflow to deploy containers. The deployment includes kata-runtime options that you can define in the pod template.

To use this feature with a pod, the only difference is to add **runtimeClassName** *kata-mshv-vm-isolation* to the pod spec.

When a pod uses the *kata-mshv-vm-isolation* runtimeClass, it creates a VM to serve as the pod sandbox to host the containers. If the [container resource manifest][container-resource-manifest] (`containers[].resources.limits`) doesn't specify a limit for CPU and memory, the VM defaults to one vCPU core and 2 GB of memory. When you specify limits, the VM is allocated one core plus the value of `containers[].resources.limits.cpu`, and 2 GB plus the value of `containers[].resources.limits.memory`. Containers can only use CPU and memory up to their container limits. The `containers[].resources.requests` values are ignored in this preview while we work to reduce the CPU and memory overhead.

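As a concrete sketch of the sizing rule above, consider a pod that sets explicit limits. The pod name and limit values here are illustrative, not from the walkthrough that follows:

```yml
kind: Pod
apiVersion: v1
metadata:
  name: sizing-example   # hypothetical name, for illustration only
spec:
  runtimeClassName: kata-mshv-vm-isolation
  containers:
  - name: app
    image: mcr.microsoft.com/aks/fundamental/base-ubuntu:v0.0.11
    resources:
      limits:
        cpu: "2"      # sandbox VM is sized at 1 + 2 = 3 vCPUs
        memory: 4Gi   # sandbox VM is sized at 2 GB + 4 GiB of memory
```

The container itself is still constrained to the 2-vCPU and 4-GiB limits; the extra core and 2 GB cover the sandbox VM's own overhead.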
## Deploy new cluster

Perform the following steps to deploy an AKS Mariner cluster using the Azure CLI.

1. Create an AKS cluster using the [az aks create][az-aks-create] command and specifying the following parameters:

    * **--workload-runtime**: Specify *KataMshvVmIsolation* to enable the Pod Sandboxing feature on the node pool. With this parameter, the other parameters must satisfy the following requirements. Otherwise, the command fails and reports an issue with the corresponding parameter(s).
        * **--os-sku**: *mariner*. Only the Mariner os-sku supports this feature in this preview release.
        * **--node-vm-size**: Any Azure VM size that is a generation 2 VM and supports nested virtualization works. For example, [Dsv3][dv3-series] VMs.

    The following example creates a cluster named *myAKSCluster* with one node in the *myResourceGroup*:

    ```azurecli
    az aks create --name myAKSCluster --resource-group myResourceGroup --os-sku mariner --workload-runtime KataMshvVmIsolation --node-vm-size Standard_D4s_v3 --node-count 1
    ```

2. Run the following command to get access credentials for the Kubernetes cluster. Use the [az aks get-credentials][aks-get-credentials] command and replace the values for the cluster name and the resource group name.

    ```azurecli
    az aks get-credentials --resource-group myResourceGroup --name myAKSCluster
    ```

3. List all Pods in all namespaces using the [kubectl get pods][kubectl-get-pods] command.

    ```bash
    kubectl get pods --all-namespaces
    ```

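4. Optionally, confirm that the Kata runtime class is available on the cluster. This check is a suggested addition to the walkthrough; on a correctly provisioned node pool, *kata-mshv-vm-isolation* should appear in the output.

    ```bash
    kubectl get runtimeclass
    ```
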
## Deploy to an existing cluster

To use this feature with an existing AKS cluster, the following requirements must be met:

* Follow the steps to [register the KataVMIsolationPreview][register-the-katavmisolationpreview-feature-flag] feature flag.
* Verify the cluster is running Kubernetes version 1.24.0 or later.

Use the following commands to enable Pod Sandboxing (preview) by creating a node pool to host it.

1. Add a node pool to your AKS cluster using the [az aks nodepool add][az-aks-nodepool-add] command. Specify the following parameters:

    * **--resource-group**: Enter the name of an existing resource group to create the AKS cluster in.
    * **--cluster-name**: Enter a unique name for the AKS cluster, such as *myAKSCluster*.
    * **--name**: Enter a unique name for your cluster's node pool, such as *nodepool2*.
    * **--workload-runtime**: Specify *KataMshvVmIsolation* to enable the Pod Sandboxing feature on the node pool. Along with the `--workload-runtime` parameter, the other parameters must satisfy the following requirements. Otherwise, the command fails and reports an issue with the corresponding parameter(s).
        * **--os-sku**: *mariner*. Only the Mariner os-sku supports this feature in the preview release.
        * **--node-vm-size**: Any Azure VM size that is a generation 2 VM and supports nested virtualization works. For example, [Dsv3][dv3-series] VMs.

    The following example adds a node pool to *myAKSCluster* with one node in *nodepool2* in the *myResourceGroup*:

    ```azurecli
    az aks nodepool add --cluster-name myAKSCluster --resource-group myResourceGroup --name nodepool2 --os-sku mariner --workload-runtime KataMshvVmIsolation --node-vm-size Standard_D4s_v3
    ```

2. Run the [az aks update][az-aks-update] command to enable Pod Sandboxing (preview) on the cluster.

    ```azurecli
    az aks update --name myAKSCluster --resource-group myResourceGroup
    ```

## Deploy a trusted application

To demonstrate the isolation of an application on the AKS cluster, perform the following steps.

1. Create a file named *trusted-app.yaml* to describe a trusted pod, and then paste the following manifest.

    ```yml
    kind: Pod
    apiVersion: v1
    metadata:
      name: trusted
    spec:
      containers:
      - name: trusted
        image: mcr.microsoft.com/aks/fundamental/base-ubuntu:v0.0.11
        command: ["/bin/sh", "-ec", "while :; do echo '.'; sleep 5 ; done"]
    ```

2. Deploy the Kubernetes pod by running the [kubectl apply][kubectl-apply] command and specify your *trusted-app.yaml* file:

    ```bash
    kubectl apply -f trusted-app.yaml
    ```

    The output of the command resembles the following example:

    ```output
    pod/trusted created
    ```

## Deploy an untrusted application

To demonstrate deploying an untrusted application into the pod sandbox on the AKS cluster, perform the following steps.

1. Create a file named *untrusted-app.yaml* to describe an untrusted pod, and then paste the following manifest.

    ```yml
    kind: Pod
    apiVersion: v1
    metadata:
      name: untrusted
    spec:
      runtimeClassName: kata-mshv-vm-isolation
      containers:
      - name: untrusted
        image: mcr.microsoft.com/aks/fundamental/base-ubuntu:v0.0.11
        command: ["/bin/sh", "-ec", "while :; do echo '.'; sleep 5 ; done"]
    ```

    The value for **runtimeClassName** is `kata-mshv-vm-isolation`.

2. Deploy the Kubernetes pod by running the [kubectl apply][kubectl-apply] command and specify your *untrusted-app.yaml* file:

    ```bash
    kubectl apply -f untrusted-app.yaml
    ```

    The output of the command resembles the following example:

    ```output
    pod/untrusted created
    ```

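Optionally, you can confirm that the pod was admitted with the Kata runtime class. This extra check isn't part of the original steps; it reads the *runtimeClassName* field back from the pod spec:

```bash
kubectl get pod untrusted -o jsonpath='{.spec.runtimeClassName}'
```

The command should print `kata-mshv-vm-isolation`.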
## Verify Kernel Isolation configuration

1. To access a container inside the AKS cluster, start a shell session by running the [kubectl exec][kubectl-exec] command. In this example, you're accessing the container inside the *untrusted* pod.

    ```bash
    kubectl exec -it untrusted -- /bin/bash
    ```

    Kubectl connects to your cluster, runs `/bin/bash` inside the first container within the *untrusted* pod, and forwards your terminal's input and output streams to the container's process. You can also start a shell session to the container hosting the *trusted* pod.

2. After starting a shell session to the container of the *untrusted* pod, you can run commands to verify that the *untrusted* container is running in a pod sandbox. You'll notice that it has a different kernel version compared to the *trusted* container outside the sandbox.

    To see the kernel version, run the following command:

    ```bash
    uname -r
    ```

    The following example resembles output from the pod sandbox kernel:

    ```output
    root@untrusted:/# uname -r
    5.15.48.1-8.cm2
    ```

3. Start a shell session to the container of the *trusted* pod to verify the kernel output:

    ```bash
    kubectl exec -it trusted -- /bin/bash
    ```

    To see the kernel version, run the following command:

    ```bash
    uname -r
    ```

    The following example resembles output from the VM that is running the *trusted* pod, which is a different kernel than the *untrusted* pod running within the pod sandbox:

    ```output
    5.15.80.mshv2-hvl1.m2
    ```

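The same comparison can be made non-interactively from your workstation. These two commands are a suggested shortcut rather than part of the original steps; if sandboxing is working, they print two different kernel versions:

```bash
kubectl exec untrusted -- uname -r
kubectl exec trusted -- uname -r
```
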
## Cleanup

When you're finished evaluating this feature, to avoid Azure charges, clean up your unnecessary resources. If you deployed a new cluster as part of your evaluation or testing, you can delete the cluster using the [az aks delete][az-aks-delete] command.

```azurecli
az aks delete --resource-group myResourceGroup --name myAKSCluster
```

If you enabled Pod Sandboxing (preview) on an existing cluster, you can remove the pods using the [kubectl delete pod][kubectl-delete-pod] command. Replace *pod-name* with the name of the pod to delete, such as *untrusted*.

```bash
kubectl delete pod pod-name
```

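If you added a dedicated node pool to an existing cluster for this evaluation, you can alternatively remove just that node pool. The cluster and pool names below match the earlier examples:

```azurecli
az aks nodepool delete --cluster-name myAKSCluster --resource-group myResourceGroup --name nodepool2
```
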
## Next steps

* Learn more about [Azure Dedicated hosts][azure-dedicated-hosts] for nodes with your AKS cluster to use hardware isolation and control over Azure platform maintenance events.

<!-- EXTERNAL LINKS -->
[kata-containers-overview]: https://katacontainers.io/
[kubectl]: https://kubernetes.io/docs/user-guide/kubectl/
[azurerm-mariner]: https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/kubernetes_cluster_node_pool#os_sku
[kubectl-get-pods]: https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#get
[kubectl-exec]: https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#exec
[container-resource-manifest]: https://kubernetes.io/docs/tasks/configure-pod-container/assign-cpu-resource/
[kubectl-delete-pod]: https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#delete
[kubectl-apply]: https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#apply
[kata-network-limitations]: https://github.com/kata-containers/kata-containers/blob/main/docs/Limitations.md#host-network
[cloud-hypervisor]: https://www.cloudhypervisor.org
[kata-container]: https://katacontainers.io

<!-- INTERNAL LINKS -->
[install-azure-cli]: /cli/azu
[az-feature-register]: /cli/azure/feature#az_feature_register
[az-provider-register]: /cli/azure/provider#az-provider-register
[az-feature-show]: /cli/azure/feature#az-feature-show
[aks-get-credentials]: /cli/azure/aks#az-aks-get-credentials
[az-aks-create]: /cli/azure/aks#az-aks-create
[az-deployment-group-create]: /cli/azure/deployment/group#az-deployment-group-create
[connect-to-aks-cluster-nodes]: node-access.md
[dv3-series]: ../virtual-machines/dv3-dsv3-series.md#dsv3-series
[az-aks-nodepool-add]: /cli/azure/aks/nodepool#az-aks-nodepool-add
[create-ssh-public-key-linux]: ../virtual-machines/linux/mac-create-ssh-keys.md
[az-aks-delete]: /cli/azure/aks#az-aks-delete
[cvm-on-aks]: use-cvm.md
[azure-dedicated-hosts]: use-azure-dedicated-hosts.md
[container-insights]: ../azure-monitor/containers/container-insights-overview.md
[defender-for-containers]: ../defender-for-cloud/defender-for-containers-introduction.md
[az-aks-install-cmd]: /cli/azure/aks#az-aks-install-cli
[mariner-overview]: use-mariner.md
[csi-storage-driver]: csi-storage-drivers.md
[csi-secret-store driver]: csi-secrets-store-driver.md
[az-aks-update]: /cli/azure/aks#az-aks-update
[mariner-cluster-config]: cluster-configuration.md#mariner-os
[register-the-katavmisolationpreview-feature-flag]: #register-the-katavmisolationpreview-feature-flag