@Carus11 Carus11 commented May 17, 2024

Fixes #363

This PR provides functionality to set the OS Disk Type and Kubelet Disk Type options as documented in:
https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/kubernetes_cluster_node_pool#kubelet_disk_type

This lets the user choose to use the local temporary storage available on some nodes for the operating system, the kubelet, or both.
The benefit is that the local temporary disk is typically a high-performance storage medium. If the kubelet is set to use this fast storage, local emptyDir volumes can be used for workloads such as SASWORK. Previously, many users relied on alternatives such as hostPath to reach this storage, which security policies often do not permit.
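
As a rough illustration of the emptyDir approach described above, the sketch below builds a Pod manifest as a Python dict and prints it as JSON, which kubectl accepts as well as YAML. The pod name, image, and mount path are hypothetical placeholders and are not taken from this PR.

```python
import json


def scratch_pod_manifest(name: str, image: str, mount_path: str) -> dict:
    """Build a Pod manifest that mounts an emptyDir volume at mount_path."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "volumeMounts": [
                        {"name": "scratch", "mountPath": mount_path}
                    ],
                }
            ],
            # emptyDir data is stored under the kubelet directory on the node,
            # so it lands on whichever disk backs /var/lib/kubelet.
            "volumes": [{"name": "scratch", "emptyDir": {}}],
        },
    }


# Hypothetical values, for illustration only.
manifest = scratch_pod_manifest("scratch-demo", "busybox:1.36", "/scratch")
print(json.dumps(manifest, indent=2))
```

Unlike hostPath, the emptyDir volume does not name a path on the node, so the workload gets the fast disk only because the kubelet directory has been placed there.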

OS Disk type can be either Managed or Ephemeral.
Kubelet Disk type can be either OS or Temporary.
The default behaviour, which also applies when the options are not set, is Managed for the OS disk and OS for the kubelet disk. When the options are set to Ephemeral and Temporary respectively, the local temporary storage is used if it is available on the node.
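
The allowed values and defaults above can be sketched as a small check. This is only a Python illustration of the rules as stated in this description, not the Terraform validation in the PR, and the function and constant names are hypothetical.

```python
OS_DISK_TYPES = ("Managed", "Ephemeral")
KUBELET_DISK_TYPES = ("OS", "Temporary")


def resolve_disk_types(os_disk_type=None, kubelet_disk_type=None):
    """Apply the defaults (Managed, OS) and reject unknown values."""
    os_type = os_disk_type if os_disk_type is not None else "Managed"
    kubelet_type = kubelet_disk_type if kubelet_disk_type is not None else "OS"
    if os_type not in OS_DISK_TYPES:
        raise ValueError(f"os_disk_type must be one of {OS_DISK_TYPES}")
    if kubelet_type not in KUBELET_DISK_TYPES:
        raise ValueError(f"kubelet_disk_type must be one of {KUBELET_DISK_TYPES}")
    return os_type, kubelet_type


print(resolve_disk_types())                          # nothing set
print(resolve_disk_types("Ephemeral", "Temporary"))  # both options set
```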

Test 1

Original state. The example tfvars file specifies a 200 GB OS disk. The kubelet therefore shares this managed disk with the OS, which places both the Kubernetes emptyDir volumes and the container image storage on the same disk as the OS.

Name       OsDiskType    KubeletDiskType
---------  ------------  -----------------
system     Managed       OS
compute    Managed       OS
stateless  Managed       OS
cas        Managed       OS
stateful   Managed       OS

Test 2

Functionality added, but nothing added to the tfvars file.

Name       OsDiskType    KubeletDiskType
---------  ------------  -----------------
system     Managed       OS
stateful   Managed       OS
stateless  Managed       OS
compute    Managed       OS
cas        Managed       OS

Test 3

Compute node pool changed to a VM size that has temporary storage (Standard_E16ds_v5), OS disk reduced to 80 GB, and both the OS and kubelet disk types switched to their non-default options.

Name       OsDiskType    KubeletDiskType
---------  ------------  -----------------
system     Managed       OS
stateful   Managed       OS
stateless  Managed       OS
cas        Managed       OS
compute    Ephemeral     Temporary

Notice that the OS has carved 80 GB off the (as advertised) 600 GB disk, and the kubelet has the remaining space.

root@aks-compute-37429462-vmss000000:/# lsblk -f
NAME    FSTYPE FSVER LABEL           UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
sda
|-sda1  ext4   1.0   cloudimg-rootfs 2c316610-0ba2-47f0-8c70-5162294156ed   56.2G    27% /
|-sda14
`-sda15 vfat   FAT32 UEFI            51A8-FB00                              98.3M     6% /boot/efi
sdb
`-sdb1  ext4   1.0                   e2269d85-dc0c-4cae-b7b1-4577761f17f7  483.7G     0% /var/lib/kubelet
                                                                                         /mnt
sr0
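
To check which device backs /var/lib/kubelet without reading the tree by eye, the listing above can be parsed with a few lines of Python. This is a minimal sketch that assumes the `lsblk -f` text layout shown here (mount point in the last column, extra mount points on continuation lines). The helper names are hypothetical.

```python
SAMPLE = """\
NAME    FSTYPE FSVER LABEL           UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
sda
|-sda1  ext4   1.0   cloudimg-rootfs 2c316610-0ba2-47f0-8c70-5162294156ed   56.2G    27% /
|-sda14
`-sda15 vfat   FAT32 UEFI            51A8-FB00                              98.3M     6% /boot/efi
sdb
`-sdb1  ext4   1.0                   e2269d85-dc0c-4cae-b7b1-4577761f17f7  483.7G     0% /var/lib/kubelet
                                                                                         /mnt
sr0
"""


def mountpoints_by_device(lsblk_text: str) -> dict:
    """Map each device name to the mount points listed for it."""
    devices = {}
    current = None
    for line in lsblk_text.splitlines():
        tokens = line.split()
        if not tokens or tokens[0] == "NAME":
            continue  # skip blank lines and the header row
        name = tokens[0].lstrip("|`-")  # drop the tree-drawing characters
        # A device row starts in column 0 and has a name left after stripping;
        # anything else is a continuation line carrying an extra mount point.
        if not line[0].isspace() and name:
            current = name
            devices[current] = []
        if current is not None and tokens[-1].startswith("/"):
            devices[current].append(tokens[-1])
    return devices


def device_for(mountpoint: str, devices: dict):
    """Return the first device that lists the given mount point, or None."""
    for name, mounts in devices.items():
        if mountpoint in mounts:
            return name
    return None


print(device_for("/var/lib/kubelet", mountpoints_by_device(SAMPLE)))
```

On a node with a recent util-linux, `lsblk --json` is probably a sturdier input than the text table; the text parser is used here only because it matches the output pasted in this PR.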

Test 4

Compute OS disk type set to Ephemeral, and kubelet disk type set to OS.

Name       OsDiskType    KubeletDiskType
---------  ------------  -----------------
system     Managed       OS
compute    Ephemeral     OS
stateful   Managed       OS
stateless  Managed       OS
cas        Managed       OS

It looks like both the kubelet and the OS share the same OS disk, squeezing everything into the 80 GB that is carved off from the local temporary disk.

root@aks-compute-34891476-vmss000000:/# lsblk -f
NAME    FSTYPE FSVER LABEL           UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
sda
|-sda1  ext4   1.0   cloudimg-rootfs 2c316610-0ba2-47f0-8c70-5162294156ed   57.2G    26% /var/lib/kubelet
|                                                                                        /
|-sda14
`-sda15 vfat   FAT32 UEFI            51A8-FB00                              98.3M     6% /boot/efi
sdb
`-sdb1  ext4   1.0                   89f157fc-0328-4eac-a8ff-53aaa01fd11c  484.8G     0% /mnt
sr0

Nothing of note in /mnt:

root@aks-compute-34891476-vmss000000:/# ls /mnt
DATALOSS_WARNING_README.txt  lost+found

Test 5

Set the OS Disk to Managed, and the Kubelet disk to Temporary

Name       OsDiskType    KubeletDiskType
---------  ------------  -----------------
system     Managed       OS
stateful   Managed       OS
stateless  Managed       OS
cas        Managed       OS
compute    Managed       Temporary

We can see that kubelet is on sdb1, and it has almost all of the local temporary disk for use.

root@aks-compute-27846506-vmss000000:/# lsblk -f
NAME    FSTYPE FSVER LABEL           UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
sda
|-sda1  ext4   1.0   cloudimg-rootfs 2c316610-0ba2-47f0-8c70-5162294156ed   56.2G    27% /
|-sda14
`-sda15 vfat   FAT32 UEFI            51A8-FB00                              98.3M     6% /boot/efi
sdb
`-sdb1  ext4   1.0                   8a41e4e7-38ce-43b7-b7d9-48565e46c4b9  558.5G     0% /var/lib/kubelet
                                                                                         /mnt
sr0
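
The placements observed across the tests can be summarised as a small lookup. This sketch only encodes what the lsblk output in this PR showed on a Standard_E16ds_v5 node. It is not a statement of how AKS behaves in general, and the function name is hypothetical.

```python
def kubelet_backing_disk(os_disk_type: str, kubelet_disk_type: str) -> str:
    """Describe where /var/lib/kubelet was observed for each combination."""
    if kubelet_disk_type == "Temporary":
        # Tests 3 and 5: kubelet mounted from sdb1, the local temporary disk.
        return "local temporary disk"
    if os_disk_type == "Ephemeral":
        # Test 4: kubelet shares the OS disk carved off the temporary disk.
        return "ephemeral OS disk"
    # Tests 1 and 2: kubelet shares the managed OS disk.
    return "managed OS disk"


for combo in [("Managed", "OS"), ("Ephemeral", "Temporary"),
              ("Ephemeral", "OS"), ("Managed", "Temporary")]:
    print(combo, "->", kubelet_backing_disk(*combo))
```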

Carus11 added 2 commits May 17, 2024 14:04
OS Disk type can be either Managed or Temporary.
Kubelet Disk type can be either OS or Ephemeral.
The default behaviour for the option, and when not set, is to use
Managed and OS for the OS and Kubelet respectively. When set to
Temporary and Ephemeral the local temporary storage will be used if
available on the node.

Signed-off-by: Carus Kyle <[email protected]>
@riragh
Member

riragh commented May 20, 2024

@Carus11, thank you for providing the PR and your tests. However, until the feature request is accepted, this PR will not be merged.

@Carus11 Carus11 changed the base branch from main to staging January 7, 2025 15:52
@sas-grtoma sas-grtoma self-assigned this Mar 28, 2025

This PR is stale because it has been open 30 days with no activity.

@github-actions github-actions bot added the stale Open for 30 days with no activity label Apr 28, 2025
@saschjmil
Contributor

Hey @Carus11. Would you be willing to update these variables to align with the community_config_vars.md on the staging branch? The config variables should start with community_ and should be documented in that community_config_vars.md file.

As this is one of our first community contributed features, please let me know if there are ways we can make the process smoother.

@saschjmil saschjmil added enhancement New feature or request and removed stale Open for 30 days with no activity labels Apr 29, 2025
@Carus11 Carus11 marked this pull request as draft April 30, 2025 13:04
@Carus11 Carus11 force-pushed the kubelet_disk_type_temp branch from a8c21be to e6d7f67 Compare May 8, 2025 09:58
@Carus11 Carus11 marked this pull request as ready for review May 8, 2025 09:59
@saschjmil saschjmil deleted the branch sassoftware:main May 8, 2025 18:30
@saschjmil saschjmil closed this May 8, 2025
@saschjmil saschjmil reopened this May 8, 2025
@saschjmil saschjmil changed the base branch from staging to main May 8, 2025 18:43
@saschjmil
Contributor

Hey @Carus11, sorry for closing and re-opening your PR. I deleted the staging branch, which closed all PRs targeting it. I've updated your PR to point to the main branch.


This PR is stale because it has been open 30 days with no activity.

@github-actions github-actions bot added the stale Open for 30 days with no activity label Jun 19, 2025
Labels
enhancement New feature or request stale Open for 30 days with no activity
Development

Successfully merging this pull request may close these issues.

Feature Request: add kubelet_disk_type to compute node pools
4 participants