
Commit de1e17e

Kubernetes: add local storage (#1100)
* Update longhorn README: document how to perform (Kubernetes) node maintenance
* Update Longhorn README: disks config and maintenance
* Kubernetes: add local storage. Use topolvm as the most mature local storage CSI.
* Update longhorn readme
1 parent 0387102 commit de1e17e

File tree

3 files changed

+177
-1
lines changed


charts/longhorn/README.md

Lines changed: 28 additions & 1 deletion
@@ -27,7 +27,15 @@ Source:
 
 ### How to configure disks for LH
 
-As of now, we follow the same approach we use for the `/docker` folder (via ansible playbook) but we use the `/longhorn` folder name
+Manual configuration performed (to be moved to ansible):
+1. Create a partition on the disk
+   * e.g. via `fdisk`: https://phoenixnap.com/kb/linux-create-partition
+2. Format the partition as XFS
+   * `sudo mkfs.xfs -f /dev/sda1`
+3. Mount the partition: `sudo mount -t xfs /dev/sda1 /longhorn`
+4. Persist the mount in `/etc/fstab` by adding the line
+   * `UUID=<partition's uuid> /longhorn xfs pquota 0 0`
+   * the UUID can be obtained from `lsblk -f`
 
 Issue asking LH to clearly document requirements: https://github.com/longhorn/longhorn/issues/11125
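The manual steps above can be sketched as a small script. This is a dry-run sketch that only prints the commands it would run (nothing is executed, since `mkfs` is destructive); the device `/dev/sda1` and the `/longhorn` mount point are taken from the examples above and are assumptions to adjust per host.

```shell
#!/usr/bin/env bash
# Dry-run sketch of the Longhorn disk setup steps above.
# Assumptions: /dev/sda1 is the prepared partition, /longhorn is the mount point.
set -euo pipefail

DEV=${DEV:-/dev/sda1}
MNT=${MNT:-/longhorn}

PLAN=""
run() { PLAN+="$*"$'\n'; echo "+ $*"; }   # collect commands instead of executing them

run sudo mkfs.xfs -f "$DEV"               # step 2: format the partition as XFS
run sudo mkdir -p "$MNT"
run sudo mount -t xfs "$DEV" "$MNT"       # step 3: mount it
# step 4: persist the mount; the real UUID would come from `lsblk -no UUID $DEV`
UUID_PLACEHOLDER='<partition-uuid>'
run "echo 'UUID=${UUID_PLACEHOLDER} ${MNT} xfs pquota 0 0' >> /etc/fstab"
```

Swapping the `echo` inside `run` for real execution turns the sketch into an actual provisioning script, which is also the natural shape for the planned ansible task.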

@@ -54,3 +62,22 @@ Insights into LH's performance:
 
 Resource requirements:
 * https://github.com/longhorn/longhorn/issues/1691
+
+### (Kubernetes) Node maintenance
+
+https://longhorn.io/docs/1.8.1/maintenance/maintenance/
+
+Note: you can use the Longhorn GUI to perform some of these operations
+
+### Zero-downtime update of Longhorn disks (procedure)
+Notes:
+* Update one node at a time so that the other nodes can still serve data
+
+1. Go to the LH GUI and select a node
+   1. Disable scheduling
+   2. Request eviction
+2. Remove the disk from the node
+   * If the remove icon is disabled, disable eviction on the disk to enable the remove button
+3. Perform the disk updates on the node
+4. Make sure LH didn't pick up the wrongly configured disk in the meantime; remove the wrong disk if it did
+5. Wait till LH automatically adds the disk back to the node
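The GUI-driven procedure above can be cross-checked from the CLI. A dry-run sketch (it only prints the inspection commands), assuming Longhorn runs in the `longhorn-system` namespace and using a hypothetical node name `worker-1`; Longhorn exposes node/disk state via its `nodes.longhorn.io` CRD:

```shell
#!/usr/bin/env bash
# Dry-run sketch: inspect Longhorn state while updating disks one node at a time.
# Assumptions: namespace longhorn-system; "worker-1" is a placeholder node name.
set -euo pipefail

NODE=${NODE:-worker-1}

PLAN=""
run() { PLAN+="$*"$'\n'; echo "+ $*"; }   # print instead of executing

# Check the scheduling/eviction state of the node and its disks
run kubectl -n longhorn-system get nodes.longhorn.io "$NODE" -o yaml
# Verify no replica is left on the node before touching its disks
run kubectl -n longhorn-system get replicas.longhorn.io -o wide
```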

charts/topolvm/README.md

Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
+## topolvm components and architecture
+See the diagram: https://github.com/topolvm/topolvm/blob/topolvm-chart-v15.5.5/docs/design.md
+
+## Prerequisites
+`topolvm` does not automatically create Volume Groups (specified in device-classes). These need to be configured separately (e.g. manually, via ansible, ...)
+
+Manual example (Ubuntu 22.04):
+1. Create a partition to use later (`sudo fdisk /dev/sda`)
+2. Create the PV (`sudo pvcreate /dev/sda2`)
+   * Prerequisite: `sudo apt install lvm2`
+3. Create the Volume Group (`sudo vgcreate topovg-sdd /dev/sda2`)
+   * Note: the Volume Group's name must correspond to the `volume-group` setting inside `lvmd.deviceClasses`
+4. Check the Volume Group (`sudo vgdisplay`)
+
+Source: https://github.com/topolvm/topolvm/blob/topolvm-chart-v15.5.5/docs/getting-started.md#prerequisites
+
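The VG preparation above can be sketched as a dry-run script (it only prints the commands). The device `/dev/sda2` and the VG name `topovg-sdd` come from the example above; the VG name must match `lvmd.deviceClasses[].volume-group` in the chart values.

```shell
#!/usr/bin/env bash
# Dry-run sketch of the TopoLVM prerequisite setup (Ubuntu 22.04).
# Assumptions: /dev/sda2 is the partition; topovg-sdd matches the chart's
# lvmd.deviceClasses volume-group setting.
set -euo pipefail

DEV=${DEV:-/dev/sda2}
VG=${VG:-topovg-sdd}

PLAN=""
run() { PLAN+="$*"$'\n'; echo "+ $*"; }   # print instead of executing

run sudo apt install -y lvm2      # prerequisite for pvcreate/vgcreate
run sudo pvcreate "$DEV"          # step 2: create the physical volume
run sudo vgcreate "$VG" "$DEV"    # step 3: create the volume group
run sudo vgdisplay "$VG"          # step 4: verify
```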
+## Deleting PV(C)s with the `retain` reclaim policy
+1. Delete the release (e.g. `helm uninstall -n test test`)
+2. Find the LogicalVolume CR (`kubectl get logicalvolumes.topolvm.io`)
+3. Delete the LogicalVolume CR (`kubectl delete logicalvolumes.topolvm.io <lv-name>`)
+4. Delete the PV (`kubectl delete pv <pv-name>`)
+
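The clean-up steps above, collected into one dry-run sketch (it only prints the commands; `<lv-name>` and `<pv-name>` stay placeholders to be filled from the `kubectl get` output):

```shell
#!/usr/bin/env bash
# Dry-run sketch: full clean-up of a retained TopoLVM PV after release removal.
# Assumptions: release "test" in namespace "test", as in the example above.
set -euo pipefail

PLAN=""
run() { PLAN+="$*"$'\n'; echo "+ $*"; }   # print instead of executing

run helm uninstall -n test test                            # 1. delete the release
run kubectl get logicalvolumes.topolvm.io                  # 2. find the LogicalVolume CR
run kubectl delete logicalvolumes.topolvm.io '<lv-name>'   # 3. frees the LVM volume on the node
run kubectl delete pv '<pv-name>'                          # 4. the PV must be removed manually
```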
+## Backup / Snapshotting
+1. Snapshots are only possible when using thin provisioning
+2. We use thick (non-thin-provisioned) volumes --> no snapshot support
+
+Track this feature request for changes: https://github.com/topolvm/topolvm/issues/1070
+
+Note: there might be alternative, undocumented ways (e.g. via Velero)
+
+## Resizing PVs
+1. Update the storage capacity in the configuration
+2. Deploy the changes
+
+Note: the storage size can only be increased. Otherwise, one gets the `Forbidden: field can not be less than previous value` error
+
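Under the hood, "update the configuration and deploy" amounts to raising `spec.resources.requests.storage` on the PVC. A dry-run sketch of the equivalent direct patch, with hypothetical names (`data-test-0`, namespace `test`, new size `20Gi`); it only prints the command:

```shell
#!/usr/bin/env bash
# Dry-run sketch: growing a TopoLVM-backed PVC directly.
# Assumptions: PVC "data-test-0" in namespace "test" are placeholder names.
set -euo pipefail

PVC=${PVC:-data-test-0}
NS=${NS:-test}
NEW_SIZE=${NEW_SIZE:-20Gi}

PLAN=""
run() { PLAN+="$*"$'\n'; echo "+ $*"; }   # print instead of executing

run kubectl -n "$NS" patch pvc "$PVC" --type merge \
  -p "{\"spec\":{\"resources\":{\"requests\":{\"storage\":\"${NEW_SIZE}\"}}}}"
# Requesting a smaller size instead would be rejected by the API server with
# "Forbidden: field can not be less than previous value".
```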
+## Node maintenance
+
+Read https://github.com/topolvm/topolvm/blob/topolvm-chart-v15.5.5/docs/node-maintenance.md
+
+## Using topolvm: notes
+* `topolvm` may not work with pods that define `spec.nodeName`; use node affinity instead:
+  https://github.com/topolvm/topolvm/blob/main/docs/faq.md#the-pod-does-not-start-when-nodename-is-specified-in-the-pod-spec

charts/topolvm/values.yaml.gotmpl

Lines changed: 106 additions & 0 deletions
@@ -0,0 +1,106 @@
+lvmd:
+  # set up the lvmd service with a DaemonSet
+  managed: true
+
+  # device classes (VGs) need to be created outside of topolvm (e.g. manually, via ansible, ...)
+  deviceClasses:
+    - name: ssd
+      volume-group: topovg-sdd
+      default: true
+      spare-gb: 5
+
+storageClasses:
+  - name: {{ .Values.topolvmStorageClassName }}
+    storageClass:
+      # Want to use a non-default device class?
+      # See the configuration example in
+      # https://github.com/topolvm/topolvm/blob/topolvm-chart-v15.5.5/docs/snapshot-and-restore.md#set-up-a-storage-class
+
+      fsType: xfs
+      isDefaultClass: false
+      # volumeBindingMode can be either WaitForFirstConsumer or Immediate.
+      # WaitForFirstConsumer is recommended because TopoLVM cannot schedule
+      # pods wisely if volumeBindingMode is Immediate.
+      volumeBindingMode: WaitForFirstConsumer
+      allowVolumeExpansion: true
+      # NOTE: removal requires manual clean-up of PVs, LVM volumes
+      # and Logical Volumes (CR logicalvolumes.topolvm.io).
+      # Removing the Logical Volume (CR) cleans up the LVM volume on the node,
+      # but the PV still has to be removed manually.
+      # Read more: https://github.com/topolvm/topolvm/blob/topolvm-chart-v15.5.5/docs/advanced-setup.md#storageclass
+      reclaimPolicy: Retain
+
+resources:
+  topolvm_node:
+    requests:
+      memory: 100Mi
+      cpu: 100m
+    limits:
+      memory: 500Mi
+      cpu: 500m
+
+  topolvm_controller:
+    requests:
+      memory: 50Mi
+      cpu: 50m
+    limits:
+      memory: 200Mi
+      cpu: 200m
+
+  lvmd:
+    requests:
+      memory: 100Mi
+      cpu: 100m
+    limits:
+      memory: 500Mi
+      cpu: 500m
+
+  csi_registrar:
+    requests:
+      cpu: 25m
+      memory: 10Mi
+    limits:
+      cpu: 200m
+      memory: 200Mi
+
+  csi_provisioner:
+    requests:
+      memory: 50Mi
+      cpu: 50m
+    limits:
+      memory: 200Mi
+      cpu: 200m
+
+  csi_resizer:
+    requests:
+      memory: 50Mi
+      cpu: 50m
+    limits:
+      memory: 200Mi
+      cpu: 200m
+
+  csi_snapshotter:
+    requests:
+      memory: 50Mi
+      cpu: 50m
+    limits:
+      memory: 200Mi
+      cpu: 200m
+
+  liveness_probe:
+    requests:
+      cpu: 25m
+      memory: 10Mi
+    limits:
+      cpu: 200m
+      memory: 200Mi
+
+# https://github.com/topolvm/topolvm/blob/topolvm-chart-v15.5.5/docs/topolvm-scheduler.md
+scheduler:
+  # start simple
+  enabled: false
+
+cert-manager:
+  # start simple
+  enabled: false
+
+snapshot:
+  enabled: true
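A minimal consumer of the storage class defined above might look like the following. Dry-run sketch: `topolvm-ssd` is an assumed rendered value of `{{ .Values.topolvmStorageClassName }}`, and the script only prints the manifest rather than applying it.

```shell
#!/usr/bin/env bash
# Dry-run sketch: a PVC bound to the TopoLVM storage class configured above.
# Assumption: the rendered storage class name is "topolvm-ssd" (placeholder).
set -euo pipefail

SC=${SC:-topolvm-ssd}

MANIFEST=$(cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-pvc
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: ${SC}
  resources:
    requests:
      storage: 1Gi
EOF
)
echo "$MANIFEST"
# With volumeBindingMode: WaitForFirstConsumer, this PVC stays Pending until a
# pod that uses it is scheduled. Apply with: kubectl apply -f - <<< "$MANIFEST"
```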
