diff --git a/charts/longhorn/README.md b/charts/longhorn/README.md
index 1bae02be..fb26f649 100644
--- a/charts/longhorn/README.md
+++ b/charts/longhorn/README.md
@@ -27,7 +27,15 @@ Source:
 
 ### How to configure disks for LH
 
-As of now, we follow the same approach we use for `/docker` folder (via ansible playbook) but we use `/longhorn` folder name
+Manual configuration performed (to be moved to ansible):
+1. Create a partition on the disk
+   * e.g. using `fdisk`, see https://phoenixnap.com/kb/linux-create-partition
+2. Format the partition as XFS
+   * `sudo mkfs.xfs -f /dev/sda1`
+3. Mount the partition: `sudo mount -t xfs /dev/sda1 /longhorn`
+4. Persist the mount in `/etc/fstab` by adding the line
+   * `UUID=<uuid> /longhorn xfs pquota 0 0`
+   * the UUID can be obtained from `lsblk -f`
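+
+A consolidated sketch of the steps above (assumptions: the target disk is `/dev/sda` and the new partition ends up as `/dev/sda1`; `parted` is used non-interactively instead of the interactive `fdisk` session):
+
+```bash
+#!/usr/bin/env bash
+set -euo pipefail
+
+DISK=/dev/sda        # assumption: target disk; verify with `lsblk` first
+PART=/dev/sda1       # assumption: the partition created in step 1
+MOUNTPOINT=/longhorn
+
+# 1. Create a single partition spanning the whole disk
+sudo parted --script "$DISK" mklabel gpt mkpart primary xfs 0% 100%
+
+# 2. Format the partition as XFS
+sudo mkfs.xfs -f "$PART"
+
+# 3. Mount it
+sudo mkdir -p "$MOUNTPOINT"
+sudo mount -t xfs "$PART" "$MOUNTPOINT"
+
+# 4. Persist the mount in /etc/fstab via the partition's UUID
+UUID=$(sudo blkid -s UUID -o value "$PART")
+echo "UUID=$UUID $MOUNTPOINT xfs pquota 0 0" | sudo tee -a /etc/fstab
+```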
 
 Issue asking LH to clearly document requirements: https://github.com/longhorn/longhorn/issues/11125
 
@@ -54,3 +62,22 @@ Insights into LH's performance:
 
 Resource requirements:
 * https://github.com/longhorn/longhorn/issues/1691
+
+### (Kubernetes) Node maintenance
+
+https://longhorn.io/docs/1.8.1/maintenance/maintenance/
+
+Note: you can use the Longhorn GUI to perform some of these operations
+
+### Zero-downtime update of Longhorn disks (procedure)
+Notes:
+* Update one node at a time so that the other nodes can still serve data
+
+1. Go to the LH GUI and select a Node
+   1. Disable scheduling
+   2. Request eviction
+2. Remove the disk from the node
+   * If the remove icon is disabled, disable eviction on the disk to enable the remove button
+3. Perform the disk updates on the node
+4. Make sure LH didn't pick up a wrongly configured disk in the meantime; remove the wrong disk if it did
+5. Wait until LH automatically adds the disk to the Node
diff --git a/charts/topolvm/README.md b/charts/topolvm/README.md
new file mode 100644
index 00000000..849df697
--- /dev/null
+++ b/charts/topolvm/README.md
@@ -0,0 +1,43 @@
+## topolvm components and architecture
+See diagram https://github.com/topolvm/topolvm/blob/topolvm-chart-v15.5.5/docs/design.md
+
+## Prerequisites
+`topolvm` does not automatically create the Volume Groups specified in its device-classes. They need to be created separately (e.g. manually, via ansible, ...)
+
+Manual example (Ubuntu 22.04):
+1. Create a partition to use later (`sudo fdisk /dev/sda`)
+2. Create a PV (`sudo pvcreate /dev/sda2`)
+   * Prerequisite: `sudo apt install lvm2`
+3. Create a Volume Group (`sudo vgcreate topovg-sdd /dev/sda2`)
+   * Note: the Volume Group's name must match the `volume-group` setting inside `lvmd.deviceClasses`
+4. Check the Volume Group (`sudo vgdisplay`)
+
+Source: https://github.com/topolvm/topolvm/blob/topolvm-chart-v15.5.5/docs/getting-started.md#prerequisites
+
+## Deleting PV(C)s with `Retain` reclaim policy
+1. Delete the release (e.g. `helm uninstall -n test test`)
+2. Find the LogicalVolume CR (`kubectl get logicalvolumes.topolvm.io`)
+3. Delete the LogicalVolume CR (`kubectl delete logicalvolumes.topolvm.io <name>`)
+4. Delete the PV (`kubectl delete pv <name>`)
+
+## Backup / Snapshotting
+1. Only possible while using thin provisioning
+2. We use thick (non-thin provisioned) volumes --> no snapshot support
+
+   Track this feature request for changes: https://github.com/topolvm/topolvm/issues/1070
+
+Note: there might be alternative, undocumented ways (e.g. via Velero)
+
+## Resizing PVs
+1. Update the storage capacity in the configuration
+2. Deploy the changes
+
+Note: the storage size can only be increased.
+Otherwise, one gets a `Forbidden: field can not be less than previous value` error
+
+## Node maintenance
+
+Read https://github.com/topolvm/topolvm/blob/topolvm-chart-v15.5.5/docs/node-maintenance.md
+
+## Using topolvm: notes
+* `topolvm` may not work with pods that define `spec.nodeName`. Use node affinity instead:
+  https://github.com/topolvm/topolvm/blob/main/docs/faq.md#the-pod-does-not-start-when-nodename-is-specified-in-the-pod-spec
diff --git a/charts/topolvm/values.yaml.gotmpl b/charts/topolvm/values.yaml.gotmpl
new file mode 100644
index 00000000..216d54ef
--- /dev/null
+++ b/charts/topolvm/values.yaml.gotmpl
@@ -0,0 +1,106 @@
+lvmd:
+  # set up the lvmd service as a DaemonSet
+  managed: true
+
+  # device classes (VGs) need to be created outside of topolvm (e.g. manually, via ansible, ...)
+  deviceClasses:
+    - name: ssd
+      volume-group: topovg-sdd
+      default: true
+      spare-gb: 5
+
+storageClasses:
+  - name: {{ .Values.topolvmStorageClassName }}
+    storageClass:
+      # Want to use a non-default device class?
+      # See the configuration example in
+      # https://github.com/topolvm/topolvm/blob/topolvm-chart-v15.5.5/docs/snapshot-and-restore.md#set-up-a-storage-class
+
+      fsType: xfs
+      isDefaultClass: false
+      # volumeBindingMode can be either WaitForFirstConsumer or Immediate.
+      # WaitForFirstConsumer is recommended because TopoLVM cannot schedule
+      # pods wisely if volumeBindingMode is Immediate.
+      volumeBindingMode: WaitForFirstConsumer
+      allowVolumeExpansion: true
+      # NOTE: removal requires manual clean-up of PVs, LVM volumes
+      # and Logical Volumes (CR logicalvolumes.topolvm.io).
+      # Deleting the Logical Volume (CR) cleans up the LVM volume on the node,
+      # but the PV still has to be removed manually.
+      # Read more: https://github.com/topolvm/topolvm/blob/topolvm-chart-v15.5.5/docs/advanced-setup.md#storageclass
+      reclaimPolicy: Retain
+
+resources:
+  topolvm_node:
+    requests:
+      memory: 100Mi
+      cpu: 100m
+    limits:
+      memory: 500Mi
+      cpu: 500m
+
+  topolvm_controller:
+    requests:
+      memory: 50Mi
+      cpu: 50m
+    limits:
+      memory: 200Mi
+      cpu: 200m
+
+  lvmd:
+    requests:
+      memory: 100Mi
+      cpu: 100m
+    limits:
+      memory: 500Mi
+      cpu: 500m
+
+  csi_registrar:
+    requests:
+      cpu: 25m
+      memory: 10Mi
+    limits:
+      cpu: 200m
+      memory: 200Mi
+
+  csi_provisioner:
+    requests:
+      memory: 50Mi
+      cpu: 50m
+    limits:
+      memory: 200Mi
+      cpu: 200m
+
+  csi_resizer:
+    requests:
+      memory: 50Mi
+      cpu: 50m
+    limits:
+      memory: 200Mi
+      cpu: 200m
+
+  csi_snapshotter:
+    requests:
+      memory: 50Mi
+      cpu: 50m
+    limits:
+      memory: 200Mi
+      cpu: 200m
+
+  liveness_probe:
+    requests:
+      cpu: 25m
+      memory: 10Mi
+    limits:
+      cpu: 200m
+      memory: 200Mi
+
+# https://github.com/topolvm/topolvm/blob/topolvm-chart-v15.5.5/docs/topolvm-scheduler.md
+scheduler:
+  # start simple
+  enabled: false
+
+cert-manager:
+  # start simple
+  enabled: false
+
+snapshot:
+  enabled: true
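+
+# Hypothetical usage sketch (not consumed by the chart): a PVC bound to the storage
+# class defined above, assuming `topolvmStorageClassName` renders to "topolvm-ssd".
+#
+#   apiVersion: v1
+#   kind: PersistentVolumeClaim
+#   metadata:
+#     name: example-pvc
+#   spec:
+#     accessModes: ["ReadWriteOnce"]
+#     storageClassName: topolvm-ssd
+#     resources:
+#       requests:
+#         storage: 10Gi
+#
+# With volumeBindingMode WaitForFirstConsumer the PVC stays Pending until a pod
+# using it is scheduled; only then does TopoLVM carve an LV out of the volume group.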