This directory houses the code that transforms raw bare-metal machines into functional Kubernetes clusters. The code in this directory depends on a LAN-accessible Proxmox VE installation (or multiple!) to create a bare-bones Kubernetes cluster using Talos.
- Terraform installed (check the providers file for the specific version requirements)
- Proxmox VE v8.0+ installed on a bare-metal machine (or more than one)
- A file named `aws-credentials` in the `talos` directory in the format:

  ```
  [default]
  aws_access_key_id = <redacted>
  aws_secret_access_key = <redacted>
  ```

  This provides authentication to the AWS S3 bucket backend that stores the TF state AND the KMS secret to decrypt the sops file.
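The backend those credentials feed into might be wired up roughly like this (a sketch only; the bucket, key, and region are placeholders, not this repo's actual values, and the `shared_credentials_files` argument assumes Terraform 1.6+):

```hcl
terraform {
  backend "s3" {
    bucket                   = "my-tf-state-bucket"        # placeholder
    key                      = "talos/terraform.tfstate"   # placeholder
    region                   = "us-east-1"                 # placeholder
    shared_credentials_files = ["./aws-credentials"]
    profile                  = "default"
  }
}
```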
- Populate `main.tf` as desired.
- Run the desired Terraform commands (e.g. `terraform plan`, `terraform apply`).
- Run this command to create the per-node Talos configs: `./create_talos_node_configs`
- Apply (or dry-run apply) the config to the desired Talos cluster nodes:

  ```shell
  # Dry run
  talosctl apply-config -n skrillex --file ./nodes/skrillex.yaml --dry-run

  # Apply
  talosctl apply-config -n skrillex --file ./nodes/skrillex.yaml
  ```

- Follow the Kubernetes bootstrapping steps defined here
- If the disk is already formatted, you'll need to "zap" it to remove the formatting before it can be added to an LVM-Thin pool: `sgdisk --zap-all /dev/<disk>`. To find the disk path, use `lsblk`.
  - The disk will be formatted if, for example, it was previously used in a storage cluster.
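Since a typo'd device name here wipes the wrong disk, it can help to wrap the zap in a small guard. A minimal sketch; `zap_disk` and its `--dry-run` mode are my own additions, not sgdisk features:

```shell
#!/usr/bin/env bash
# Guarded wrapper around sgdisk. Refuses anything that isn't a whole-disk
# /dev/sdX path (adjust the pattern for NVMe names like /dev/nvme0n1).
zap_disk() {
  local disk="$1" mode="${2:-}"
  # Reject partitions (/dev/sda1) and bare names (sda) outright.
  if [[ ! "$disk" =~ ^/dev/sd[a-z]+$ ]]; then
    echo "refusing: expected a whole-disk path like /dev/sdb, got '$disk'" >&2
    return 1
  fi
  if [[ "$mode" == "--dry-run" ]]; then
    echo "would run: sgdisk --zap-all $disk"   # print only, touch nothing
  else
    sgdisk --zap-all "$disk"
  fi
}

zap_disk /dev/sdb --dry-run
```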
- Navigate to the relevant node in the Proxmox GUI -> Disks -> LVM-Thin -> "Create: Thinpool"
- Select the new disk by block device name (`lsblk` can help show the disks available on the node). Example names: `/dev/sda`, `/dev/sdb`, `/dev/sdc`.
- Give the disk a name. I've chosen to number the disks by bay #. Example: Bay #3 -> `disk3`
- Hit "Create".

EZPZ. Once the disk is available on the node:

```shell
kubectl -n storage rollout restart deploy/rook-ceph-operator
```
- Identify the OSD(s) to remove:

  ```shell
  kubectl -n storage exec -it deploy/rook-ceph-tools -- ceph osd tree
  ```

- Mark the OSD out (starts rebalancing data away from it):

  ```shell
  kubectl -n storage exec -it deploy/rook-ceph-tools -- ceph osd out osd.<ID>
  ```

- Wait for rebalancing to complete:

  ```shell
  kubectl -n storage exec -it deploy/rook-ceph-tools -- ceph -w

  # Or check status periodically.
  # Wait until HEALTH_OK or at least no recovery/rebalancing in progress.
  kubectl -n storage exec -it deploy/rook-ceph-tools -- ceph status
  ```

- Purge the OSD (removes it from the cluster):

  ```shell
  kubectl -n storage exec -it deploy/rook-ceph-tools -- ceph osd purge osd.<ID> --yes-i-really-mean-it
  ```

- Delete the OSD deployment:

  ```shell
  kubectl -n storage delete deploy rook-ceph-osd-<ID>
  ```

- If the disk was explicitly listed in the CephCluster CR, update the spec to remove it; otherwise Rook may try to recreate the OSD.
- Clean the disk (if reusing or decommissioning):

  ```shell
  # From rook-ceph-tools or the node itself
  kubectl -n storage exec -it deploy/rook-ceph-tools -- ceph-volume lvm zap /dev/sdX --destroy
  ```

Tip: If removing multiple OSDs, do them one at a time and wait for full rebalancing between each to minimize risk and cluster load.
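To avoid fat-fingering an OSD id partway through the sequence, the steps above can first be rendered as a reviewable command plan. A sketch; `osd_removal_plan` is a hypothetical helper that only prints the commands rather than running them:

```shell
#!/usr/bin/env bash
# Print the removal sequence for one OSD id so it can be eyeballed
# (and pasted step by step) instead of typed from memory.
osd_removal_plan() {
  local id="$1"
  local tools="kubectl -n storage exec -it deploy/rook-ceph-tools --"
  cat <<EOF
${tools} ceph osd out osd.${id}
${tools} ceph status   # repeat until rebalancing is done before continuing
${tools} ceph osd purge osd.${id} --yes-i-really-mean-it
kubectl -n storage delete deploy rook-ceph-osd-${id}
EOF
}

osd_removal_plan 3
```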
A rolling upgrade can be performed with the TF provider and a standard TF workflow (talosctl also works, per the docs).
The Talos TF provider is relatively under-featured for upgrades (an example of a missing feature here), so it's best to use talosctl and follow the more production-ready upgrade path here.
There are some Kubernetes configurations, such as the kube-proxy configuration, which talosctl manages but only touches during Kubernetes bootstraps/upgrades. Here is a good example. To update a resource whose state lives entirely within Kubernetes but whose config is managed via Talos, refer to the upgrading Kubernetes section above.
Talos & Kubernetes versions are linked quite closely. If you're multiple versions behind on each:
- Upgrade them in lockstep (i.e. upgrade Talos one minor version, then Kubernetes one minor version). If this isn't done, weird stuff can start to happen. (Trust me.)
- Upgrade Talos's minor version first, then Kubernetes's minor version.
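As a concrete shape for one round of that lockstep, something like the following could emit the paired upgrade commands (a sketch; the node name, installer image tag, and version pairing are placeholders — check the Talos support matrix for real pairings):

```shell
#!/usr/bin/env bash
# Emit one Talos-then-Kubernetes upgrade step, intended to be run and
# verified one minor version at a time.
lockstep_step() {
  local talos_ver="$1" k8s_ver="$2"
  cat <<EOF
talosctl upgrade --nodes skrillex --image ghcr.io/siderolabs/installer:${talos_ver}
talosctl upgrade-k8s --to ${k8s_ver}
EOF
}

lockstep_step v1.7.6 1.30.3
```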
- Enable Talos logs to be sent to a logging endpoint, similar to this example.
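For reference, the machine-config fragment for shipping logs looks roughly like this (a sketch based on the Talos machine config schema; the endpoint address is a placeholder):

```yaml
machine:
  logging:
    destinations:
      - endpoint: "udp://192.168.1.50:6051"   # placeholder collector address
        format: json_lines
```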
I spent quite a while trying to avoid tedious manual declarations of IP addresses for each Kubernetes node. I had some success assigning MAC addresses to the VMs, reading the DHCP-assigned IPv4 addresses from the Unifi Router, and then using those in the rest of the process. Ultimately, though, it didn't work out: the Unifi Router would get confused once virtual IPs were introduced, such as the Talos Virtual IP, and begin returning the Virtual IP when I needed the direct node IP. I unfortunately had to scrap the idea. Instead, we manually assign each node an IP address and MAC address in main.tf.
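The manual assignment in `main.tf` can take a shape like this (hypothetical structure; the second node name and all IPs/MACs are placeholders, and the real values live in the actual `main.tf`):

```hcl
# Static addressing per node, referenced by the VM resources and the
# Talos config generation. All values below are placeholders.
locals {
  nodes = {
    skrillex = {
      ip  = "192.168.1.21"
      mac = "BC:24:11:00:00:01"
    }
    deadmau5 = {
      ip  = "192.168.1.22"
      mac = "BC:24:11:00:00:02"
    }
  }
}
```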