This document provides a comprehensive, step-by-step guide to deploying a production-ready Kubernetes cluster on Proxmox VE. It covers everything from initial VM setup to advanced topics like persistent storage, load balancing, certificate management, and observability.
This part covers the foundational steps of creating the virtual machines, preparing the operating system, and installing the core Kubernetes components.
Before deploying the virtual machines, configure DNS A records for each node to ensure proper name resolution.
| IP Address | Hostname |
|---|---|
| 192.168.86.32 | k8smaster.svhome.net |
| 192.168.86.37 | k8sworker01.svhome.net |
| 192.168.86.26 | k8sworker02.svhome.net |
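If you do not run a local DNS server, a minimal fallback (an assumption, not part of the original scripts) is to add the same records to `/etc/hosts` on every node and verify resolution before continuing:

```bash
# Fallback when no local DNS server is available: append the records from the
# table above to /etc/hosts on every node (and on your workstation).
cat <<'EOF' | sudo tee -a /etc/hosts
192.168.86.32  k8smaster.svhome.net    k8smaster
192.168.86.37  k8sworker01.svhome.net  k8sworker01
192.168.86.26  k8sworker02.svhome.net  k8sworker02
EOF

# Verify resolution on each node before proceeding.
getent hosts k8smaster.svhome.net
```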
Deploy three virtual machines (one master, two workers) using the configurations provided in the initial notes. Ensure the master has 4+ cores and 8GB+ RAM, and the workers have 8+ cores, 16GB+ RAM, and two separate virtual disks (e.g., 70G for the OS, 100G for storage).
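If you prefer the Proxmox CLI over the web UI, a worker VM matching the sizing above can be created roughly as follows; the VM ID, bridge (`vmbr0`), storage pool (`local-lvm`), and ISO filename are assumptions to adapt to your Proxmox environment:

```bash
# Sketch: create one worker VM from the Proxmox host shell with the qm CLI.
qm create 201 \
  --name k8sworker01 \
  --cores 8 --memory 16384 \
  --net0 virtio,bridge=vmbr0 \
  --scsihw virtio-scsi-pci \
  --scsi0 local-lvm:70 \
  --scsi1 local-lvm:100 \
  --ide2 local:iso/ubuntu-22.04-live-server-amd64.iso,media=cdrom \
  --boot 'order=scsi0;ide2' \
  --agent enabled=1
qm start 201
```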
Install Ubuntu Server 22.04 LTS (Jammy Jellyfish) on all three VMs. During installation, configure the static IPs from Step 1 and install the OpenSSH server.
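For reference, a static-IP netplan configuration for the master node might look like the sketch below; the interface name (`ens18`), gateway, and DNS server are assumptions for a typical Proxmox virtio NIC on this subnet:

```bash
# Sketch of /etc/netplan/00-installer-config.yaml for the master node.
cat <<'EOF' | sudo tee /etc/netplan/00-installer-config.yaml
network:
  version: 2
  ethernets:
    ens18:
      addresses: [192.168.86.32/24]
      routes:
        - to: default
          via: 192.168.86.1
      nameservers:
        addresses: [192.168.86.1]
        search: [svhome.net]
EOF
sudo netplan apply
```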
Run the `prepare-node.sh` script on all three nodes to update packages and install prerequisites like `qemu-guest-agent`.
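The script's contents are not reproduced here, but a node-preparation step of this kind usually amounts to something like the following sketch:

```bash
# Rough equivalent of a node-preparation script: update packages and install
# common prerequisites, including the QEMU guest agent for Proxmox.
sudo apt-get update && sudo apt-get -y upgrade
sudo apt-get install -y qemu-guest-agent apt-transport-https ca-certificates curl gpg
sudo systemctl enable --now qemu-guest-agent
```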
Run the `disable-swap.sh` script on all nodes. Kubernetes requires swap to be disabled for performance and stability.
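A swap-disable step typically boils down to two commands, sketched here for reference:

```bash
# Turn swap off immediately and comment out swap entries in /etc/fstab so it
# stays disabled across reboots.
sudo swapoff -a
sudo sed -i '/\sswap\s/s/^/#/' /etc/fstab
```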
Run the `configure-kernel.sh` script on all nodes to load the `overlay` and `br_netfilter` modules and set the required `sysctl` parameters for container networking.
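These are the standard kubeadm networking prerequisites; a script of this kind generally applies settings equivalent to the following:

```bash
# Load the required kernel modules now and on every boot.
cat <<'EOF' | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter

# Make bridged traffic visible to iptables and enable IP forwarding.
cat <<'EOF' | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF
sudo sysctl --system
```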
Run the `install-containerd.sh` script on all nodes to install and configure the `containerd` runtime.
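A typical containerd setup for kubeadm looks roughly like the sketch below; the key detail is enabling the systemd cgroup driver so it matches what the kubelet expects:

```bash
# Install containerd, generate its default config, and switch it to the
# systemd cgroup driver used by recent Kubernetes releases.
sudo apt-get install -y containerd
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml >/dev/null
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
sudo systemctl enable containerd
sudo systemctl restart containerd
```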
Run the `install-kube-tools.sh` script on all nodes to install `kubelet`, `kubeadm`, and `kubectl`, and hold them at their current version.
Run the `initialize-master.sh` script on the master node. This will set up the control plane and output a `kubeadm join` command. Copy and save this command.
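The heart of such a script is a `kubeadm init` call along the lines of the sketch below; the pod network CIDR is an assumption and must match whichever CNI plugin you deploy:

```bash
# Bootstrap the control plane and point it at the master's DNS name.
sudo kubeadm init \
  --control-plane-endpoint=k8smaster.svhome.net \
  --pod-network-cidr=10.244.0.0/16

# Give your regular user a kubeconfig, then save the printed "kubeadm join"
# command for the workers.
mkdir -p "$HOME/.kube"
sudo cp /etc/kubernetes/admin.conf "$HOME/.kube/config"
sudo chown "$(id -u):$(id -g)" "$HOME/.kube/config"
```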
Run the `join-worker.sh` script on both worker nodes, passing the saved `kubeadm join` command as an argument to connect them to the cluster.
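The saved command looks like this when run on a worker; the token and CA hash are placeholders that come from your own `kubeadm init` output:

```bash
# Join a worker to the cluster (placeholders, not real values).
sudo kubeadm join k8smaster.svhome.net:6443 \
  --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash>
```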
This part covers the installation of essential services for load balancing, storage, certificate management, and monitoring.
Run the `install-metallb.sh` script to provide network load-balancer functionality for your bare-metal cluster, allowing you to create services of type `LoadBalancer`.
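MetalLB also needs to know which addresses it may hand out. A minimal sketch of an address pool and L2 advertisement follows; the range is an assumption and must be a free block on your 192.168.86.0/24 network (the install script may already create an equivalent):

```bash
# Define the pool of IPs MetalLB can assign and advertise them over L2.
kubectl apply -f - <<'EOF'
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: default-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.168.86.200-192.168.86.220
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: default-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - default-pool
EOF
```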
First, run `install-openebs.sh` to deploy the OpenEBS storage provider. Then, run `create-cstor-pool.sh` to create a storage pool from the extra disks on your worker nodes and set up a default `StorageClass` for persistent volume claims.
Run the `install-metrics-server.sh` script to deploy the Kubernetes Metrics Server, which enables resource monitoring with commands like `kubectl top node`.
Run the `install-cert-manager.sh` script to install the certificate management controller and configure a basic self-signed issuer.
Run the `install-ingress-nginx.sh` script to deploy the NGINX Ingress Controller, which will manage external access to your cluster's HTTP/S services.
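Once the controller has an external IP from MetalLB, HTTP routing is configured through `Ingress` objects. A minimal sketch (the hostname and backend service name are placeholders):

```bash
# Route app.svhome.net to a Service named example-app on port 80 via the
# NGINX ingress class.
kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-app
spec:
  ingressClassName: nginx
  rules:
    - host: app.svhome.net
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-app
                port:
                  number: 80
EOF
```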
For a production-grade PKI, install and configure Vault to act as a private Certificate Authority.
- Run `install-vault.sh` to deploy Vault.
- Follow the manual steps in the guide to initialize, unseal, and configure the PKI engine inside the Vault pod. This is a critical, one-time manual step.
- Run `configure-vault-issuer.sh` to create a `ClusterIssuer` that connects `cert-manager` to Vault (an example certificate request is shown after this list).
- Run `upgrade-vault-ingress.sh` to secure the Vault UI with a certificate from its own PKI.
- Run `install-vault-autounseal.sh` to install the vault-autounseal helper and configure it to automatically unseal the Vault pod.
- Run `run-inside-valut.commands` to initialize and configure the PKI engine inside the Vault pod.
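After the issuer is in place, workloads can request certificates from Vault through standard cert-manager resources. A hedged example follows; the issuer name `vault-issuer` is an assumption and should match whatever `configure-vault-issuer.sh` actually creates:

```bash
# Request a certificate for demo.svhome.net from the Vault-backed issuer and
# check that it becomes Ready.
kubectl apply -f - <<'EOF'
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: demo-tls
spec:
  secretName: demo-tls
  dnsNames:
    - demo.svhome.net
  issuerRef:
    name: vault-issuer
    kind: ClusterIssuer
EOF
kubectl get certificate demo-tls
```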
This part covers the installation of web UIs and terminal tools to make managing the cluster easier.
- Run the `install-dashboard.sh` script to deploy the official Kubernetes Dashboard.
- Run the `create-dashboard-admin.sh` script to create a service account with admin privileges and retrieve a login token (a manual alternative is shown below).
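If you need a fresh login token later, you can mint one directly with `kubectl`; the service-account name `admin-user` and namespace `kubernetes-dashboard` are assumptions based on the common upstream convention and may differ from what the script creates:

```bash
# Generate a short-lived bearer token for the dashboard admin account.
kubectl -n kubernetes-dashboard create token admin-user
```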
- Run `install-k9s.sh` to install the popular terminal-based UI.
- Run `install-helm-dashboard.sh` to deploy a web UI specifically for managing Helm releases.
- Run `install-portainer.sh` to deploy an alternative, operator-friendly web UI.
This part covers setting up a complete observability stack to monitor the health and performance of your cluster and applications.
Run the `install-signoz.sh` script to deploy SigNoz, the all-in-one backend for your metrics, traces, and logs.
Run the `install-otel-collectors.sh` script to deploy OTel collectors as both a DaemonSet and a Deployment to gather telemetry data from across the cluster.
Run the `install-otel-operator.sh` script to enable automatic instrumentation of your applications for distributed tracing.
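Auto-instrumentation is opted into per workload once the operator is running. The sketch below shows the general pattern: an `Instrumentation` resource plus a pod annotation. The collector endpoint, namespace, and deployment name are assumptions to adapt to your setup:

```bash
# Create an Instrumentation resource pointing at the in-cluster collector.
kubectl apply -f - <<'EOF'
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: default-instrumentation
spec:
  exporter:
    endpoint: http://otel-collector.observability.svc.cluster.local:4317
EOF

# Opt a Java deployment into auto-instrumentation via a pod-template annotation.
kubectl patch deployment my-java-app --type merge -p \
  '{"spec":{"template":{"metadata":{"annotations":{"instrumentation.opentelemetry.io/inject-java":"true"}}}}}'
```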
Run the `install-otel-demo.sh` script to deploy a sample microservices application. This will generate data you can explore in the SigNoz UI to verify your observability stack is working correctly.
This part provides a detailed guide on how to perform a rolling upgrade of your Kubernetes cluster, ensuring minimal downtime and a smooth transition to newer versions. This process involves upgrading components on the master node first, followed by each worker node sequentially.
Before starting the upgrade, ensure that the APT repositories for the target Kubernetes version are configured on all nodes. This allows `apt` to find the necessary packages.
Run the `setup-k8s-repos.sh` script on all master and worker nodes.
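Conceptually, this means pointing the apt source at the target minor release; for example, moving to 1.29 (the version here is illustrative) looks like:

```bash
# Switch the Kubernetes apt repository to the target minor version and refresh
# the package index on every node.
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.29/deb/ /' |
  sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update
```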
Upgrade the control plane components on the master node. This process will iteratively upgrade `kubeadm`, then apply the cluster upgrade, and finally upgrade `kubelet` and `kubectl`.
Run the `upgrade-master-node.sh` script on your master node (`k8smaster.svhome.net`).
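For orientation, the sequence such a script wraps is the standard kubeadm control-plane upgrade, sketched below with 1.29.0 as a stand-in target version:

```bash
# Upgrade kubeadm first, plan and apply the control-plane upgrade, then
# upgrade kubelet/kubectl and restart the kubelet.
sudo apt-mark unhold kubeadm
sudo apt-get install -y kubeadm='1.29.0-*'
sudo apt-mark hold kubeadm

sudo kubeadm upgrade plan
sudo kubeadm upgrade apply v1.29.0

sudo apt-mark unhold kubelet kubectl
sudo apt-get install -y kubelet='1.29.0-*' kubectl='1.29.0-*'
sudo apt-mark hold kubelet kubectl
sudo systemctl daemon-reload
sudo systemctl restart kubelet
```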
Upgrade each worker node one by one to maintain cluster availability. For each worker node:
- Drain the Worker Node (Master Node): Remove all running pods from the worker node to prepare it for maintenance. Run `manage-worker-drain-uncordon.sh drain k8sworker01.svhome.net` (replace `k8sworker01.svhome.net` with the actual worker hostname). The underlying `kubectl` commands are sketched after this section.
- Upgrade Worker Components (Worker Node): Install the new versions of `kubelet` and `kubectl` on the drained worker node. SSH into the drained worker node (e.g., `k8sworker01.svhome.net`) and run the `upgrade-worker-node.sh k8sworker01.svhome.net` script.
- Uncordon the Worker Node (Master Node): Mark the worker node as schedulable again, allowing new pods to be placed on it. Run `manage-worker-drain-uncordon.sh uncordon k8sworker01.svhome.net` (replace `k8sworker01.svhome.net` with the actual worker hostname).
Repeat this process for each worker node in your cluster (e.g., `k8sworker02.svhome.net`, `k8sworker03.svhome.net`, etc.).
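Under the hood, the drain/uncordon helper is wrapping `kubectl` against the node, roughly as follows (the exact flags used by the actual script may differ):

```bash
# Evict workloads from the node before maintenance...
kubectl drain k8sworker01.svhome.net --ignore-daemonsets --delete-emptydir-data

# ...and allow scheduling on it again once the worker has been upgraded.
kubectl uncordon k8sworker01.svhome.net
```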