This guide walks through setting up a single-node Kubernetes cluster on Debian 13 (trixie) using kubeadm, containerd, and Cilium CNI. The setup is purpose-built for running GPU workloads on a single physical server -- no cloud, no multi-node complexity.
| Component | Version / Detail |
|---|---|
| OS | Debian 13 (trixie) |
| Kubernetes | v1.35.0 |
| kubectl | v1.33.5 |
| containerd | 1.7.28 |
| CNI | Cilium 1.18.5 |
| Node name | zosmaai |
This setup is based on max-pfeiffer's blog guide for kubeadm on Debian, with several custom tweaks for single-node GPU workloads -- most notably keeping swap enabled and replacing kube-proxy with Cilium.
A Debian 13 (trixie) server with:
- Root or sudo access
- A static IP or stable DHCP lease
- Internet access for pulling packages and container images
Kubernetes needs a container runtime. We use containerd 1.7.28, installed from the Docker apt repository.
```shell
# Add Docker's official GPG key and repository
sudo apt-get update
sudo apt-get install -y ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/debian/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg

echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/debian \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

sudo apt-get update
sudo apt-get install -y containerd.io
```

Generate the default config and enable SystemdCgroup:
```shell
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml
```

Edit /etc/containerd/config.toml to set SystemdCgroup = true under the runc options. This is required for kubeadm to work correctly with systemd as the init system. The full config is stored in this repository at system/containerd/config.toml.
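After generating the default config, the relevant section should look like this (containerd 1.7 config schema; only the SystemdCgroup line needs changing):

```toml
# Fragment of /etc/containerd/config.toml -- runc configured to use
# systemd-managed cgroups, as the kubelet expects on a systemd host
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  runtime_type = "io.containerd.runc.v2"
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    SystemdCgroup = true
```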
```shell
sudo systemctl restart containerd
sudo systemctl enable containerd
```

Kubernetes networking requires specific kernel modules and sysctl parameters.
```shell
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF

sudo modprobe overlay
sudo modprobe br_netfilter
```

Create /etc/sysctl.d/k8s.conf to enable IP forwarding and bridge netfilter:
```
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
```

Apply immediately:

```shell
sudo sysctl --system
```

The actual config file is stored at system/sysctl.d/k8s.conf.
Add the Kubernetes apt repository and install the components:
```shell
sudo apt-get install -y apt-transport-https
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.35/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.35/deb/ /' | \
  sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
```

Most Kubernetes guides tell you to disable swap. We do not. This server has only 16 GB of physical RAM but runs large model training workloads that can spike memory usage unpredictably. A 32 GB btrfs swapfile acts as a safety net.
Kubernetes has supported swap since v1.28 (beta). We pass --ignore-preflight-errors=Swap to kubeadm and let the kubelet handle it. Later, in Known Issues, we discuss tuning vm.swappiness to prevent swap thrashing from freezing the system.
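One way to make the swap policy explicit, rather than relying only on the preflight override, is a KubeletConfiguration. This is a sketch, not the exact config used on this node; LimitedSwap is the beta behavior behind the NodeSwap feature:

```yaml
# KubeletConfiguration fragment (sketch) -- explicit swap handling.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
failSwapOn: false             # do not refuse to start just because swap is on
memorySwap:
  swapBehavior: LimitedSwap   # Burstable pods may use swap, within limits
```

Such a document can be passed to kubeadm init via --config, alongside the InitConfiguration.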
We skip the default kube-proxy installation because Cilium will replace it entirely:
```shell
sudo kubeadm init \
  --node-name zosmaai \
  --skip-phases=addon/kube-proxy \
  --ignore-preflight-errors=Swap
```

The --skip-phases=addon/kube-proxy flag is critical. Cilium operates as a full kube-proxy replacement using eBPF, and running both causes routing conflicts.
```shell
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
```

By default, kubeadm taints the control-plane node so that no workload pods can be scheduled on it. On a multi-node cluster this makes sense -- you want the control plane dedicated to cluster management. On a single-node cluster, this taint means nothing can run at all.
Remove it:
```shell
kubectl taint nodes zosmaai node-role.kubernetes.io/control-plane:NoSchedule-
```

Verify the taint is gone:
```shell
kubectl describe node zosmaai | grep -i taint
# Should show: Taints: <none>
```

Cilium is a high-performance CNI that uses eBPF for networking, observability, and security. We use it as a complete replacement for kube-proxy.
```shell
CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt)
CLI_ARCH=amd64
curl -L --fail --remote-name-all \
  https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-${CLI_ARCH}.tar.gz
sudo tar xzvfC cilium-linux-${CLI_ARCH}.tar.gz /usr/local/bin
rm cilium-linux-${CLI_ARCH}.tar.gz
```

Install Cilium itself via Helm:

```shell
helm repo add cilium https://helm.cilium.io/
helm repo update
helm install cilium cilium/cilium --version 1.18.5 \
  --namespace kube-system \
  --set kubeProxyReplacement=true \
  --set k8sServiceHost=<API_SERVER_IP> \
  --set k8sServicePort=6443
```

The kubeProxyReplacement=true flag tells Cilium to handle all service routing via eBPF, replacing kube-proxy entirely.
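The same settings can also be kept in a Helm values file, which is easier to track in a repository. A sketch, with the host as a placeholder for your API server IP:

```yaml
# values.yaml -- sketch of the Cilium settings used above
kubeProxyReplacement: true
k8sServiceHost: 192.0.2.10   # placeholder: replace with your node's IP
k8sServicePort: 6443
```

Then install with: helm install cilium cilium/cilium --version 1.18.5 --namespace kube-system -f values.yaml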
Cilium requires rp_filter (reverse path filtering) to be disabled on its interfaces. Without this, the kernel drops packets that Cilium legitimately routes through its virtual interfaces.
Create /etc/sysctl.d/99-zzz-override_cilium.conf:
```
# Disable rp_filter on Cilium interfaces since it may cause mangled packets to be dropped
net.ipv4.conf.lxc*.rp_filter = 0
net.ipv4.conf.cilium_*.rp_filter = 0

# The kernel uses max(conf.all, conf.{dev}) as its value, so we need to set .all to 0 as well.
# Otherwise it will overrule the device-specific settings.
net.ipv4.conf.all.rp_filter = 0
```

The 99-zzz- prefix ensures this file is loaded last, overriding any earlier sysctl configs. The actual config is at system/sysctl.d/99-zzz-override_cilium.conf.
```shell
sudo sysctl --system
```

Check that Cilium is healthy:

```shell
cilium status
```

You should see all components reporting OK. You can also run the connectivity test:
```shell
cilium connectivity test
```

At this point, the single-node cluster should be fully operational:
```shell
kubectl get nodes
```

Expected output:

```
NAME      STATUS   ROLES           AGE   VERSION
zosmaai   Ready    control-plane   ...   v1.35.0
```

Check that all system pods are running:

```shell
kubectl get pods -n kube-system
```

You should see the Cilium agent, Cilium operator, CoreDNS, etcd, kube-apiserver, kube-controller-manager, and kube-scheduler all in Running state. There should be no kube-proxy pod since we skipped that phase.
All system configuration files are stored in the system/ directory:
| File | Purpose |
|---|---|
| system/sysctl.d/k8s.conf | IP forwarding and bridge netfilter for Kubernetes |
| system/sysctl.d/99-zzz-override_cilium.conf | Disable rp_filter for Cilium |
| system/containerd/config.toml | containerd configuration with SystemdCgroup and NVIDIA runtime |
Why single-node? This is a learning lab. Single-node eliminates networking complexity, storage replication, and node scheduling concerns. It lets us focus on the GPU workload side. Multi-node is on the roadmap.
Why Cilium over kube-proxy? eBPF-based networking is more efficient and provides better observability. For GPU workloads where we want low overhead, Cilium is a good fit. It also means one fewer component (kube-proxy) to manage.
Why keep swap? With 16 GB RAM and models that can spike to 12+ GB during loading, swap is a safety net. The tradeoff is that swap thrashing can freeze the system (see Known Issues), but with vm.swappiness=1, the kernel prefers OOM-killing over thrashing.
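The swappiness tuning mentioned above can be persisted as another sysctl drop-in. The file name here is an example, not necessarily the one used in this repository:

```
# /etc/sysctl.d/99-swappiness.conf (example path)
# Prefer reclaiming page cache over swapping; swap only under real memory pressure
vm.swappiness = 1
```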
Why kubeadm over k3s or microk8s? kubeadm produces a standard upstream Kubernetes cluster. What you learn here transfers directly to production environments. k3s and microk8s are excellent tools, but they abstract away details that are valuable to understand.
With the cluster running, the next step is NVIDIA GPU Setup to make the GPUs available to Kubernetes workloads.