
Development Cluster

We use two types of development clusters: the digital twin development cluster and the Kind cluster.

The first type uses Vagrant virtual machines to instantiate all the machines and to test OS images and networking configurations. We call it the digital twin development cluster because it mirrors the actual physical setup of the cluster.

The second type, the Kind cluster, provides a ready-to-go Kubernetes (K8s) cluster for testing deployments and K8s-only configuration.

Digital twin development cluster

The digital twin development cluster makes it easier to develop and test NIployments-revamp. It uses Vagrant to standardize the deployment of VMs across platforms and providers.

How to run the cluster

  1. Install Vagrant
  2. Install a virtualization provider. You can use VirtualBox (Windows/Linux/macOS) or libvirt (Linux only, but more performant than VirtualBox).
  3. Edit dev-cluster.yaml to suit your PC configuration. You can adjust the RAM for each node, the number of nodes, and the networking (important if you are using libvirt); see the sketch after this list.
  4. Run the cluster with vagrant up (grab a coffee; the first run can take a while). To stop it, run vagrant halt.
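
The exact schema of dev-cluster.yaml lives in the repo; purely as a hypothetical sketch of the knobs described in step 3 (node count, RAM, networking), it might look like this:

# Hypothetical sketch only; check dev-cluster.yaml in the repo for the real key names.
nodes: 3            # number of cluster VMs (cluster1..cluster3)
memory: 2048        # RAM per node, in MB
network:
  # libvirt users may need a subnet that does not clash with existing host bridges
  subnet: 192.168.56.0/24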

You can SSH into each node in the cluster, including the router, with vagrant ssh cluster<n>, where n is the node number (so with 3 nodes you can access cluster1, cluster2, and cluster3). To access the router, run vagrant ssh router.
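
For example, with a three-node layout:

vagrant ssh cluster1   # first cluster node
vagrant ssh router     # the router VM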

To delete the cluster, run vagrant destroy -f.

To run the deploy-playbook.yaml playbook you need the jmespath Python package:

pip install jmespath
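
jmespath backs Ansible's json_query filter, so the playbook is presumably run with ansible-playbook; the inventory path below is an assumption, since only the playbook name appears on this page:

ansible-playbook deploy-playbook.yaml -i inventory   # inventory path is hypothetical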

Changing development cluster configuration

If you wish to change the default development cluster configuration, copy dev-cluster.yaml to local-dev-cluster.yaml and make your changes there.
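
In shell terms:

cp dev-cluster.yaml local-dev-cluster.yaml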

Network cluster topology

Vagrant requires that eth0 on the guest VM is configured for NAT, since it uses SSH for provisioning and for direct access to the VM itself.

[Diagram: network-dev-topology (draw.io)]

Right now, the router runs Debian 10 and the nodes run Ubuntu 22.04 (this is temporary, until we decide on a suitable OS for the cluster).

Development

All development and customization of the development cluster is done in the Vagrantfile, which is written in Ruby.

Kind development cluster

Caution

There is a script, setup-kind-cluster.sh, stored under dev, which automates this whole process. Follow the instructions below if you wish to understand each step.

MacOS

Important

You'll need docker-mac-net-connect installed and running to access the exposed services from your local machine.
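
If you use Homebrew, the project's README documents installing and starting it from the chipmk tap; double-check against the upstream instructions:

brew install chipmk/tap/docker-mac-net-connect
sudo brew services start chipmk/tap/docker-mac-net-connect   # keeps the bridge running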

You need to have the following packages installed (a Homebrew sketch follows the list):

  • kind
  • helm
  • docker
  • cilium
  • kubectl
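
Assuming Homebrew, one way to install these is sketched below; note that the cilium CLI ships as the cilium-cli formula and Docker Desktop as a cask:

brew install kind helm kubectl cilium-cli
brew install --cask docker   # Docker Desktop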

Run the dev script

./dev/setup-kind-cluster.sh

Important

The config file allows the kubelet (running containerd) inside the containerised kind nodes to pull from registries with unverified TLS certificates. See this StackOverflow post for more detail.
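
For reference, the knob kind exposes for this is containerdConfigPatches in the cluster config; the snippet below is a sketch with a hypothetical registry hostname, not a copy of the repo's actual config file:

# Sketch of a kind cluster config; the registry hostname is hypothetical.
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
containerdConfigPatches:
- |-
  [plugins."io.containerd.grpc.v1.cri".registry.configs."registry.example.internal".tls]
    insecure_skip_verify = true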

Install MetalLB

Despite not being used in the actual cluster, we use MetalLB to simplify networking in the Kind development cluster, making services accessible both from the local machine and from the containers acting as K8s nodes.

Warning

More specifically, skipping this step will leave services without an assigned external IP address, making them unreachable from your computer.

kubectl apply -f https://github.com/metallb/metallb/raw/main/config/manifests/metallb-native.yaml

kubectl wait --namespace metallb-system \
                --for=condition=ready pod \
                --selector=app=metallb \
                --timeout=120s

echo "\
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: example
  namespace: metallb-system
spec:
  addresses:
  - 172.28.255.200-172.28.255.250
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: empty
  namespace: metallb-system" | \
kubectl apply -f -
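
The address pool must sit inside the subnet of the Docker network named kind (the range above assumes 172.28.0.0/16; the default is often 172.18.0.0/16). You can check the subnet and then confirm that LoadBalancer services pick up an address from the pool:

docker network inspect kind -f '{{range .IPAM.Config}}{{.Subnet}} {{end}}'   # e.g. 172.28.0.0/16
kubectl get svc -A -o wide   # EXTERNAL-IP on LoadBalancer services should come from the pool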

And from here, you should be ready to install your services. You might find it useful to add an image registry; check the wiki page about that.

Troubleshooting

This section collects known problems people have hit when running the dev cluster, and their solutions, in case you need them.

1. docker exec --privileged niployments-test-cluster-worker kubeadm join --config /kind/kubeadm.conf

The error that appears is the following:

ERROR: failed to create cluster: failed to join node with kubeadm: command 
"docker exec --privileged niployments-test-cluster-worker kubeadm join --config /kind/kubeadm.conf --skip-phases=preflight --v=6" 
failed with error: exit status 1

This may be caused by the node failing to start cAdvisor. Try the following commands to see if that fixes it:

sudo sysctl fs.inotify.max_user_watches=524288
sudo sysctl fs.inotify.max_user_instances=512

If that works, consider making the change permanent by writing it to /etc/sysctl.conf on your PC.
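
For example:

echo "fs.inotify.max_user_watches=524288" | sudo tee -a /etc/sysctl.conf
echo "fs.inotify.max_user_instances=512" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p   # reload the settings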
