# Keepalived Load Balancer for KubeV Control Plane

This Terraform module deploys keepalived on KubeV control plane nodes to provide a Virtual IP (VIP) for Kubernetes API high availability.

## Overview

Keepalived uses VRRP (Virtual Router Redundancy Protocol) to manage a floating Virtual IP address across multiple control plane nodes. When the master node fails, the VIP automatically moves to a backup node, ensuring continuous API server availability.
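
Concretely, keepalived is driven by a small VRRP configuration on each node. A sketch of what the MASTER's rendered config might look like (the values mirror the example below; the exact template this module renders is an assumption):

```
# /etc/keepalived/keepalived.conf (illustrative sketch, not the module's exact output)
vrrp_instance VI_1 {
    state MASTER            # BACKUP on the other control plane nodes
    interface ens192        # vrrp_interface
    virtual_router_id 42    # vrrp_router_id
    priority 101            # 100 on BACKUP nodes
    advert_int 1            # VRRP advertisement interval, in seconds
    virtual_ipaddress {
        10.0.2.100          # api_vip
    }
}
```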

## Use Cases

This module is recommended for:
- **On-premise / bare-metal deployments** where cloud load balancers are not available
- **VMware vSphere** environments
- **Private data centers** with direct L2 network connectivity

> **Note**: For GCP test deployments, use the [GCP Internal Load Balancer](../../infra-machines/gce/README.md) instead, as keepalived VIPs are not easily accessible from outside the VPC and GCP does not support multicast.

## Prerequisites

- Control plane nodes must be provisioned and accessible via SSH
- All control plane nodes must be on the same L2 network segment
- The VIP address must be unused and within the same subnet as the control plane nodes
- SSH key-based authentication must be configured

## Usage

### 1. Create terraform.tfvars

```hcl
cluster_name = "kubev-cluster"

# Virtual IP for the Kubernetes API
api_vip = "10.0.2.100"

# Control plane node IPs
control_plane_hosts = [
  "10.0.2.10",
  "10.0.2.11",
  "10.0.2.12"
]

# SSH configuration
ssh_username = "ubuntu"
ssh_private_key_file = "~/.ssh/id_rsa"

# Network interface for VRRP (check with: ip link show)
vrrp_interface = "ens192"

# VRRP router ID (must be unique in your network, 1-255)
vrrp_router_id = 42

# Optional: Bastion host for SSH jump
# bastion_host = "bastion.example.com"
# bastion_port = 22
# bastion_username = "ubuntu"
```
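
As an alternative to a root-module terraform.tfvars, the same settings can be passed from a parent configuration. A sketch, assuming a hypothetical module path:

```hcl
module "keepalived" {
  source = "./modules/keepalived"   # hypothetical path to this module

  cluster_name         = "kubev-cluster"
  api_vip              = "10.0.2.100"
  control_plane_hosts  = ["10.0.2.10", "10.0.2.11", "10.0.2.12"]
  ssh_username         = "ubuntu"
  ssh_private_key_file = "~/.ssh/id_rsa"
  vrrp_interface       = "ens192"
  vrrp_router_id       = 42
}
```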

### 2. Initialize and Apply

```bash
terraform init
terraform plan
terraform apply
```

### 3. Verify Installation

SSH into any control plane node and check:

```bash
# Check keepalived status
sudo systemctl status keepalived

# Check which node holds the VIP
ip addr show | grep <VIP>

# Test API server via VIP
curl -k https://<VIP>:6443/healthz
```

## Configuration Variables

| Variable | Description | Default | Required |
|----------|-------------|---------|:--------:|
| `cluster_name` | Name of the cluster | - | yes |
| `api_vip` | Virtual IP address for Kubernetes API | - | yes |
| `control_plane_hosts` | List of control plane node IPs | - | yes |
| `ssh_username` | SSH user for provisioning | `root` | no |
| `ssh_private_key_file` | Path to SSH private key | - | yes |
| `vrrp_interface` | Network interface for VRRP | `ens192` | no |
| `vrrp_router_id` | VRRP router ID (1-255, must be unique) | `42` | no |
| `bastion_host` | SSH bastion/jump host | `""` | no |
| `bastion_port` | SSH bastion port | `22` | no |
| `bastion_username` | SSH bastion user | `""` | no |

## Outputs

| Output | Description |
|--------|-------------|
| `kubev_api` | API endpoint configuration with VIP |
| `kubev_hosts` | Control plane host information |

## How It Works

1. **Installation**: The module installs keepalived on all control plane nodes
2. **Configuration**: Each node is configured with VRRP:
   - The first node in `control_plane_hosts` becomes MASTER (priority 101)
   - The remaining nodes become BACKUP (priority 100)
3. **Health Check**: A script checks the local API server every 3 seconds
4. **Failover**: If the master fails the health check, the VIP moves to a backup node
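
The health check in step 3 is typically wired into keepalived via a `vrrp_script` block. A sketch of what this looks like (the script path, thresholds, and curl command are assumptions, not this module's literal files):

```
# keepalived.conf fragment (illustrative, not the module's exact template)
vrrp_script check_apiserver {
    # Assumed helper script; typically runs something like:
    #   curl --silent --fail --insecure https://127.0.0.1:6443/healthz
    script "/etc/keepalived/check_apiserver.sh"
    interval 3    # check every 3 seconds, matching the interval above
    fall 2        # declare failure after 2 consecutive failed checks
    rise 2        # declare recovery after 2 consecutive successful checks
}

vrrp_instance VI_1 {
    # ...instance settings (state, interface, priority, VIP)...
    track_script {
        check_apiserver
    }
}
```

When the tracked script fails on the MASTER, keepalived lowers its effective priority (or faults the instance), so a BACKUP node wins the next VRRP election and takes over the VIP.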

## Architecture

```
            ┌─────────────────┐
            │  VIP: api_vip   │
            │  (Floating IP)  │
            └────────┬────────┘
                     │
   ┌─────────────────┼─────────────────┐
   │                 │                 │
   ▼                 ▼                 ▼
┌──────────────┐  ┌──────────────┐  ┌──────────────┐
│  CP Node 1   │  │  CP Node 2   │  │  CP Node 3   │
│  (MASTER)    │  │  (BACKUP)    │  │  (BACKUP)    │
│  Priority:101│  │  Priority:100│  │  Priority:100│
│ kube-api:6443│  │ kube-api:6443│  │ kube-api:6443│
└──────────────┘  └──────────────┘  └──────────────┘
```

## Troubleshooting

### Check keepalived logs
```bash
sudo journalctl -u keepalived -f
```

### Verify VRRP communication
```bash
sudo tcpdump -i <interface> vrrp
```

### Manual failover test
```bash
# On the master node, stop keepalived
sudo systemctl stop keepalived

# Check that the VIP moved to another node
ip addr show | grep <VIP>

# Restart keepalived when done; with default preemption,
# the VIP fails back to the highest-priority node
sudo systemctl start keepalived
```

### Common issues

1. **VIP not assigned**: Check that the firewall allows the VRRP protocol (IP protocol 112)
2. **Split-brain (multiple masters)**: Ensure all nodes are on the same L2 segment and can exchange VRRP advertisements (multicast group 224.0.0.18 by default)
3. **Health check failing**: Verify the API server is listening on localhost:6443
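
For issue 1, VRRP traffic can be allowed explicitly at the firewall. An iptables example (illustrative; adapt to your firewall tooling):

```bash
# Allow the VRRP protocol (IP protocol number 112) on each node
sudo iptables -A INPUT -p 112 -j ACCEPT
# If keepalived runs in multicast mode (the default), allow the VRRP group too
sudo iptables -A INPUT -d 224.0.0.18/32 -j ACCEPT
```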

## Integration with KubeV

When using this module with KubeV, configure the cluster to use the VIP as the API endpoint:

```yaml
# kubev.yaml
apiVersion: kubermatic.k8c.io/v1
kind: KubevCluster
spec:
  controlPlaneEndpoint:
    host: "<api_vip>"
    port: 6443
```
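
Once the cluster is up, a quick way to confirm API traffic really goes through the VIP (these commands assume a working kubeconfig for the cluster):

```bash
# The reported server URL should contain the VIP
kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'

# Exercise the API server health endpoint through that endpoint
kubectl get --raw /healthz
```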