| 1 | +# Upgrade Guide |
| 2 | + |
| 3 | +This document provides guidance for upgrading Terraform Turing Pi modules and the underlying Kubernetes components. |
| 4 | + |
| 5 | +## Table of Contents |
| 6 | + |
| 7 | +- [Module Version Upgrades](#module-version-upgrades) |
| 8 | +- [K3s Version Upgrades](#k3s-version-upgrades) |
| 9 | +- [Addon Upgrades](#addon-upgrades) |
| 10 | +- [Breaking Changes](#breaking-changes) |
| 11 | + |
| 12 | +## Module Version Upgrades |
| 13 | + |
| 14 | +### Upgrading Module References |
| 15 | + |
| 16 | +When upgrading to a new module version, update the `ref` tag in your module source: |
| 17 | + |
| 18 | +```hcl |
| 19 | +# Before |
| 20 | +module "k3s_cluster" { |
| 21 | + source = "github.com/jfreed-dev/terraform-turingpi-modules//modules/k3s-cluster?ref=v1.3.3" |
| 22 | + # ... |
| 23 | +} |
| 24 | + |
| 25 | +# After |
| 26 | +module "k3s_cluster" { |
| 27 | + source = "github.com/jfreed-dev/terraform-turingpi-modules//modules/k3s-cluster?ref=v1.3.5" |
| 28 | + # ... |
| 29 | +} |
| 30 | +``` |
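If several `.tf` files pin the same release, the `ref` bump shown above can be scripted. This is a minimal sketch, assuming every module source uses the `?ref=vX.Y.Z"` form; `bump_ref` is a hypothetical helper name and is not part of the modules.

```shell
# bump_ref OLD NEW: rewrite ?ref=OLD" to ?ref=NEW" on stdin (hypothetical helper).
# The trailing quote is part of the match so v1.3.3 does not also match v1.3.30.
bump_ref() {
  old=$(printf '%s' "$1" | sed 's/\./\\./g')   # escape dots so they match literally
  sed "s/?ref=${old}\"/?ref=$2\"/g"
}

# Example: preview the rewritten line without touching any file.
echo 'source = "github.com/jfreed-dev/terraform-turingpi-modules//modules/k3s-cluster?ref=v1.3.3"' \
  | bump_ref v1.3.3 v1.3.5
```

Because the helper only filters stdin, you can review its output before writing anything back to disk.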
| 31 | + |
| 32 | +### Steps for Module Upgrade |
| 33 | + |
| 34 | +1. **Review the changelog** for breaking changes |
| 35 | +2. **Update module source** to the new version |
| 36 | +3. **Run `terraform init -upgrade`** to fetch the new module version |
| 37 | +4. **Run `terraform plan`** to preview changes |
| 38 | +5. **Apply changes** with `terraform apply` |
| 39 | + |
| 40 | +```bash |
| 41 | +terraform init -upgrade |
| 42 | +terraform plan |
| 43 | +terraform apply |
| 44 | +``` |
| 45 | + |
| 46 | +## K3s Version Upgrades |
| 47 | + |
| 48 | +### Automatic K3s Upgrades |
| 49 | + |
| 50 | +The k3s-cluster module supports automatic version upgrades via the `k3s_version` variable: |
| 51 | + |
| 52 | +```hcl |
| 53 | +module "k3s_cluster" { |
| 54 | + source = "github.com/jfreed-dev/terraform-turingpi-modules//modules/k3s-cluster?ref=v1.3.5" |
| 55 | + |
| 56 | + k3s_version = "v1.31.4+k3s1" # Update to new version |
| 57 | + # ... |
| 58 | +} |
| 59 | +``` |
| 60 | + |
| 61 | +### Manual K3s Upgrade Process |
| 62 | + |
| 63 | +For more control over the upgrade process: |
| 64 | + |
| 65 | +1. **Back up etcd/cluster state** (if applicable) |
| 66 | +2. **Drain control plane nodes** (optional but recommended) |
| 67 | +3. **Upgrade control plane first** |
| 68 | +4. **Upgrade worker nodes one at a time** |
| 69 | +5. **Verify cluster health** |
| 70 | + |
| 71 | +```bash |
| 72 | +# On control plane |
| 73 | +curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.31.4+k3s1 sh -s - server |
| 74 | + |
| 75 | +# On each worker |
| 76 | +curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.31.4+k3s1 K3S_URL=https://<server>:6443 K3S_TOKEN=<token> sh -s - agent |
| 77 | +``` |
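Kubernetes upgrades are supported one minor version at a time, so it can help to compare the running and target versions before upgrading the control plane. This is a minimal sketch, assuming version strings in the `vMAJOR.MINOR.PATCH+k3sN` form used above; `k3s_minor` and `check_minor_skip` are hypothetical helper names.

```shell
# k3s_minor VERSION: print the minor number of a K3s version such as v1.31.4+k3s1.
k3s_minor() {
  printf '%s\n' "$1" | sed 's/^v[0-9]*\.\([0-9]*\)\..*$/\1/'
}

# check_minor_skip CURRENT TARGET: fail if the upgrade would skip a minor version.
check_minor_skip() {
  cur=$(k3s_minor "$1")
  tgt=$(k3s_minor "$2")
  if [ $((tgt - cur)) -gt 1 ]; then
    echo "refusing: $1 -> $2 skips a minor version" >&2
    return 1
  fi
  echo "ok: $1 -> $2"
}

check_minor_skip v1.30.6+k3s1 v1.31.4+k3s1
```

On a live cluster, one way to read the current version is `kubectl get nodes -o jsonpath='{.items[0].status.nodeInfo.kubeletVersion}'`.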
| 78 | + |
| 79 | +### Version Compatibility Matrix |
| 80 | + |
| 81 | +| K3s Version | Kubernetes Version | Notes | |
| 82 | +|-------------|-------------------|-------| |
| 83 | +| v1.31.x | 1.31.x | Current stable | |
| 84 | +| v1.30.x | 1.30.x | Previous stable | |
| 85 | +| v1.29.x | 1.29.x | Maintenance | |
| 86 | + |
| 87 | +## Addon Upgrades |
| 88 | + |
| 89 | +### MetalLB |
| 90 | + |
| 91 | +```hcl |
| 92 | +module "metallb" { |
| 93 | + source = "github.com/jfreed-dev/terraform-turingpi-modules//modules/addons/metallb?ref=v1.3.5" |
| 94 | + |
| 95 | + chart_version = "0.14.9" # Update chart version |
| 96 | + ip_range = "10.10.88.80-10.10.88.89" |
| 97 | +} |
| 98 | +``` |
| 99 | + |
| 100 | +**Upgrade considerations:** |
| 101 | +- MetalLB upgrades are generally non-disruptive |
| 102 | +- CRD updates may be required for major versions |
| 103 | +- Test in a staging environment first |
| 104 | + |
| 105 | +### Ingress-NGINX |
| 106 | + |
| 107 | +```hcl |
| 108 | +module "ingress_nginx" { |
| 109 | + source = "github.com/jfreed-dev/terraform-turingpi-modules//modules/addons/ingress-nginx?ref=v1.3.5" |
| 110 | + |
| 111 | + chart_version = "4.11.3" # Update chart version |
| 112 | +} |
| 113 | +``` |
| 114 | + |
| 115 | +**Upgrade considerations:** |
| 116 | +- May cause brief service interruption during controller pod restart |
| 117 | +- Review ingress class changes between versions |
| 118 | +- Test TLS certificate handling after upgrade |
| 119 | + |
| 120 | +### Longhorn |
| 121 | + |
| 122 | +```hcl |
| 123 | +module "longhorn" { |
| 124 | + source = "github.com/jfreed-dev/terraform-turingpi-modules//modules/addons/longhorn?ref=v1.3.5" |
| 125 | + |
| 126 | + chart_version = "1.7.2" # Update chart version |
| 127 | +} |
| 128 | +``` |
| 129 | + |
| 130 | +**Upgrade considerations:** |
| 131 | +- **Always back up data before upgrading** |
| 132 | +- Check engine compatibility matrix |
| 133 | +- Upgrade manager first, then engines |
| 134 | +- Allow time for volume migrations |
| 135 | + |
| 136 | +### Monitoring (kube-prometheus-stack) |
| 137 | + |
| 138 | +```hcl |
| 139 | +module "monitoring" { |
| 140 | + source = "github.com/jfreed-dev/terraform-turingpi-modules//modules/addons/monitoring?ref=v1.3.5" |
| 141 | + |
| 142 | + chart_version = "65.8.1" # Update chart version |
| 143 | + grafana_admin_password = var.grafana_password |
| 144 | +} |
| 145 | +``` |
| 146 | + |
| 147 | +**Upgrade considerations:** |
| 148 | +- Verify that Prometheus data is retained across the upgrade |
| 149 | +- Check Grafana dashboard compatibility with the new chart version |
| 150 | +- Migrate alert rules if their format has changed |
| 151 | + |
| 152 | +### cert-manager |
| 153 | + |
| 154 | +```hcl |
| 155 | +module "cert_manager" { |
| 156 | + source = "github.com/jfreed-dev/terraform-turingpi-modules//modules/addons/cert-manager?ref=v1.3.5" |
| 157 | + |
| 158 | + chart_version = "1.16.2" # Update chart version |
| 159 | +} |
| 160 | +``` |
| 161 | + |
| 162 | +**Upgrade considerations:** |
| 163 | +- CRD updates may be required |
| 164 | +- Certificate renewal processes continue during upgrade |
| 165 | +- Test ACME challenges after upgrade |
| 166 | + |
| 167 | +## Breaking Changes |
| 168 | + |
| 169 | +### v1.3.5 |
| 170 | + |
| 171 | +**New features:** |
| 172 | +- Added `namespace` variable to all addon modules |
| 173 | +- Added `controller_resources` and `speaker_resources` to MetalLB |
| 174 | +- Added `controller_replicas` and resource configuration to ingress-nginx |
| 175 | +- Added `manager_resources` and `ui_replicas` to Longhorn |
| 176 | +- Added Grafana password validation (minimum 8 characters) |
| 177 | +- New cert-manager addon module |
| 178 | + |
| 179 | +**Migration steps:** |
| 180 | +1. If you were using the default namespaces, no changes are required |
| 181 | +2. To use custom namespaces, add the `namespace` variable: |
| 182 | + |
| 183 | +```hcl |
| 184 | +module "metallb" { |
| 185 | + source = "..." |
| 186 | + |
| 187 | + namespace = "custom-metallb-namespace" # New optional parameter |
| 188 | + ip_range = "10.10.88.80-10.10.88.89" |
| 189 | +} |
| 190 | +``` |
| 191 | + |
| 192 | +3. For the monitoring module, ensure the Grafana password is at least 8 characters: |
| 193 | + |
| 194 | +```hcl |
| 195 | +module "monitoring" { |
| 196 | + source = "..." |
| 197 | + |
| 198 | + grafana_admin_password = "secure-password-here" # Must be >= 8 chars |
| 199 | +} |
| 200 | +``` |
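The length rule can also be checked before `terraform plan` runs. This is a minimal sketch, assuming the password is supplied through a `TF_VAR_grafana_password` environment variable, which applies only if your root module declares a `grafana_password` variable; `check_password_length` is a hypothetical helper name.

```shell
# check_password_length VALUE: succeed only if VALUE has at least 8 characters.
check_password_length() {
  [ "${#1}" -ge 8 ]
}

if check_password_length "${TF_VAR_grafana_password:-}"; then
  echo "Grafana password length OK"
else
  echo "Grafana password must be at least 8 characters" >&2
fi
```

This only mirrors the module's minimum-length validation; it does not judge password strength.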
| 201 | + |
| 202 | +### v1.3.4 |
| 203 | + |
| 204 | +**Changes:** |
| 205 | +- Synchronized release with terraform-provider-turingpi v1.3.4 |
| 206 | +- Provider now supports BMC firmware 2.3.4 API response format |
| 207 | + |
| 208 | +### v1.3.3 |
| 209 | + |
| 210 | +**Changes:** |
| 211 | +- Added CODE_OF_CONDUCT.md |
| 212 | +- Added docs/ARCHITECTURE.md |
| 213 | +- Added security workflow with Trivy scanning |
| 214 | +- Enhanced SECURITY.md, CODEOWNERS, CONTRIBUTING.md |
| 215 | + |
| 216 | +## Pre-Upgrade Checklist |
| 217 | + |
| 218 | +Before upgrading any component: |
| 219 | + |
| 220 | +- [ ] Review changelog and breaking changes |
| 221 | +- [ ] Back up critical data (etcd, PVs, configurations) |
| 222 | +- [ ] Test the upgrade in a staging environment |
| 223 | +- [ ] Plan a maintenance window if needed |
| 224 | +- [ ] Notify users of potential downtime |
| 225 | +- [ ] Verify the rollback procedure |
| 226 | + |
| 227 | +## Post-Upgrade Verification |
| 228 | + |
| 229 | +After upgrading: |
| 230 | + |
| 231 | +```bash |
| 232 | +# Check all nodes are ready |
| 233 | +kubectl get nodes |
| 234 | + |
| 235 | +# Check all pods are running |
| 236 | +kubectl get pods -A |
| 237 | + |
| 238 | +# Check addon-specific health |
| 239 | +kubectl get pods -n metallb-system |
| 240 | +kubectl get pods -n ingress-nginx |
| 241 | +kubectl get pods -n longhorn-system |
| 242 | +kubectl get pods -n monitoring |
| 243 | +kubectl get pods -n cert-manager |
| 244 | + |
| 245 | +# Verify services have external IPs |
| 246 | +kubectl get svc -A | grep LoadBalancer |
| 247 | + |
| 248 | +# Check certificates (if using cert-manager) |
| 249 | +kubectl get certificates -A |
| 250 | +kubectl get clusterissuers |
| 251 | +``` |
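The node check above can be turned into a pass/fail filter for scripted verification. This is a minimal sketch, assuming the default column layout of `kubectl get nodes --no-headers` (name first, status second); `not_ready_nodes` is a hypothetical helper name.

```shell
# not_ready_nodes: read `kubectl get nodes --no-headers` output on stdin and
# print the name of every node whose STATUS column is not exactly "Ready".
# Cordoned nodes ("Ready,SchedulingDisabled") are reported too.
not_ready_nodes() {
  awk '$2 != "Ready" { print $1 }'
}

# Usage against a live cluster (requires kubectl and cluster access):
#   kubectl get nodes --no-headers | not_ready_nodes
```

Empty output means every node reports `Ready`. Any name printed is worth inspecting with `kubectl describe node`.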
| 252 | + |
| 253 | +## Rollback Procedures |
| 254 | + |
| 255 | +### Module Rollback |
| 256 | + |
| 257 | +Revert to the previous module version: |
| 258 | + |
| 259 | +```hcl |
| 260 | +module "k3s_cluster" { |
| 261 | + source = "github.com/jfreed-dev/terraform-turingpi-modules//modules/k3s-cluster?ref=v1.3.3" # Previous version |
| 262 | +} |
| 263 | +``` |
| 264 | + |
| 265 | +```bash |
| 266 | +terraform init -upgrade |
| 267 | +terraform apply |
| 268 | +``` |
| 269 | + |
| 270 | +### Helm Release Rollback |
| 271 | + |
| 272 | +For addon rollbacks: |
| 273 | + |
| 274 | +```bash |
| 275 | +# List release history |
| 276 | +helm history <release-name> -n <namespace> |
| 277 | + |
| 278 | +# Rollback to previous revision |
| 279 | +helm rollback <release-name> <revision> -n <namespace> |
| 280 | +``` |
| 281 | + |
| 282 | +## Getting Help |
| 283 | + |
| 284 | +If you encounter issues during upgrade: |
| 285 | + |
| 286 | +1. Check the [GitHub Issues](https://github.com/jfreed-dev/terraform-turingpi-modules/issues) |
| 287 | +2. Review Terraform and Helm logs |
| 288 | +3. Open a new issue with upgrade details and error messages |