| 1 | +# Kepler Helm Chart Updates and Rolling Deployments |
| 2 | + |
| 3 | +This guide covers how to manage updates and rolling deployments for Kepler using Helm charts published to the OCI registry. |
| 4 | + |
| 5 | +## Chart Repository |
| 6 | + |
| 7 | +Kepler Helm charts are published to: |
| 8 | + |
| 9 | +- **OCI Registry**: `oci://quay.io/sustainable_computing_io/charts/kepler` |
| 10 | + |
| 11 | +## Installation Methods |
| 12 | + |
| 13 | +### Direct OCI Installation (Recommended) |
| 14 | + |
| 15 | +OCI registries cannot be added as traditional Helm repositories, so use direct installation: |
| 16 | + |
| 17 | +```bash |
| 18 | +# Install specific version |
| 19 | +helm install kepler oci://quay.io/sustainable_computing_io/charts/kepler \ |
| 20 | + --version 0.11.1 \ |
| 21 | + --namespace kepler \ |
| 22 | + --create-namespace |
| 23 | + |
| 24 | +# Install latest version (omit --version) |
| 25 | +helm install kepler oci://quay.io/sustainable_computing_io/charts/kepler \ |
| 26 | + --namespace kepler \ |
| 27 | + --create-namespace |
| 28 | +``` |
| 29 | + |
| 30 | +## Manual Updates |
| 31 | + |
| 32 | +### Check for New Versions |
| 33 | + |
| 34 | +Because `helm search repo` does not work with OCI registries, check for new versions using one of these methods: |
| 35 | + |
| 36 | +```bash |
| 37 | +# Show chart information for specific version |
| 38 | +helm show chart oci://quay.io/sustainable_computing_io/charts/kepler --version 0.11.2 |
| 39 | + |
| 40 | +# Check quay.io web interface for available tags |
| 41 | +# Visit: https://quay.io/repository/sustainable_computing_io/charts?tab=tags |
| 42 | + |
| 43 | +# Or pull the chart; the command fails if the version does not exist |
| 44 | +helm pull oci://quay.io/sustainable_computing_io/charts/kepler --version 0.11.2 --destination /tmp |
| 45 | +``` |
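Once you know which version is installed and which tag is available, the comparison itself can be scripted. The following is a minimal sketch, not part of Helm: `version_lt` is a hypothetical helper, the two version strings are placeholders you would fill in from `helm list` and the registry tags, and it assumes a `sort` that supports `-V` (GNU coreutils). Pre-release suffixes such as `-rc1` are not handled.

```bash
#!/usr/bin/env bash
# Sketch: decide whether a target chart version is newer than the installed one.
# version_lt is a hypothetical helper for this example, not a Helm command.

# Succeeds (exit 0) when $1 is strictly lower than $2 under `sort -V` ordering.
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$1" ]
}

# Placeholder values; in practice take them from `helm list` and the registry.
installed="0.11.1"
target="0.11.2"

if version_lt "$installed" "$target"; then
  echo "upgrade available: $installed -> $target"
else
  echo "already at $installed or newer"
fi
```

Plain string comparison would order `0.9.0` after `0.11.0`, which is why the sketch relies on version-aware sorting.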
| 46 | + |
| 47 | +### Upgrade to Specific Version |
| 48 | + |
| 49 | +```bash |
| 50 | +# Upgrade to specific version |
| 51 | +helm upgrade kepler oci://quay.io/sustainable_computing_io/charts/kepler --version 0.11.2 --namespace kepler |
| 52 | + |
| 53 | +# Upgrade with custom values |
| 54 | +helm upgrade kepler oci://quay.io/sustainable_computing_io/charts/kepler --version 0.11.2 --namespace kepler --values values.yaml |
| 55 | + |
| 56 | +# Upgrade and wait for rollout to complete |
| 57 | +helm upgrade kepler oci://quay.io/sustainable_computing_io/charts/kepler --version 0.11.2 --namespace kepler --wait --timeout=300s |
| 58 | +``` |
| 59 | + |
| 60 | +### Upgrade to Latest Version |
| 61 | + |
| 62 | +```bash |
| 63 | +# Upgrade to latest (omit --version) |
| 64 | +helm upgrade kepler oci://quay.io/sustainable_computing_io/charts/kepler --namespace kepler |
| 65 | + |
| 66 | +# Verify upgrade |
| 67 | +helm status kepler --namespace kepler |
| 68 | +``` |
| 69 | + |
| 70 | +### Rollback if Needed |
| 71 | + |
| 72 | +```bash |
| 73 | +# List release history |
| 74 | +helm history kepler --namespace kepler |
| 75 | + |
| 76 | +# Rollback to previous version |
| 77 | +helm rollback kepler --namespace kepler |
| 78 | + |
| 79 | +# Rollback to specific revision |
| 80 | +helm rollback kepler 2 --namespace kepler |
| 81 | +``` |
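If you would rather not roll back by hand, Helm 3's `--atomic` flag on `helm upgrade` rolls the release back automatically when the upgrade fails, and it implies `--wait`. The sketch below wraps that in a function; `upgrade_atomic` and the `HELM` override variable are hypothetical names used only in this example, and the 300s timeout mirrors the value used earlier in this guide.

```bash
#!/usr/bin/env bash
# Sketch: upgrade that rolls back automatically if the rollout fails.
# upgrade_atomic and HELM are hypothetical names for this example;
# set HELM=echo to print the arguments instead of running Helm.

CHART="oci://quay.io/sustainable_computing_io/charts/kepler"

upgrade_atomic() {
  local version="$1"
  "${HELM:-helm}" upgrade kepler "$CHART" \
    --version "$version" \
    --namespace kepler \
    --atomic \
    --timeout 300s
}

# Usage: upgrade_atomic 0.11.2
```

Running it as `HELM=echo upgrade_atomic 0.11.2` prints the full argument list, which is a cheap way to review the command before touching the cluster.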
| 82 | + |
| 83 | +## Update Strategies |
| 84 | + |
| 85 | +### Conservative Updates (Recommended for Production) |
| 86 | + |
| 87 | +Pin to specific patch versions and test before upgrading: |
| 88 | + |
| 89 | +```bash |
| 90 | +# Pin to specific version in production |
| 91 | +helm upgrade kepler oci://quay.io/sustainable_computing_io/charts/kepler --version 0.11.1 --namespace kepler |
| 92 | + |
| 93 | +# Test new version in staging first |
| 94 | +helm install kepler-staging oci://quay.io/sustainable_computing_io/charts/kepler --version 0.11.2 --namespace kepler-staging |
| 95 | + |
| 96 | +# After validation, upgrade production |
| 97 | +helm upgrade kepler oci://quay.io/sustainable_computing_io/charts/kepler --version 0.11.2 --namespace kepler |
| 98 | +``` |
| 99 | + |
| 100 | +## Monitoring Updates |
| 101 | + |
| 102 | +### Check Update Status |
| 103 | + |
| 104 | +```bash |
| 105 | +# Watch DaemonSet rollout progress |
| 106 | +kubectl rollout status daemonset/kepler -n kepler |
| 107 | + |
| 108 | +# Check pod status |
| 109 | +kubectl get pods -n kepler -w |
| 110 | + |
| 111 | +# View recent events |
| 112 | +kubectl get events -n kepler --sort-by='.lastTimestamp' |
| 113 | +``` |
| 114 | + |
| 115 | +### Verify Metrics After Update |
| 116 | + |
| 117 | +```bash |
| 118 | +# Port forward to access metrics (blocks; run in a separate terminal) |
| 119 | +kubectl port-forward -n kepler svc/kepler 28282:28282 |
| 120 | + |
| 121 | +# Test metrics endpoint |
| 122 | +curl -s http://localhost:28282/metrics | grep kepler_build_info |
| 123 | + |
| 124 | +# Check for expected metrics |
| 125 | +curl -s http://localhost:28282/metrics | grep -E "(kepler_node_cpu_watts|kepler_container_cpu_watts)" |
| 126 | +``` |
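The checks above can be folded into one pass/fail helper for use in a post-upgrade script. This is a minimal sketch: `check_metrics` is a hypothetical function, it reads Prometheus text from stdin, and the metric names are the ones used in the commands above.

```bash
#!/usr/bin/env bash
# Sketch: verify that expected metric names appear in a /metrics scrape.
# check_metrics is a hypothetical helper; it reads Prometheus text from stdin
# and returns non-zero if any expected metric has no sample line.

check_metrics() {
  local body missing=0 name
  body="$(cat)"
  for name in kepler_build_info kepler_node_cpu_watts kepler_container_cpu_watts; do
    # Match sample lines only; "# HELP" and "# TYPE" lines start with "#".
    if ! grep -q "^${name}" <<<"$body"; then
      echo "missing metric: ${name}" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (with the port-forward running in another terminal):
#   curl -s http://localhost:28282/metrics | check_metrics && echo "metrics OK"
```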
| 127 | + |
| 128 | +## Troubleshooting Updates |
| 129 | + |
| 130 | +### Failed Updates |
| 131 | + |
| 132 | +```bash |
| 133 | +# Check release status |
| 134 | +helm status kepler -n kepler |
| 135 | + |
| 136 | +# View release history |
| 137 | +helm history kepler -n kepler |
| 138 | + |
| 139 | +# Check for pods that are not running |
| 140 | +kubectl get pods -n kepler | grep -E "(Pending|ContainerCreating|CrashLoopBackOff)" |
| 141 | + |
| 142 | +# View pod logs |
| 143 | +kubectl logs -n kepler -l app.kubernetes.io/name=kepler --tail=100 |
| 144 | +``` |
| 145 | + |
| 146 | +### Recovery Procedures |
| 147 | + |
| 148 | +```bash |
| 149 | +# Rollback to previous working version |
| 150 | +helm rollback kepler -n kepler |
| 151 | + |
| 152 | +# Force recreation of DaemonSet if stuck |
| 153 | +kubectl delete daemonset kepler -n kepler |
| 154 | +helm upgrade kepler oci://quay.io/sustainable_computing_io/charts/kepler --version 0.11.1 -n kepler |
| 155 | + |
| 156 | +# Emergency: use source charts if OCI registry is unavailable (run from a checkout of the Kepler repository) |
| 157 | +helm upgrade kepler manifests/helm/kepler/ -n kepler |
| 158 | +``` |
| 159 | + |
| 160 | +## Version Compatibility |
| 161 | + |
| 162 | +| Kepler Version | Kubernetes Version | Helm Version | Notes | |
| 163 | +|----------------|--------------------|--------------|----------------------| |
| 164 | +| 0.11.x | 1.20+ | 3.8+ | OCI registry support | |
| 165 | +| 0.10.x | 1.19+ | 3.0+ | Legacy installation | |
| 166 | + |
| 167 | +## Best Practices |
| 168 | + |
| 169 | +1. **Test Updates**: Always test in a staging environment first |
| 170 | +2. **Gradual Rollouts**: Use rolling updates with conservative settings, such as a low `maxUnavailable` on the DaemonSet |
| 171 | +3. **Monitor Metrics**: Verify metrics collection after updates |
| 172 | +4. **Backup Values**: Keep your custom `values.yaml` in version control |
| 173 | +5. **Version Pinning**: Pin specific versions in production |
| 174 | +6. **Health Checks**: Configure proper readiness and liveness probes |
| 175 | +7. **Alerts**: Set up monitoring for failed deployments |
| 176 | + |
| 177 | +## Getting Help |
| 178 | + |
| 179 | +- **Chart Issues**: [Kepler GitHub Issues](https://github.com/sustainable-computing-io/kepler/issues) |
| 180 | +- **Registry Issues**: [Quay.io Support](https://access.redhat.com/support) |
| 181 | +- **Helm Issues**: [Helm Documentation](https://helm.sh/docs/) |