# PodCliqueSet In-Place Restart Guide

This guide uses the Kubernetes 1.35+ **RestartAllContainers** feature to trigger in-place restarts of all Pods in a Grove **PodCliqueSet** via a single **ConfigMap** field, `restartGeneration`. Pod names, UIDs, and IPs stay the same (no rescheduling).

## Use case: restart without rescheduling

Sometimes we want all Pods of a **PodCliqueSet** to restart **without going through rescheduling**—for example when **upgrading container image versions**, or when we need a clean re-run of init containers and main containers while keeping the same Pod identity and placement.

Deleting and recreating Pods is costly: it involves the scheduler, node allocation, and re-initialization of networking and storage. Kubernetes 1.35’s [Restart All Containers](https://kubernetes.io/blog/2026/01/02/kubernetes-v1-35-restart-all-containers/) feature provides an **in-place** restart instead: the kubelet restarts all containers in the Pod while preserving the Pod’s UID, IP address, volumes, and node assignment. Init containers run again in order, then all main containers start with a fresh state—so an image update or configuration change can take effect without any rescheduling. This guide shows how to trigger that in-place restart for an entire Grove PodCliqueSet at once via a ConfigMap.

### Limitations

This guide applies only to **restarting PodCliqueSets** (in-place restart of all Pods belonging to a Grove PodCliqueSet). It does not cover other workload types or cluster-wide restart scenarios.

## Idea

- Each Pod runs a **restart-watcher** sidecar (Go). It uses in-cluster config to poll the ConfigMap `grove-restart-control` in the same namespace for the key `restartGeneration`.
- When it sees `restartGeneration` **increase**, the watcher exits with a configured exit code (default 88), which triggers **RestartAllContainers** for that Pod.
- To trigger a batch in-place restart, `kubectl patch` the ConfigMap to increment `restartGeneration`; all Pods with the watcher will see the new value on the next poll and restart in place.
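
The core decision in the second bullet, "react only to an increase", can be sketched in Go. This is an illustrative sketch, not the actual `src/main.go`; the helper name `shouldTrigger` is hypothetical:

```go
package main

import (
	"fmt"
	"strconv"
)

// shouldTrigger reports whether a newly observed restartGeneration value
// warrants an in-place restart. The watcher reacts only to an increase,
// so re-applying the same value (or lowering it) is a no-op.
func shouldTrigger(last, observed string) bool {
	lastN, err1 := strconv.Atoi(last)
	obsN, err2 := strconv.Atoi(observed)
	if err1 != nil || err2 != nil {
		return false // ignore malformed values rather than restart spuriously
	}
	return obsN > lastN
}

func main() {
	fmt.Println(shouldTrigger("3", "4")) // increase: restart
	fmt.Println(shouldTrigger("4", "4")) // unchanged: no-op
	fmt.Println(shouldTrigger("4", "3")) // decrease: no-op
}
```

Treating malformed values as a no-op means a typo in the ConfigMap cannot restart the whole fleet by accident.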

## Directory layout

```
04_restart-all-containers-for-podcliqueset/
├── src/
│   ├── main.go           # restart-watcher sidecar source
│   └── go.mod
├── Dockerfile            # build watcher image
├── Makefile              # build and push image
├── manifests/
│   ├── namespace.yaml
│   ├── rbac.yaml         # SA + Role + RoleBinding (read ConfigMap)
│   ├── configmap.yaml
│   └── podcliqueset.yaml
└── README.md
```

## Prerequisites

1. **Cluster**: Kubernetes **1.35+** with the **RestartAllContainersOnContainerExits** and **NodeDeclaredFeatures** feature gates enabled on **both** the API server and the kubelet. **RestartAllContainersOnContainerExits** depends on **NodeDeclaredFeatures**, so enable them together. See your cluster or distribution docs for how to set feature gates.
2. **Grove**: CRD and Operator installed, **v0.1.0-alpha.4 or later**.
3. **Registry**: A Docker registry you can push the `restart-watcher` image to and that cluster nodes can pull from.

## Steps

### 1. Build and push the restart-watcher image

From this guide's directory:

```bash
# Set your registry (required)
export REGISTRY=your-registry.io/your-user
export IMAGE_TAG=latest

make push
```

Note the resulting image name, e.g. `$(REGISTRY)/restart-watcher:$(IMAGE_TAG)`.

### 2. Set the watcher image in the PodCliqueSet

Edit `manifests/podcliqueset.yaml` and replace both occurrences of `WATCHER_IMAGE` with the image you pushed, e.g.:

```bash
sed -i "s|WATCHER_IMAGE|${REGISTRY}/restart-watcher:${IMAGE_TAG}|g" manifests/podcliqueset.yaml
```

Or change `image: WATCHER_IMAGE` to e.g. `image: your-registry.io/your-user/restart-watcher:latest` by hand.

### 3. Deploy the PodCliqueSet and ConfigMap

```bash
kubectl apply -f manifests/namespace.yaml
kubectl apply -f manifests/rbac.yaml
kubectl apply -f manifests/configmap.yaml
kubectl apply -f manifests/podcliqueset.yaml
```

The [example PodCliqueSet](manifests/podcliqueset.yaml) has two PodCliques, **pca** and **pcb**, and both include the **restart-watcher** sidecar so that incrementing `restartGeneration` restarts all 6 Pods. If you only want to restart one PodClique, add the restart-watcher sidecar only to that clique’s `podSpec` in the manifest; Pods without the sidecar will not react to the ConfigMap.

### 4. Wait for Pods to be ready

The PodCliqueSet has **pca** (replicas=2) and **pcb** (replicas=4), 6 Pods in total:

```bash
kubectl get podcliqueset -n grove-restart-demo
kubectl get pods -n grove-restart-demo -l app.kubernetes.io/part-of=grove-restart-demo-pcs -o wide
```

Confirm all 6 Pods are `Running`.

### 5. (Optional) Record Pod names, UIDs, and IPs

To compare before and after the trigger (names, UIDs, and IPs should stay the same):

```bash
kubectl get pods -n grove-restart-demo -l app.kubernetes.io/part-of=grove-restart-demo-pcs \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.uid}{"\t"}{.status.podIP}{"\n"}{end}'
```

### 6. Trigger a full PodCliqueSet in-place restart

Increment the `restartGeneration` key in the ConfigMap `grove-restart-control`:

```bash
# Get current value
current=$(kubectl get configmap grove-restart-control -n grove-restart-demo -o jsonpath='{.data.restartGeneration}')
next=$((current + 1))

# Patch with the incremented value
kubectl patch configmap grove-restart-control -n grove-restart-demo \
  --type merge \
  -p "{\"data\":{\"restartGeneration\":\"${next}\"}}"
```

Within the next poll interval (default 5 seconds), every Pod running the restart-watcher will see the new value, exit with code 88, and trigger RestartAllContainers for its Pod.

### 7. Observe

After about 10–20 seconds:

```bash
kubectl get pods -n grove-restart-demo -l app.kubernetes.io/part-of=grove-restart-demo-pcs -o wide
```

**Expected**:

- Pod **names, UIDs, and IPs are unchanged** (no Pods created or deleted).
- **Restart counts** have increased (e.g. `kubectl get pod <name> -n grove-restart-demo -o jsonpath='{range .status.containerStatuses[*]}{.name} restarts={.restartCount}{"\n"}{end}'`).

To trigger again, repeat step 6 (increment `restartGeneration` again).

### 8. Cleanup

```bash
kubectl delete -f manifests/podcliqueset.yaml
kubectl delete -f manifests/configmap.yaml
kubectl delete -f manifests/rbac.yaml
kubectl delete -f manifests/namespace.yaml
```

## Environment variables (restart-watcher)

| Variable | Meaning | Default |
|----------|---------|---------|
| `CM_NAMESPACE` | ConfigMap namespace | the Pod's own namespace, injected via a `fieldRef` to `metadata.namespace` |
| `CM_NAME` | ConfigMap name | `grove-restart-control` |
| `KEY_NAME` | Key to watch | `restartGeneration` |
| `POLL_INTERVAL_SECONDS` | Poll interval in seconds | `5` |
| `TRIGGER_EXIT_CODE` | Exit code that triggers RestartAllContainers | `88` |

## References

- Kubernetes 1.35: [Restart All Containers](https://kubernetes.io/blog/2026/01/02/kubernetes-v1-35-restart-all-containers/), [KEP-5532](https://kep.k8s.io/5532)