Commit e226576

Add user guide for RestartAllContainers
Signed-off-by: Daisy Guo <daiguo@nvidia.com>
1 parent 3457783 commit e226576

10 files changed

+658
-0
lines changed

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
# Build stage
FROM golang:1.21-alpine AS builder
WORKDIR /app
COPY src/ ./
RUN go mod tidy && CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -o /restart-watcher .

# Runtime stage
FROM alpine:3.19
RUN apk --no-cache add ca-certificates
COPY --from=builder /restart-watcher /restart-watcher
ENTRYPOINT ["/restart-watcher"]
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
# restart-watcher image build and push
# Override for your registry: make push REGISTRY=myreg.io/myuser
REGISTRY ?= localhost:5000
IMAGE_NAME ?= restart-watcher
IMAGE_TAG ?= latest
FULL_IMAGE := $(REGISTRY)/$(IMAGE_NAME):$(IMAGE_TAG)

.PHONY: build image push all tidy

# Build Go binary locally (for development)
build:
	cd src && go build -o ../bin/restart-watcher .

# Build Docker image (binary built inside container)
image:
	docker build -t $(FULL_IMAGE) -f Dockerfile .
	@echo "Built $(FULL_IMAGE)"

# Push image to registry
push: image
	docker push $(FULL_IMAGE)
	@echo "Pushed $(FULL_IMAGE)"

# Tidy go.mod (run from repo root)
tidy:
	cd src && go mod tidy

all: image
Lines changed: 148 additions & 0 deletions
@@ -0,0 +1,148 @@
# Grove In-Place Restart Demo (ConfigMap-Triggered)

This demo uses the Kubernetes 1.35+ **RestartAllContainers** action to restart every Pod of a Grove **PodCliqueSet** in place, driven by a single **ConfigMap** key, `restartGeneration`. Pod names, UIDs, and IPs stay the same because nothing is rescheduled.

## Use case: restart without rescheduling

Sometimes we want a **PodCliqueSet** (that is, all of its Pods) to restart **without going through rescheduling**, for example when **upgrading container image versions** or when init containers and main containers need a clean re-run while the Pod keeps its identity and placement.

Deleting and recreating Pods is costly: it involves the scheduler, node allocation, and re-initialization of networking and storage. Kubernetes 1.35’s [Restart All Containers](https://kubernetes.io/blog/2026/01/02/kubernetes-v1-35-restart-all-containers/) feature provides an **in-place** restart instead: the kubelet restarts all containers in the Pod while preserving the Pod’s UID, IP address, volumes, and node assignment. Init containers run again in order, then all main containers start from a fresh state, so an image update or configuration change can take effect without any rescheduling. This demo shows how to trigger that in-place restart for an entire Grove PodCliqueSet at once via a ConfigMap.

## Idea

- Each Pod runs a **restart-watcher** sidecar (Go). It uses in-cluster config to poll the ConfigMap `grove-restart-control` in the same namespace for the key `restartGeneration`.
- When it sees `restartGeneration` **increase**, the watcher exits with a configured code (default 88), which triggers **RestartAllContainers** for that Pod.
- To trigger a batch in-place restart, use **kubectl patch** to increment `restartGeneration` in the ConfigMap; every Pod with the watcher sees the new value on its next poll and restarts in place.
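
The watcher's `main.go` is not reproduced here, so the following is only a minimal sketch of the poll-and-compare loop described above, under stated assumptions. The names `generationIncreased`, `watch`, and `fetch` are hypothetical; `fetch` stands in for the ConfigMap read that the real sidecar would perform through the Kubernetes API, and the sketch assumes the baseline is the first value read after startup.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// triggerExitCode mirrors the documented default of TRIGGER_EXIT_CODE.
const triggerExitCode = 88

// generationIncreased reports whether observed is a strictly larger
// integer than baseline. Both values arrive as ConfigMap strings.
func generationIncreased(baseline, observed string) (bool, error) {
	b, err := strconv.Atoi(baseline)
	if err != nil {
		return false, fmt.Errorf("baseline %q: %w", baseline, err)
	}
	o, err := strconv.Atoi(observed)
	if err != nil {
		return false, fmt.Errorf("observed %q: %w", observed, err)
	}
	return o > b, nil
}

// watch polls fetch every interval and returns once the value has
// increased relative to the first successful read. fetch is a
// hypothetical stand-in for the ConfigMap GET.
func watch(fetch func() (string, error), interval time.Duration) {
	baseline := ""
	for {
		value, err := fetch()
		if err == nil {
			if baseline == "" {
				baseline = value // first read becomes the baseline
			} else if up, perr := generationIncreased(baseline, value); perr == nil && up {
				return
			}
		}
		time.Sleep(interval)
	}
}

func main() {
	// Simulated ConfigMap reads: "0", "0", then "1" after a patch.
	values := []string{"0", "0", "1"}
	i := 0
	fetch := func() (string, error) {
		v := values[i]
		if i < len(values)-1 {
			i++
		}
		return v, nil
	}
	watch(fetch, 10*time.Millisecond)
	// A real sidecar would call os.Exit(triggerExitCode) at this point.
	fmt.Println("restartGeneration increased; would exit with", triggerExitCode)
}
```

Taking the baseline from the first read after startup is one plausible design: a freshly restarted watcher then treats the current value as already handled and does not restart the Pod again until the next increment.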

## Directory layout

```
imageupdate_demo/
├── src/
│   ├── main.go           # restart-watcher sidecar source
│   └── go.mod
├── Dockerfile            # build watcher image
├── Makefile              # build and push image
├── manifests/
│   ├── namespace.yaml
│   ├── rbac.yaml         # SA + Role + RoleBinding (read ConfigMap)
│   ├── configmap.yaml
│   └── podcliqueset.yaml
└── README.md
```

## Prerequisites

1. **Cluster**: Kubernetes **1.35+** with **RestartAllContainersOnContainerExits** and **NodeDeclaredFeatures** enabled. Both feature gates must be enabled on **both** the API server and the kubelet. **RestartAllContainersOnContainerExits** depends on **NodeDeclaredFeatures**, so enable them together. See your cluster or distribution docs for how to set feature gates.
2. **Grove**: CRD and Operator installed. If you use features like `startsAfter`, configure **GROVE_INIT_CONTAINER_IMAGE**.
3. **Registry**: A Docker registry you can push the `restart-watcher` image to and that cluster nodes can pull from.

## Demo steps

### 1. Build and push the restart-watcher image

From the demo root, `imageupdate_demo/`:

```bash
# Set your registry (overrides the Makefile default, localhost:5000)
export REGISTRY=your-registry.io/your-user
export IMAGE_TAG=latest

make push
```

Note the resulting image name, `${REGISTRY}/restart-watcher:${IMAGE_TAG}`; step 2 needs it.

### 2. Set the watcher image in the PodCliqueSet

Edit `manifests/podcliqueset.yaml` and replace both occurrences of `WATCHER_IMAGE` with the image you pushed, e.g.:

```bash
sed -i "s|WATCHER_IMAGE|${REGISTRY}/restart-watcher:${IMAGE_TAG}|g" manifests/podcliqueset.yaml
```

Or change each `image: WATCHER_IMAGE` line by hand, e.g. to `image: your-registry.io/your-user/restart-watcher:latest`.

### 3. Deploy namespace, RBAC, ConfigMap, and PodCliqueSet

```bash
kubectl apply -f manifests/namespace.yaml
kubectl apply -f manifests/rbac.yaml
kubectl apply -f manifests/configmap.yaml
kubectl apply -f manifests/podcliqueset.yaml
```

### 4. Wait for Pods to be ready

The PodCliqueSet has **pca** (replicas=2) and **pcb** (replicas=4), 6 Pods in total:

```bash
kubectl get podcliqueset -n grove-restart-demo
kubectl get pods -n grove-restart-demo -l app.kubernetes.io/part-of=grove-restart-demo-pcs -o wide
```

Confirm all 6 Pods are `Running`.

### 5. (Optional) Record Pod name, Pod ID, and IP

To compare before and after the trigger (names, UIDs, and IPs should stay the same):

```bash
kubectl get pods -n grove-restart-demo -l app.kubernetes.io/part-of=grove-restart-demo-pcs \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.uid}{"\t"}{.status.podIP}{"\n"}{end}'
```

### 6. Trigger an in-place restart of the whole PodCliqueSet

Increment the `restartGeneration` key of the ConfigMap `grove-restart-control`:

```bash
# Get current value
current=$(kubectl get configmap grove-restart-control -n grove-restart-demo -o jsonpath='{.data.restartGeneration}')
next=$((current + 1))

# Patch with next value
kubectl patch configmap grove-restart-control -n grove-restart-demo \
  --type merge \
  -p "{\"data\":{\"restartGeneration\":\"${next}\"}}"
```

Within one poll interval (default 5 seconds), the restart-watcher in every Pod sees the new value and exits with code 88, which triggers RestartAllContainers for that Pod.
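
For tooling that prefers to build the patch body in Go rather than with shell quoting, the same merge patch can be sketched as below. `nextGenerationPatch` is a hypothetical helper, not part of this demo; it only produces the JSON string that step 6 passes to `kubectl patch -p`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// nextGenerationPatch returns the JSON merge patch that sets
// data.restartGeneration to current+1. current is the string value
// read from the ConfigMap.
func nextGenerationPatch(current string) (string, error) {
	n, err := strconv.Atoi(current)
	if err != nil {
		return "", fmt.Errorf("restartGeneration %q is not an integer: %w", current, err)
	}
	patch := map[string]map[string]string{
		"data": {"restartGeneration": strconv.Itoa(n + 1)},
	}
	body, err := json.Marshal(patch)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

func main() {
	body, err := nextGenerationPatch("0")
	if err != nil {
		panic(err)
	}
	fmt.Println(body) // prints {"data":{"restartGeneration":"1"}}
}
```

Unlike the shell arithmetic in step 6, this sketch rejects an empty or non-numeric current value instead of silently treating it as zero.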

### 7. Observe

After about 10–20 seconds:

```bash
kubectl get pods -n grove-restart-demo -l app.kubernetes.io/part-of=grove-restart-demo-pcs -o wide
```

**Expected**:

- Pod **names and IPs are unchanged** (no Pods were created or deleted). To confirm that UIDs are unchanged as well, rerun the command from step 5 and compare.
- **Restart counts** have increased (e.g. `kubectl get pod <name> -n grove-restart-demo -o jsonpath='{range .status.containerStatuses[*]}{.name} restarts={.restartCount}{"\n"}{end}'`).

To trigger again, repeat step 6 (increment `restartGeneration` again).
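
The before/after comparison can also be automated. The sketch below assumes two captures of the tab-separated `name`, `uid`, `ip` output from step 5; `parseSnapshot` and `changedPods` are hypothetical helpers, and the Pod names, UIDs, and IPs in `main` are made-up sample data.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// podIdentity holds the UID and IP printed by the step 5 command.
type podIdentity struct {
	UID string
	IP  string
}

// parseSnapshot turns "name<TAB>uid<TAB>ip" lines into a map keyed by Pod name.
func parseSnapshot(out string) map[string]podIdentity {
	pods := make(map[string]podIdentity)
	for _, line := range strings.Split(strings.TrimSpace(out), "\n") {
		fields := strings.Split(line, "\t")
		if len(fields) != 3 {
			continue // skip blank or malformed lines
		}
		pods[fields[0]] = podIdentity{UID: fields[1], IP: fields[2]}
	}
	return pods
}

// changedPods lists, in sorted order, the Pods whose UID or IP differs
// between the two snapshots or that appear in only one of them.
func changedPods(before, after map[string]podIdentity) []string {
	var changed []string
	for name, b := range before {
		if a, ok := after[name]; !ok || a != b {
			changed = append(changed, name)
		}
	}
	for name := range after {
		if _, ok := before[name]; !ok {
			changed = append(changed, name)
		}
	}
	sort.Strings(changed)
	return changed
}

func main() {
	// Made-up sample data standing in for two runs of the step 5 command.
	before := parseSnapshot("pod-a\tuid-1\t10.0.0.1\npod-b\tuid-2\t10.0.0.2\n")
	after := parseSnapshot("pod-a\tuid-1\t10.0.0.1\npod-b\tuid-2\t10.0.0.2\n")
	if changed := changedPods(before, after); len(changed) == 0 {
		fmt.Println("identity preserved for all Pods")
	} else {
		fmt.Println("identity changed for:", changed)
	}
}
```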

### 8. Cleanup

```bash
kubectl delete -f manifests/podcliqueset.yaml
kubectl delete -f manifests/configmap.yaml
kubectl delete -f manifests/rbac.yaml
kubectl delete -f manifests/namespace.yaml
```

## Environment variables (restart-watcher)

| Variable | Meaning | Default |
|----------|---------|---------|
| `CM_NAMESPACE` | ConfigMap namespace | Preferably set from `metadata.namespace` via `fieldRef`, as in `manifests/podcliqueset.yaml` |
| `CM_NAME` | ConfigMap name | `grove-restart-control` |
| `KEY_NAME` | Key name | `restartGeneration` |
| `POLL_INTERVAL_SECONDS` | Poll interval (seconds) | `5` |
| `TRIGGER_EXIT_CODE` | Exit code that triggers RestartAllContainers | `88` |

## References

- Kubernetes 1.35: [Restart All Containers](https://kubernetes.io/blog/2026/01/02/kubernetes-v1-35-restart-all-containers/), [KEP-5532](https://kep.k8s.io/5532)
Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
# ConfigMap watched by restart-watcher sidecars.
# Increment data.restartGeneration to trigger in-place RestartAllContainers on all Pods
# that have the watcher sidecar.
apiVersion: v1
kind: ConfigMap
metadata:
  name: grove-restart-control
  namespace: grove-restart-demo
data:
  restartGeneration: "0"
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
apiVersion: v1
kind: Namespace
metadata:
  name: grove-restart-demo
  labels:
    app.kubernetes.io/name: grove-restart-demo
Lines changed: 96 additions & 0 deletions
@@ -0,0 +1,96 @@
# Grove PodCliqueSet: pca (nginx:1.25, replicas=2), pcb (nginx:1.25, replicas=4).
# Each Pod has a restart-watcher sidecar that watches ConfigMap grove-restart-control.
# When restartGeneration is incremented, watchers exit with code 88 and trigger
# RestartAllContainers (K8s 1.35+). Replace WATCHER_IMAGE with your built image.
#
# Prerequisites:
# - Kubernetes 1.35+ with RestartAllContainersOnContainerExits and NodeDeclaredFeatures
# - Grove CRD + Operator installed
---
apiVersion: grove.io/v1alpha1
kind: PodCliqueSet
metadata:
  name: grove-restart-demo-pcs
  namespace: grove-restart-demo
  labels:
    app: grove-restart-demo
spec:
  replicas: 1
  template:
    terminationDelay: 1m
    cliqueStartupType: CliqueStartupTypeExplicit
    cliques:
      - name: pca
        spec:
          roleName: rolea
          replicas: 2
          podSpec:
            restartPolicy: Always
            serviceAccountName: grove-restart-demo
            containers:
              - name: main
                image: nginx:1.25
                ports:
                  - containerPort: 80
                resources:
                  requests:
                    cpu: 10m
              - name: restart-watcher
                image: WATCHER_IMAGE
                imagePullPolicy: IfNotPresent
                env:
                  - name: CM_NAMESPACE
                    valueFrom:
                      fieldRef:
                        fieldPath: metadata.namespace
                  - name: CM_NAME
                    value: "grove-restart-control"
                  - name: KEY_NAME
                    value: "restartGeneration"
                  - name: POLL_INTERVAL_SECONDS
                    value: "5"
                  - name: TRIGGER_EXIT_CODE
                    value: "88"
                restartPolicy: Always
                restartPolicyRules:
                  - action: RestartAllContainers
                    exitCodes:
                      operator: In
                      values: [88]
      - name: pcb
        spec:
          roleName: roleb
          replicas: 4
          podSpec:
            restartPolicy: Always
            serviceAccountName: grove-restart-demo
            containers:
              - name: main
                image: nginx:1.25
                ports:
                  - containerPort: 80
                resources:
                  requests:
                    cpu: 10m
              - name: restart-watcher
                image: WATCHER_IMAGE
                imagePullPolicy: IfNotPresent
                env:
                  - name: CM_NAMESPACE
                    valueFrom:
                      fieldRef:
                        fieldPath: metadata.namespace
                  - name: CM_NAME
                    value: "grove-restart-control"
                  - name: KEY_NAME
                    value: "restartGeneration"
                  - name: POLL_INTERVAL_SECONDS
                    value: "5"
                  - name: TRIGGER_EXIT_CODE
                    value: "88"
                restartPolicy: Always
                restartPolicyRules:
                  - action: RestartAllContainers
                    exitCodes:
                      operator: In
                      values: [88]
Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
# ServiceAccount and RBAC for Pods that run the restart-watcher sidecar.
# The watcher needs to read the ConfigMap "grove-restart-control" in the same namespace.
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: grove-restart-demo
  namespace: grove-restart-demo
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: grove-restart-configmap-reader
  namespace: grove-restart-demo
rules:
  # Note: with resourceNames set, "list" and "watch" only match requests that use a
  # metadata.name field selector. If the watcher only polls with GET (as the README
  # describes), "get" alone is likely sufficient.
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["grove-restart-control"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: grove-restart-demo-read-configmap
  namespace: grove-restart-demo
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: grove-restart-configmap-reader
subjects:
  - kind: ServiceAccount
    name: grove-restart-demo
    namespace: grove-restart-demo
Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
module github.com/daisy/restart-watcher

go 1.21

require (
	k8s.io/api v0.28.4
	k8s.io/apimachinery v0.28.4
	k8s.io/client-go v0.28.4
)

require (
	github.com/davecgh/go-spew v1.1.1 // indirect
	github.com/emicklei/go-restful/v3 v3.9.0 // indirect
	github.com/go-logr/logr v1.2.4 // indirect
	github.com/go-openapi/jsonpointer v0.19.6 // indirect
	github.com/go-openapi/jsonreference v0.20.2 // indirect
	github.com/go-openapi/swag v0.22.3 // indirect
	github.com/gogo/protobuf v1.3.2 // indirect
	github.com/golang/protobuf v1.5.3 // indirect
	github.com/google/gnostic-models v0.6.8 // indirect
	github.com/google/go-cmp v0.5.9 // indirect
	github.com/google/gofuzz v1.2.0 // indirect
	github.com/google/uuid v1.3.0 // indirect
	github.com/josharian/intern v1.0.0 // indirect
	github.com/json-iterator/go v1.1.12 // indirect
	github.com/mailru/easyjson v0.7.7 // indirect
	github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
	github.com/modern-go/reflect2 v1.0.2 // indirect
	github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect
	golang.org/x/net v0.17.0 // indirect
	golang.org/x/oauth2 v0.8.0 // indirect
	golang.org/x/sys v0.13.0 // indirect
	golang.org/x/term v0.13.0 // indirect
	golang.org/x/text v0.13.0 // indirect
	golang.org/x/time v0.3.0 // indirect
	google.golang.org/appengine v1.6.7 // indirect
	google.golang.org/protobuf v1.31.0 // indirect
	gopkg.in/inf.v0 v0.9.1 // indirect
	gopkg.in/yaml.v2 v2.4.0 // indirect
	gopkg.in/yaml.v3 v3.0.1 // indirect
	k8s.io/klog/v2 v2.100.1 // indirect
	k8s.io/kube-openapi v0.0.0-20230717233707-2695361300d9 // indirect
	k8s.io/utils v0.0.0-20230406110748-d93618cff8a2 // indirect
	sigs.k8s.io/json v0.0.0-20221116044647-bc3834ca7abd // indirect
	sigs.k8s.io/structured-merge-diff/v4 v4.2.3 // indirect
	sigs.k8s.io/yaml v1.3.0 // indirect
)
