Skip to content

Commit 760c51d

Browse files
Updated API to lifecycle.StopSignal instead of lifecycle.stop.signal
1 parent 710611b commit 760c51d

File tree

1 file changed

+25
-41
lines changed
  • keps/sig-node/4960-container-stop-signals

1 file changed

+25
-41
lines changed

keps/sig-node/4960-container-stop-signals/README.md

Lines changed: 25 additions & 41 deletions
Original file line numberDiff line numberDiff line change
@@ -68,7 +68,7 @@ Items marked with (R) are required *prior to targeting to a milestone / release*
6868

6969
## Summary
7070

71-
Container runtimes let you define a [STOPSIGNAL](https://docs.docker.com/reference/dockerfile/#stopsignal) to let your container images change which signal is delivered to kill the container. Currently you can only configure this by defining STOPSIGNAL in the container image definition file before you build the image. This becomes difficult to change when you’re using prebuilt images. This KEP proposes to add support to configure custom stop signals for containers from the ContainerSpec. Kubernetes has no equivalent for STOPSIGNAL as part of Pod or Container APIs. This KEP proposes to add support to configure custom stop signals for containers from the ContainerSpec.
71+
Container runtimes let you define a [STOPSIGNAL](https://docs.docker.com/reference/dockerfile/#stopsignal) to let your container images change which signal is delivered to kill the container. Currently you can only configure this by defining STOPSIGNAL in the container image definition file before you build the image. This becomes difficult to change when you’re using prebuilt images. Kubernetes has no equivalent for STOPSIGNAL as part of Pod or Container APIs. This KEP proposes to add support to configure custom stop signals for containers from the ContainerSpec.
7272

7373
## Motivation
7474

@@ -80,9 +80,9 @@ Having stop signal as a first class citizen in the Pod's container specification
8080

8181
### Goals
8282

83-
- Add a new Stop lifecycle handler to container lifecycle which can be configured with a Signal option
83+
- Add a new Stop lifecycle handler to container lifecycle which can be configured with a Signal option, which takes a string value
8484
- Update the CRI API to pass down the stop signal to the container runtime via ContainerConfig
85-
- Update the implementation of the StopContainer method in container runtimes to use the container’s stop signal (if defined) to kill containers
85+
- Update the implementation of the StopContainer method in container runtimes to use the container’s stop signal defined in the container spec (if present) to kill containers
8686
- Add support to show the effective stop signal of containers in the container status field in the pod status
8787

8888
### Non-Goals
@@ -93,22 +93,16 @@ Having stop signal as a first class citizen in the Pod's container specification
9393

9494
### API
9595

96-
A new Stop lifecycle handler will be added to container lifecycle. The Stop lifecycle event can be configured with a Signal option, which is of type `Signal`. This new `Signal` type can take a string value, and can be used to define a stop signal for containers when creating Pods. `Signal` will hold string values which can be mapped to Go's syscall.Signal. For example, [list of signals supported in Linux environments by moby](https://github.com/containerd/containerd/blob/main/vendor/github.com/moby/sys/signal/signal_linux.go). If the user doesn't define a particular stop signal, the behaviour would default to what it is today and fallback to the stop signal defined in the container image or use the default stop signal of the container runtime (SIGTERM in case of containerd, CRI-O).
96+
A new StopSignal lifecycle hook will be added to container lifecycle. The StopSignal lifecycle hook can be configured with a signal, which is of type `Signal`. This new `Signal` type can take a string value, and can be used to define a stop signal for containers when creating Pods. `Signal` will hold string values which can be mapped to Go's syscall.Signal. For example, see the [list of signals supported in Linux environments by moby](https://github.com/containerd/containerd/blob/main/vendor/github.com/moby/sys/signal/signal_linux.go). If the user doesn't define a particular stop signal, the behaviour would default to what it is today and fallback to the stop signal defined in the container image or use the default stop signal of the container runtime (SIGTERM in case of containerd, CRI-O).
9797

9898
```go
9999
// pkg/apis/core/types.go
100100
type Signal string //parseable into Go's syscall.Signal
101101

102-
type LifecycleHandler struct {
103-
// ...
104-
// +optional
105-
Signal *Signal
106-
}
107-
108102
type Lifecycle struct {
109103
// ...
110-
// +optional
111-
Stop *LifecycleHandler
104+
// +optional
105+
StopSignal *Signal
112106
}
113107
```
114108

@@ -124,8 +118,7 @@ spec:
124118
- name: nginx
125119
image: nginx:1.14.2
126120
lifecycle:
127-
stop:
128-
signal: SIGUSR1
121+
stopSignal: SIGUSR1
129122
```
130123
131124
The stop signal would also be shown in the containers' status. The value of the stop signal shown in the status can be from the spec, if a stop cycle is defined in the spec, else it will be the effective stop signal which is used by the container runtime to kill your container. This can either be read from the container image or will be the default stop signal of the container runtime. Users will be able to see a container's stop signal in its status even if they're not using a custom stop signal from the spec.
@@ -137,8 +130,7 @@ status:
137130
image: nginx:1.14.2
138131
lastState: {}
139132
lifecyle:
140-
stop:
141-
signal: SIGUSR1
133+
stopSignal: SIGUSR1
142134
name: redis
143135
ready: true
144136
restartCount: 0
@@ -161,11 +153,7 @@ message ContainerConfig {
161153
}
162154

163155
+message Lifecycle {
164-
+ LifecycleHandler stop = 1;
165-
+}
166-
167-
+message LifecycleHandler {
168-
+ string signal = 1;
156+
+ string stop_signal = 1;
169157
+}
170158
```
171159

@@ -196,9 +184,7 @@ func (m *kubeGenericRuntimeManager) generateContainerConfig(...) (*runtimeapi.Co
196184
StdinOnce: container.StdinOnce,
197185
Tty: container.TTY,
198186
+ Lifecycle: &runtimeapi.Lifecycle{
199-
+ Stop: &runtimeapi.LifecycleHandler{
200-
+ Signal: container.Lifecycle.Stop.Signal
201-
+ }
187+
+ StopSignal: container.Lifecycle.StopSignal
202188
+ },
203189
}
204190
// ...
@@ -207,19 +193,19 @@ func (m *kubeGenericRuntimeManager) generateContainerConfig(...) (*runtimeapi.Co
207193

208194
Since the new stop lifecycle is optional, the default stop signal for a container can be unset or nil. In this case, the container runtime will fallback to the existing behaviour.
209195

210-
Additionally, the stop signal would also be added to `ContainerStatus` (as `containerStatus[].Lifecycle.Stop.Signal`) so that we can pass the stop signal extracted from the image/container runtime back to the container status at the Kubernetes API level.
196+
Additionally, the stop signal would also be added to `ContainerStatus` (as `containerStatus[].Lifecycle.StopSignal`) so that we can pass the stop signal extracted from the image/container runtime back to the container status at the Kubernetes API level.
211197

212198
### Container runtime changes
213199

214-
Once the stop signal from `containerSpec.Lifecycle.Stop.Signal` is passed down to the container runtime via `ContainerConfig` during creation/updation of the container, we can use the value of the stop signal from the container runtime's implementation of `stopContainer` method. In the case of containerd, it would look like this:
200+
Once the stop signal from `containerSpec.Lifecycle.StopSignal` is passed down to the container runtime via `ContainerConfig` during creation/updation of the container, we can use the value of the stop signal from the container runtime's implementation of `stopContainer` method. In the case of containerd, it would look like this:
215201

216202
```diff
217203
//internal/cri/server/container_stop.go
218204

219205
func (c *criService) StopContainer(ctx context.Context, r *runtime.StopContainerRequest) (*runtime.StopContainerResponse, error) {
220206
// ...
221207
- if err := c.stopContainer(ctx, container, time.Duration(r.GetTimeout())*time.Second); err != nil {
222-
+ if err := c.stopContainer(ctx, container, time.Duration(r.GetTimeout())*time.Second, container.Config.Lifecycle.Stop.Signal); err != nil {
208+
+ if err := c.stopContainer(ctx, container, time.Duration(r.GetTimeout())*time.Second, container.Config.Lifecycle.StopSignal); err != nil {
223209
return nil, err
224210
}
225211
// ...
@@ -239,9 +225,9 @@ func (c *criService) StopContainer(ctx context.Context, r *runtime.StopContainer
239225
// rest of the code...
240226
```
241227

242-
The signal that we get from `ContainerConfig` can be validated with [ParseSignal](https://github.com/containerd/containerd/blob/main/vendor/github.com/moby/sys/signal/signal.go#L38) to further validate that we've received a valid stop signal. Also `container.StopSignal` is reading the stop signal from the image. We can add another condition before that to use the stop signal defined in spec if there is one. If nothing is defined in the spec ("" or unset), containerd behaves like how it is today. Also note that `SIGTERM` is hardcoded in containerd's stopContainer method as the default stop signal to fallback to, in case the image doesn't defined a stop signal. Similar logic in also present in CRI-O [here](https://github.com/cri-o/cri-o/blob/main/internal/oci/container.go#L259-L272).
228+
The signal that we get from `ContainerConfig` can be validated with [ParseSignal](https://github.com/containerd/containerd/blob/main/vendor/github.com/moby/sys/signal/signal.go#L38) to further validate that we've received a valid stop signal. Also here `container.StopSignal` is reading the stop signal from the image. We can add another condition before that to use the stop signal defined in spec if there is one. If nothing is defined in the spec ("" or unset), containerd behaves like how it is today. Also note that `SIGTERM` is hardcoded in containerd's stopContainer method as the default stop signal to fallback to, in case the image doesn't defined a stop signal. Similar logic in also present in CRI-O [here](https://github.com/cri-o/cri-o/blob/main/internal/oci/container.go#L259-L272).
243229

244-
Find the entire diff for containerd [here](https://github.com/containerd/containerd/compare/main...sreeram-venkitesh:containerd:added-custom-stop-signal?expand=1).
230+
Find the entire diff for containerd which was done for the POC [here](https://github.com/containerd/containerd/compare/main...sreeram-venkitesh:containerd:added-custom-stop-signal?expand=1).
245231

246232
### User Stories (Optional)
247233

@@ -256,8 +242,8 @@ Kubernetes by default sends a SIGTERM to all containers while killing them. When
256242
## Design Details
257243

258244
On top of the details described in the [Proposal](#proposal), these are some details on how exactly the new field will work.
259-
- `ContainerSpec.Lifecycle.Stop.Signal` is totally optional and can be a nil value. In this case, the stop signal defined in the container image or the container runtime's default stop signal (SIGTERM for containerd and CRI-O) would be used.
260-
- If set, `ContainerSpec.Lifecycle.Stop.Signal` will override the stop signal set from the container image definition.
245+
- `ContainerSpec.Lifecycle.StopSignal` is totally optional and can be a nil value. In this case, the stop signal defined in the container image or the container runtime's default stop signal (SIGTERM for containerd and CRI-O) would be used.
246+
- If set, `ContainerSpec.Lifecycle.StopSignal` will override the stop signal set from the container image definition.
261247
- The order of priority for the different stop signals would look like this
262248
`Stop signal from Container Spec > STOPSIGNAL from container image > Default stop signal of container runtime`
263249

@@ -272,14 +258,12 @@ to implement this enhancement.
272258
##### Unit tests
273259

274260
Alpha:
275-
- Test that the validation fails when given a non string value for container lifecycle stop hook's signal field
261+
- Test that the validation fails when given a non string value for container lifecycle StopSignal hook
276262
- Test that the validation passes when given a proper string value representing a standard stop signal
277263
- Test that the validation fails when we configure a custom stop signal with the feature gate disabled
278264
- Test that the validation returns the appropriate error message when an invalid string value is given for the stop signal
279265
- Tests for verifying behavior when feature gate is disabled after being used to create Pods where the stop signal field is used
280266
- Tests for verifying behavior when feature gate is reenabled after being disabled after creating Pods with stop signal
281-
282-
##### Integration tests
283267

284268
##### e2e tests
285269

@@ -318,27 +302,27 @@ Alpha:
318302

319303
#### Upgrade
320304

321-
When upgrading to a new Kubernetes version which supports Container Stop Signals, users can enable the feature gate and start using the feature. If the user is running an older version of the container runtime, the feature will be gracefully degraded as mentioned [here](https://www.kubernetes.dev/docs/code/cri-api-version-skew-policy/#version-skew-policy-for-cri-api) in the CRI API version skew doc. In this case the user will be able to set a Stop lifecycle hook in the Container spec, but the kubelet will not pass this value to the container runtime when calling the `runtimeService.stopContainer` method.
305+
When upgrading to a new Kubernetes version which supports Container Stop Signals, users can enable the feature gate and start using the feature. If the user is running an older version of the container runtime, the feature will be gracefully degraded as mentioned [here](https://www.kubernetes.dev/docs/code/cri-api-version-skew-policy/#version-skew-policy-for-cri-api) in the CRI API version skew doc. In this case the user will be able to set a StopSignal lifecycle hook in the Container spec, but the kubelet will not pass this value to the container runtime when calling the `runtimeService.stopContainer` method. The container status would also not have stop signal since the container runtime is not updated to return the effective stop signal extracted from the image.
322306

323307
#### Downgrade
324308

325-
If the kube-apiserver or the kubelet's version is downgraded, you will no longer be able to create or update container specs to include the Stop lifecycle hook. Existing containers which have the field set would not be cleared. If you're running a version of the kubelet which doesn't support ContainerStopSignals, the CRI API wouldn't pass the stop signal to the runtime as part of ContainerConfig. Even if the runtime is on a newer version supporting stop signal, it would handle this and default to the stop signal defined in the image or to SIGTERM.
309+
If the kube-apiserver or the kubelet's version is downgraded, you will no longer be able to create or update container specs to include the StopSignal lifecycle hook. Existing containers which have the field set would not be cleared. If you're running a version of the kubelet which doesn't support ContainerStopSignals, the CRI API wouldn't pass the stop signal to the runtime as part of ContainerConfig. Even if the container runtime is on a newer version supporting stop signal, it would handle this and default to the stop signal defined in the image or to SIGTERM.
326310

327311
### Version Skew Strategy
328312

329313
Both kubelet and kube-apiserver will need to enable the feature gate for the full featureset to be present and working. If both components disable the feature gate, this feature will be completely unavailable.
330314

331-
If only the kube-apiserver enables this feature, validation will pass, but kubelet won't understand the new lifecycle hook and will ignore it when creating the ContainerConfig.
315+
If only the kube-apiserver enables this feature, validation will pass, but kubelet won't understand the new lifecycle hook and will not add the stop signal when creating the ContainerConfig.
332316

333-
If only the kubelet has enabled this feature, you won't be able to create a Pod which has a Stop lifecycle hook via the apiserver and hence the feature won't be usable even if the kubelet supports it. `containerSpec.Lifecycle.Stop.Signal` can be an empty value and kubelet functions as if no custom stop signal has been set for any container.
317+
If only the kubelet has enabled this feature, you won't be able to create a Pod which has a StopSignal lifecycle hook via the apiserver and hence the feature won't be usable even if the kubelet supports it. `containerSpec.Lifecycle.StopSignal` can be an empty value and kubelet functions as if no custom stop signal has been set for any container.
334318

335319
#### Version skew with CRI API and container runtime
336320

337321
As described above in the upgrade/downgrade strategies,
338322

339323
- **If the container runtime is in an older version than kubelet**, the feature will be gracefully degraded. In this case the user will be able to set the stop signal in the Container spec, but the kubelet will not pass this value to the container runtime via ContainerConfig and the container runtime will use the stop signal defined in the image or use the default SIGTERM.
340324

341-
- **If you're running an older version of the kubelet with a newer version of the container runtime**, the CRI API call from the kubelet would be made with the older version of ContainerConfig which doesn't include the stop signal. The container runtime code, even if it is running the newer version supporting stop signal, would handle this and use the stop signal defined in the container image or default to SIGTERM.
325+
- **If you're running an older version of the kubelet with a newer version of the container runtime**, the CRI API call from the kubelet would be made with the older version of ContainerConfig which doesn't include the stop signal. The container runtime doesn't receive any custom stop signal from the container spec in this case. The container runtime code, even if it is running the newer version supporting stop signal, would fall back to the current behaviour and use the stop signal defined in the container image or default to SIGTERM since it doesn't receive any stop signal from ContainerSpec.
342326

343327
## Production Readiness Review Questionnaire
344328

@@ -366,7 +350,7 @@ Yes, the feature gate can be turned off to disable the feature once it has been
366350

367351
###### What happens if we reenable the feature if it was previously rolled back?
368352

369-
If you reenable the feature, you'll be able to create Pods with Stop lifecycle hooks for their containers. Without the feature gate enabled, this would make your workloads invalid.
353+
If you reenable the feature, you'll be able to create Pods with StopSignal lifecycle hooks for their containers. Without the feature gate enabled, this would make your workloads invalid.
370354

371355
###### Are there any tests for feature enablement/disablement?
372356

@@ -445,7 +429,7 @@ No.
445429

446430
###### Will enabling / using this feature result in increasing size or count of the existing API objects?
447431

448-
We are adding a new lifecycle hook called Stop, and a new lifecycle handler called Signal which can be used in the Container spec. These are optional values however and can increase the size of the API object.
432+
We are adding a new lifecycle hook called StopSignal, which takes a string value. These are optional values however and can increase the size of the API object.
449433

450434
###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
451435

0 commit comments

Comments
 (0)