Commit 1e64246

Merge pull request kubernetes#2892 from verb/1.23-ec-beta
KEP-277: Update for beta in 1.23
2 parents ac13425 + 1f77e8f commit 1e64246

3 files changed: +61 -93 lines changed

keps/prod-readiness/sig-node/277.yaml

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,5 @@
11
kep-number: 277
22
alpha:
33
approver: "@johnbelamaric"
4+
beta:
5+
approver: "@johnbelamaric"

keps/sig-node/277-ephemeral-containers/README.md

Lines changed: 53 additions & 89 deletions
@@ -10,13 +10,12 @@
1010
- [Non-Goals](#non-goals)
1111
- [Proposal](#proposal)
1212
- [Creating Ephemeral Containers](#creating-ephemeral-containers)
13-
- [Identifying Pods with Ephemeral Containers](#identifying-pods-with-ephemeral-containers)
14-
- [Reattaching and Restarting Ephemeral Containers](#reattaching-and-restarting-ephemeral-containers)
13+
- [Reattaching Ephemeral Containers](#reattaching-ephemeral-containers)
14+
- [Ephemeral Container Lifecycle](#ephemeral-container-lifecycle)
15+
- [Removing Ephemeral Containers](#removing-ephemeral-containers)
1516
- [Configurable Security Policy](#configurable-security-policy)
1617
- [Specifying Security Context](#specifying-security-context)
1718
- [Compatibility with existing Admission Controllers](#compatibility-with-existing-admission-controllers)
18-
- [Killing Ephemeral Containers](#killing-ephemeral-containers)
19-
- [Removing and Re-adding Ephemeral Containers](#removing-and-re-adding-ephemeral-containers)
2019
- [User Stories](#user-stories)
2120
- [Operations](#operations)
2221
- [Debugging](#debugging)
@@ -221,28 +220,31 @@ There are no limits on the number of Ephemeral Containers that can be created in
221220
a pod, but exceeding a pod's resource allocation may cause the pod to be
222221
evicted.
223222

224-
### Identifying Pods with Ephemeral Containers
223+
### Reattaching Ephemeral Containers
225224

226-
The kubelet will set a `PodCondition` when it starts an Ephemeral Container.
227-
This condition may not be cleared: it will exist for the lifetime of the Pod
228-
and continues to exist even if all Ephemeral Containers are removed.
225+
One may reattach to an Ephemeral Container using `kubectl attach`. When supported
226+
by a runtime, multiple clients can attach to a single debug container and share
227+
the terminal. This is supported by the Docker runtime.
229228

230-
The intended use of this `PodCondition` is to enable administrators to enforce
231-
custom policies for pods that have had Ephemeral Containers. For example,
232-
cluster administrators may want to automatically apply a label or delete the pod
233-
after a configurable time. This may be accomplished by a controller watching
234-
for this `PodCondition`, though the implementation of such a controller is out
235-
of scope for this proposal.
229+
### Ephemeral Container Lifecycle
236230

237-
### Reattaching and Restarting Ephemeral Containers
231+
Ephemeral containers will stop when their command exits, such as exiting a
232+
shell, and they will not be restarted. Unlike `kubectl exec`, processes in
233+
Ephemeral Containers will not receive an EOF if their connections are
234+
interrupted, so shells won't automatically exit on disconnect.
238235

239-
One can reattach to an Ephemeral Container using `kubectl attach`. When supported
240-
by a runtime, multiple clients can attach to a single debug container and share
241-
the terminal. This is supported by Docker.
236+
There is no API support for killing or restarting an ephemeral container.
237+
The only way to exit the container is to send it an OS signal.
242238

243-
Ephemeral Containers will not be restarted automatically, and there is no
244-
method in the API to restart an Ephemeral Container. Creators of Ephemeral
245-
Containers are expected to choose a new, unused name.
239+
### Removing Ephemeral Containers
240+
241+
Ephemeral containers may not be removed from a Pod once added, but
242+
we've received feedback during the alpha period that users would like
243+
the possibility of removing ephemeral containers (see
244+
[#84764](https://issues.k8s.io/84764)).
245+
246+
Removal is out of scope for the initial graduation of ephemeral containers,
247+
but it may be added by a future KEP.
246248

247249
### Configurable Security Policy
248250

@@ -307,50 +309,6 @@ administrators should ensure that their admission controllers support ephemeral
307309
containers prior to upgrading and provide instructions for how to disable
308310
ephemeral container creation in a cluster.
309311

310-
### Killing Ephemeral Containers
311-
312-
Ephemeral Containers will stop when their command exits, such as exiting a
313-
shell, and they will not be restarted. Unlike `kubectl exec`, processes in
314-
Ephemeral Containers will not receive an EOF if their connections are
315-
interrupted, so shells won't automatically exit on disconnect. Without the
316-
ability to remove an Ephemeral Container via the API, the only way to exit the
317-
container is to send it an OS signal.
318-
319-
Killing an Ephemeral Container is supported by removing it from the list of
320-
Ephemeral Containers in the Pod spec. The kubelet will then kill the container
321-
and cease publishing a `ContainerStatus` for this container.
322-
323-
#### Removing and Re-adding Ephemeral Containers
324-
325-
An edge case worth considering is what happens when a user removes and re-adds
326-
an Ephemeral Container with the same name. This presents a synchronization
327-
problem not present in the immutable container lists, which we resolve by
328-
enforcing the following constraints:
329-
330-
- The client MUST NOT add an Ephemeral Container with the same name as
331-
a container listed in `Pod.Status.EphemeralContainerStatuses`. This is an
332-
error and will be rejected by the API server.
333-
- The kubelet MAY continue publishing an `EphemeralContainerStatus` for
334-
an Ephemeral Container that no longer appears in
335-
`Pod.Spec.EphemeralContainers`.
336-
- The kubelet MUST start a new container for any container in
337-
`Pod.Spec.EphemeralContainers` that does not also appear in
338-
`Pod.Status.EphemeralContainerStatuses`.
339-
340-
In this way the kubelet is able to signal to clients the set of container names
341-
which are unavailable because they correspond to containers still running on
342-
the node. The procedure for replacing a container then becomes:
343-
344-
1. A client removes an Ephemeral Container from `Pod.Spec.EphemeralContainers`.
345-
2. The client waits for the Ephemeral Container to be removed from
346-
`Pod.Status.EphemeralContainerStatuses`.
347-
3. The client adds an Ephemeral Container with the same name to
348-
`Pod.Spec.EphemeralContainers`.
349-
350-
This is not recommended, however, because the kubelet is under no obligation to
351-
remove the Ephemeral Container from `EphemeralContainerStatuses` in a timely
352-
fashion. Clients should choose a new container name instead.
353-
354312
### User Stories
355313

356314
#### Operations
@@ -884,33 +842,28 @@ _This section must be completed when targeting beta graduation to a release._
884842

885843
* **How can an operator determine if the feature is in use by workloads?**
886844

887-
We will create a new gauge metric that's updated during kubelet's reconcile
888-
of `v1.Pod` to track the number of containers scheduled to this node in the API.
889-
This will be slightly different than the existing
890-
`kubelet_running_containers`, which describes the kubelet's representation of
891-
containers, and will be able to label the metrics with fields that are only
892-
available in the API object, such as type of container.
893-
894-
Note that these kubelet metrics are still in alpha.
845+
This information is available by examining pod objects in the API server
846+
for the field `pod.spec.ephemeralContainers`. Additionally, the kubelet surfaces
847+
the following metrics, added in [#99000](https://issues.k8s.io/99000):
895848

896-
This is tracked in [#97974](https://issues.k8s.io/97974).
849+
- `kubelet_managed_ephemeral_containers`: The number of ephemeral containers
850+
in pods managed by this kubelet.
851+
- `kubelet_started_containers_total`: Counter of all containers started by
852+
this kubelet, indexed by `container_type`. Ephemeral containers have a
853+
`container_type` of `ephemeral_container`.
854+
- `kubelet_started_containers_errors_total`: Counter of errors encountered
855+
when this kubelet starts containers, indexed by `container_type`.
856+
Ephemeral containers have a `container_type` of `ephemeral_container`.
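
As a rough illustration of the first check described above, the sketch below scans pod objects for a non-empty `spec.ephemeralContainers`. It assumes the pods are already decoded into plain dictionaries (for example, the `items` of a JSON pod list); the helper name and the sample pods are hypothetical and not part of the KEP.

```python
def pods_with_ephemeral_containers(pods):
    """Return (namespace, name, count) for each pod that lists ephemeral containers."""
    found = []
    for pod in pods:
        # The field is absent (or empty) on pods that never had an ephemeral container.
        containers = pod.get("spec", {}).get("ephemeralContainers") or []
        if containers:
            meta = pod.get("metadata", {})
            found.append((meta.get("namespace"), meta.get("name"), len(containers)))
    return found


# Hypothetical sample data, for illustration only.
sample_pods = [
    {
        "metadata": {"namespace": "default", "name": "web-0"},
        "spec": {
            "containers": [{"name": "app", "image": "nginx"}],
            "ephemeralContainers": [{"name": "debugger", "image": "busybox"}],
        },
    },
    {
        "metadata": {"namespace": "default", "name": "web-1"},
        "spec": {"containers": [{"name": "app", "image": "nginx"}]},
    },
]

in_use = pods_with_ephemeral_containers(sample_pods)
```

Here `in_use` lists only `web-0`, the one sample pod that carries an ephemeral container.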
897857

898858
* **What are the SLIs (Service Level Indicators) an operator can use to determine
899859
the health of the service?**
900860
- [x] Metrics
901-
- Metric name: `apiserver_request_total{component="apiserver",resource="pods",subresource="ephemeralcontainers"}` (apiserver), `kubelet_container_errors_total{type="Ephemeral"}` (kubelet, Proposed)
861+
- Metric name: `apiserver_request_total{component="apiserver",resource="pods",subresource="ephemeralcontainers"}` (apiserver), `kubelet_started_containers_errors_total{container_type="ephemeral_container"}`
902862
- [Optional] Aggregation method: Aggregate by container type
903-
- Components exposing the metric: kubelet
863+
- Components exposing the metric: apiserver, kubelet
904864
- [ ] Other (treat as last resort)
905865
- Details:
906866

907-
Note that the kubelet SLI for this feature is a counter that increments upon
908-
failure to create an ephemeral container. Right now the kubelet only surfaces
909-
runtime-level errors, so I'll propose adding a higher level counter to
910-
encapsulate the entire container creation request, including container type.
911-
912-
This is tracked in [#97974](https://issues.k8s.io/97974).
913-
914867
* **What are the reasonable SLOs (Service Level Objectives) for the above SLIs?**
915868
At a high level, this usually will be in the form of "high percentile of SLI
916869
per day <= X". It's impossible to provide comprehensive guidance, but at the very
@@ -962,11 +915,13 @@ previous answers based on experience in the field._
962915

963916
* **Will enabling / using this feature result in introducing new API types?**
964917

965-
There are no new Kinds for storage, but new types are used in API interactions
966-
and in `v1.Pod`.
918+
There are no new Kinds for storage, but new types are used in `v1.Pod`.
919+
Ephemeral containers are added by writing a `v1.Pod` containing
920+
`pod.spec.ephemeralContainers` to the pod's `/ephemeralcontainers`
921+
subresource, similar to how the kubelet updates `pod.status`.
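
The write described above can be sketched as follows: take an existing pod, append an entry to `spec.ephemeralContainers`, and target the pod's `/ephemeralcontainers` subresource path. This is a minimal sketch that only builds the path and body. The helper name and sample objects are hypothetical, and authentication and the HTTP call itself are omitted.

```python
import copy


def ephemeral_container_request(pod, container):
    """Build the subresource path and Pod body for adding one ephemeral container."""
    meta = pod["metadata"]
    path = "/api/v1/namespaces/{}/pods/{}/ephemeralcontainers".format(
        meta["namespace"], meta["name"]
    )
    body = copy.deepcopy(pod)  # leave the caller's object untouched
    spec = body.setdefault("spec", {})
    # Container names must be unique across all container lists in the pod.
    taken = {
        c["name"]
        for key in ("containers", "initContainers", "ephemeralContainers")
        for c in spec.get(key, [])
    }
    if container["name"] in taken:
        raise ValueError("container name already in use: " + container["name"])
    spec.setdefault("ephemeralContainers", []).append(container)
    return path, body


# Hypothetical pod and debug container, for illustration only.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"namespace": "default", "name": "web-0"},
    "spec": {"containers": [{"name": "app", "image": "nginx"}]},
}
debugger = {
    "name": "debugger",
    "image": "busybox",
    "stdin": True,
    "tty": True,
    "targetContainerName": "app",
}

path, body = ephemeral_container_request(pod, debugger)
```

Because ephemeral containers cannot be removed or restarted once added, the sketch rejects a reused name locally rather than leaving that to the API server.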
967922

968923
- API type:
969-
- v1.EphemeralContainers (used for `/ephemeralcontainers` subresource)
924+
- v1.Pod (with `/ephemeralcontainers` subresource)
970925
- Supported number of objects per cluster: same as Pods
971926
- Supported number of objects per namespace: same as Pods
972927

@@ -980,21 +935,22 @@ the existing API objects?**
980935

981936
- API type(s): v1.Pod
982937
- Estimated increase in size: Additional `Container` for each Ephemeral
983-
container. This is expected to be negligible since these are created by
938+
container. This is expected to be negligible since these are created
984939
manually by humans.
985940
- Estimated amount of new objects: N/A
986941

987942
* **Will enabling / using this feature result in increasing time taken by any
988943
operations covered by [existing SLIs/SLOs]?**
989944

990-
When people add additional containers to a Pod, the pod will have additional
945+
When users add additional containers to a Pod, the pod will have additional
991946
containers to shut down and garbage collect when the Pod exits.
992947

993948
* **Will enabling / using this feature result in non-negligible increase of
994949
resource usage (CPU, RAM, disk, IO, ...) in any components?**
995950

996951
Not automatically. Use of this feature will result in additional containers
997-
running on kubelets.
952+
running on kubelets, but it does not change the amount of resources allocated
953+
to pods.
998954

999955
### Troubleshooting
1000956

@@ -1030,6 +986,11 @@ _This section must be completed when targeting beta graduation to a release._
1030986
- Testing: No, testing for cluster misconfiguration at dev time doesn't
1031987
prevent cluster misconfiguration at run time.
1032988

989+
One may completely disable the feature using the `EphemeralContainers` feature
990+
flag, but it's also possible to prevent the creation of new ephemeral containers
991+
without a restart by removing authorization to the `ephemeralcontainers` subresource
992+
via [RBAC](https://kubernetes.io/docs/reference/access-authn-authz/rbac/).
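
To find which roles would need to change, an administrator could audit RBAC rules for grants on `pods/ephemeralcontainers`. The sketch below is a simplified approximation of RBAC rule matching, not the authorizer's actual logic: it ignores `resourceNames`, bindings, and aggregated roles. The helper name and the sample rules are hypothetical.

```python
WRITE_VERBS = {"*", "update", "patch"}
MATCHING_RESOURCES = {"*", "pods/ephemeralcontainers", "*/ephemeralcontainers"}


def rule_grants_ephemeral_containers(rule):
    """Return True if a single RBAC rule allows writing pods/ephemeralcontainers."""
    groups = rule.get("apiGroups", [])
    resources = rule.get("resources", [])
    verbs = rule.get("verbs", [])
    group_ok = "" in groups or "*" in groups  # pods live in the core ("") API group
    resource_ok = any(r in MATCHING_RESOURCES for r in resources)
    verb_ok = any(v in WRITE_VERBS for v in verbs)
    return group_ok and resource_ok and verb_ok


# Hypothetical rules, for illustration only.
debug_rule = {
    "apiGroups": [""],
    "resources": ["pods/ephemeralcontainers"],
    "verbs": ["patch"],
}
read_only_rule = {
    "apiGroups": [""],
    "resources": ["pods", "pods/log"],
    "verbs": ["get", "list", "watch"],
}
admin_rule = {"apiGroups": ["*"], "resources": ["*"], "verbs": ["*"]}

results = [
    rule_grants_ephemeral_containers(r)
    for r in (debug_rule, read_only_rule, admin_rule)
]
```

Since RBAC is purely additive, "removing authorization" means making sure no rule bound to the user matches, which is why wildcard rules such as `admin_rule` also need review.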
993+
1033994
* **What steps should be taken if SLOs are not being met to determine the problem?**
1034995

1035996
Troubleshoot using apiserver and kubelet error logs.
@@ -1050,6 +1011,9 @@ _This section must be completed when targeting beta graduation to a release._
10501011
- *2020-09-29*: Ported KEP to directory-based template.
10511012
- *2021-01-07*: Updated KEP for beta release in 1.21 and completed PRR section.
10521013
- *2021-04-12*: Switched `/ephemeralcontainers` API to use `Pod`.
1014+
- *2021-05-14*: Added additional graduation criteria.
1015+
- *2021-07-09*: Revert KEP to alpha because of the new API introduced in 1.22.
1016+
- *2021-08-23*: Updated KEP for beta release in 1.23.
10531017

10541018
## Drawbacks
10551019

keps/sig-node/277-ephemeral-containers/kep.yaml

Lines changed: 6 additions & 4 deletions
@@ -9,7 +9,7 @@ participating-sigs:
99
status: implementable
1010
creation-date: 2019-02-12
1111
reviewers:
12-
- "@yujuhong"
12+
- "@dchen1107"
1313
approvers:
1414
- "@dchen1107"
1515
- "@liggitt"
@@ -19,12 +19,12 @@ see-also:
1919
- "/keps/sig-cli/1441-kubectl-debug"
2020

2121
# The target maturity stage in the current dev cycle for this KEP.
22-
stage: alpha
22+
stage: beta
2323

2424
# The most recent milestone for which work toward delivery of this KEP has been
2525
# done. This can be the current (upcoming) milestone, if it is being actively
2626
# worked on.
27-
latest-milestone: "v1.22"
27+
latest-milestone: "v1.23"
2828

2929
# The milestone at which this feature was, or is targeted to be, at each stage.
3030
milestone:
@@ -43,4 +43,6 @@ disable-supported: true
4343

4444
# The following PRR answers are required at beta release
4545
metrics:
46-
- kubelet_container_errors_total
46+
- kubelet_started_containers_total
47+
- kubelet_started_containers_errors_total
48+
- kubelet_managed_ephemeral_containers
