Commit 56745e3

deprecate and remove Kubelet RunOnce mode
1 parent 48d6c58 commit 56745e3

3 files changed, +404 -0 lines changed
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
1+
kep-number: 4580
2+
alpha:
3+
  approver: "@soltysh"
Lines changed: 359 additions & 0 deletions
@@ -0,0 +1,359 @@
1+
<!--
2+
**Note:** When your KEP is complete, all of these comment blocks should be removed.
3+
4+
To get started with this template:
5+
6+
- [ ] **Pick a hosting SIG.**
7+
Make sure that the problem space is something the SIG is interested in taking
8+
up. KEPs should not be checked in without a sponsoring SIG.
9+
- [ ] **Create an issue in kubernetes/enhancements**
10+
When filing an enhancement tracking issue, please make sure to complete all
11+
fields in that template. One of the fields asks for a link to the KEP. You
12+
can leave that blank until this KEP is filed, and then go back to the
13+
enhancement and add the link.
14+
- [ ] **Make a copy of this template directory.**
15+
Copy this template into the owning SIG's directory and name it
16+
`NNNN-short-descriptive-title`, where `NNNN` is the issue number (with no
17+
leading-zero padding) assigned to your enhancement above.
18+
- [ ] **Fill out as much of the kep.yaml file as you can.**
19+
At minimum, you should fill in the "Title", "Authors", "Owning-sig",
20+
"Status", and date-related fields.
21+
- [ ] **Fill out this file as best you can.**
22+
At minimum, you should fill in the "Summary" and "Motivation" sections.
23+
These should be easy if you've preflighted the idea of the KEP with the
24+
appropriate SIG(s).
25+
- [ ] **Create a PR for this KEP.**
26+
Assign it to people in the SIG who are sponsoring this process.
27+
- [ ] **Merge early and iterate.**
28+
Avoid getting hung up on specific details and instead aim to get the goals of
29+
the KEP clarified and merged quickly. The best way to do this is to just
30+
start with the high-level sections and fill out details incrementally in
31+
subsequent PRs.
32+
33+
Just because a KEP is merged does not mean it is complete or approved. Any KEP
34+
marked as `provisional` is a working document and subject to change. You can
35+
denote sections that are under active debate as follows:
36+
37+
```
<<[UNRESOLVED optional short context or usernames ]>>
Stuff that is being argued.
<<[/UNRESOLVED]>>
```
42+
43+
When editing KEPS, aim for tightly-scoped, single-topic PRs to keep discussions
44+
focused. If you disagree with what is already in a document, open a new PR
45+
with suggested changes.
46+
47+
One KEP corresponds to one "feature" or "enhancement" for its whole lifecycle.
48+
You do not need a new KEP to move from beta to GA, for example. If
49+
new details emerge that belong in the KEP, edit the KEP. Once a feature has become
50+
"implemented", major changes should get new KEPs.
51+
52+
The canonical place for the latest set of instructions (and the likely source
53+
of this file) is [here](/keps/NNNN-kep-template/README.md).
54+
55+
**Note:** Any PRs to move a KEP to `implementable`, or significant changes once
56+
it is marked `implementable`, must be approved by each of the KEP approvers.
57+
If none of those approvers are still appropriate, then changes to that list
58+
should be approved by the remaining approvers and/or the owning SIG (or
59+
SIG Architecture for cross-cutting KEPs).
60+
-->
61+
# KEP-4580: Deprecate & remove Kubelet RunOnce mode
62+
63+
<!--
64+
A table of contents is helpful for quickly jumping to sections of a KEP and for
65+
highlighting any additional information provided beyond the standard KEP
66+
template.
67+
68+
Ensure the TOC is wrapped with
69+
<code>&lt;!-- toc --&gt;&lt;!-- /toc --&gt;</code>
70+
tags, and then generate with `hack/update-toc.sh`.
71+
-->
72+
73+
<!-- toc -->
74+
- [Release Signoff Checklist](#release-signoff-checklist)
75+
- [Summary](#summary)
76+
- [Motivation](#motivation)
77+
- [Goals](#goals)
78+
- [Non-Goals](#non-goals)
79+
- [Proposal](#proposal)
80+
- [Risks and Mitigations](#risks-and-mitigations)
81+
- [Design Details](#design-details)
82+
- [KubeletConfiguration Change: KubeletConfiguration](#kubeletconfiguration-change-kubeletconfiguration)
83+
- [kubelet flag Change](#kubelet-flag-change)
84+
- [Implement warning logging for RunOnce mode usage](#implement-warning-logging-for-runonce-mode-usage)
85+
- [Introduction LegacyNodeRunOnceMode feature gate](#introduction-legacynoderunoncemode-feature-gate)
86+
- [Test Plan](#test-plan)
87+
- [Prerequisite testing updates](#prerequisite-testing-updates)
88+
- [Unit tests](#unit-tests)
89+
- [Integration tests](#integration-tests)
90+
- [e2e tests](#e2e-tests)
91+
- [Graduation Criteria](#graduation-criteria)
92+
- [Alpha](#alpha)
93+
- [Beta](#beta)
94+
- [GA](#ga)
95+
- [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
96+
- [Version Skew Strategy](#version-skew-strategy)
97+
- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
98+
- [Feature Enablement and Rollback](#feature-enablement-and-rollback)
99+
- [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning)
100+
- [Monitoring Requirements](#monitoring-requirements)
101+
- [Dependencies](#dependencies)
102+
- [Scalability](#scalability)
103+
- [Troubleshooting](#troubleshooting)
104+
- [Implementation History](#implementation-history)
105+
- [Drawbacks](#drawbacks)
106+
- [Alternatives](#alternatives)
107+
- [Infrastructure Needed (Optional)](#infrastructure-needed-optional)
108+
<!-- /toc -->
109+
110+
## Release Signoff Checklist
111+
112+
<!--
113+
**ACTION REQUIRED:** In order to merge code into a release, there must be an
114+
issue in [kubernetes/enhancements] referencing this KEP and targeting a release
115+
milestone **before the [Enhancement Freeze](https://git.k8s.io/sig-release/releases)
116+
of the targeted release**.
117+
118+
For enhancements that make changes to code or processes/procedures in core
119+
Kubernetes—i.e., [kubernetes/kubernetes], we require the following Release
120+
Signoff checklist to be completed.
121+
122+
Check these off as they are completed for the Release Team to track. These
123+
checklist items _must_ be updated for the enhancement to be released.
124+
-->
125+
126+
Items marked with (R) are required *prior to targeting to a milestone / release*.
127+
128+
- [x] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
129+
- [x] (R) KEP approvers have approved the KEP status as `implementable`
130+
- [x] (R) Design details are appropriately documented
131+
- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
132+
- [ ] e2e Tests for all Beta API Operations (endpoints)
133+
- [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
134+
- [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
135+
- [ ] (R) Graduation criteria is in place
136+
- [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
137+
- [x] (R) Production readiness review completed
138+
- [x] (R) Production readiness review approved
139+
- [ ] "Implementation History" section is up-to-date for milestone
140+
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
141+
- [x] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
142+
143+
<!--
144+
**Note:** This checklist is iterative and should be reviewed and updated every time this enhancement is being considered for a milestone.
145+
-->
146+
147+
[kubernetes.io]: https://kubernetes.io/
148+
[kubernetes/enhancements]: https://git.k8s.io/enhancements
149+
[kubernetes/kubernetes]: https://git.k8s.io/kubernetes
150+
[kubernetes/website]: https://git.k8s.io/website
151+
152+
## Summary
153+
154+
Deprecate and then remove kubelet support for RunOnce mode. The `RunOnce` field in `KubeletConfiguration` and the kubelet `--runonce` flag are first marked as deprecated; the `--runonce` flag is eventually removed.
155+
156+
## Motivation
157+
158+
* RunOnce does not work in systemd mode.
159+
* RunOnce mode doesn't support many newer pod features (for example, init containers).
160+
* RunOnce mode does not follow the pod lifecycle described in the documentation; for example, it does not support volumes.
161+
* RunOnce mode is covered only by a few unit tests, with no e2e or integration tests, so we cannot guarantee that it is usable.
162+
163+
### Goals
164+
165+
* Mark the `RunOnce` field in `KubeletConfiguration` and the kubelet `--runonce` flag as deprecated, and eventually remove the `--runonce` flag.
166+
* Remove kubelet support for RunOnce mode.
167+
168+
### Non-Goals
169+
170+
Immediate removal: the deprecation and removal process will be gradual and guarded by a feature gate, to increase awareness among potential users.
171+
172+
## Proposal
173+
174+
In RunOnce mode, the kubelet process exits after spawning pods from local manifests or a remote URL. This suits scenarios where one-time tasks need to be run on a node. This proposal outlines a plan to deprecate and remove RunOnce mode from the kubelet.
175+
176+
### Risks and Mitigations
177+
178+
Some people may still rely on this feature, but podman addresses the same use case in a better-supported way; see https://docs.podman.io/en/latest/markdown/podman-kube.1.html. Affected users can migrate to the `podman kube` subcommand as needed.
179+
180+
Docker does not officially provide a subcommand similar to podman-kube-play for creating containers from Kubernetes YAML, and there is currently no mature and reliable third-party tool that translates Kubernetes YAML into Docker Compose files. Docker users can still perform this translation manually and run the containers with Docker Compose.
181+
182+
## Design Details
183+
184+
### KubeletConfiguration Change: KubeletConfiguration
185+
186+
Mark the `RunOnce` field as deprecated.
187+
188+
### kubelet flag Change
189+
190+
Mark the `--runonce` flag as deprecated, and remove it in the GA version.
191+
192+
### Implement warning logging for RunOnce mode usage
193+
194+
Starting in 1.31, during kubelet startup, if running in RunOnce mode, the kubelet will log a warning message, for example:
195+
196+
```
klog.Warning("RunOnce mode has been deprecated, and will be removed in a future release")
```
199+
200+
### Introduction LegacyNodeRunOnceMode feature gate
201+
202+
The `LegacyNodeRunOnceMode` feature gate is introduced to guide users through the deprecation of RunOnce mode. Unless this feature gate is enabled, the kubelet will refuse to start when the `--runonce` command line flag is set.
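The startup check described here could look roughly like the following. This is a minimal, self-contained sketch, not the kubelet's actual code: `validateRunOnce`, the `gates` map, and the error message are hypothetical names chosen for illustration; only the gate name `LegacyNodeRunOnceMode` comes from this KEP.

```go
package main

import (
	"errors"
	"fmt"
)

// legacyNodeRunOnceMode is the feature gate name proposed in this KEP.
const legacyNodeRunOnceMode = "LegacyNodeRunOnceMode"

// errRunOnceDisabled is a hypothetical error; the real message may differ.
var errRunOnceDisabled = errors.New(
	"RunOnce mode requires the " + legacyNodeRunOnceMode + " feature gate to be enabled")

// validateRunOnce is a hypothetical helper. It returns an error when RunOnce
// mode is requested while the feature gate is disabled, and nil otherwise.
// The gates map stands in for the kubelet's real feature gate lookup.
func validateRunOnce(runOnce bool, gates map[string]bool) error {
	if !runOnce {
		// RunOnce mode was not requested, so there is nothing to check.
		return nil
	}
	if !gates[legacyNodeRunOnceMode] {
		return errRunOnceDisabled
	}
	return nil
}

func main() {
	// Gate disabled and RunOnce requested: startup should be refused.
	gates := map[string]bool{legacyNodeRunOnceMode: false}
	if err := validateRunOnce(true, gates); err != nil {
		fmt.Println("kubelet refuses to start:", err)
	}
}
```

With the gate enabled, the same call returns `nil` and startup would proceed (with the deprecation warning still logged).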
203+
204+
### Test Plan
205+
206+
[x] I/we understand the owners of the involved components may require updates to
207+
existing tests to make this code solid enough prior to committing the changes necessary
208+
to implement this enhancement.
209+
210+
##### Prerequisite testing updates
211+
212+
##### Unit tests
213+
214+
- N/A
215+
216+
##### Integration tests
217+
218+
- N/A
219+
220+
##### e2e tests
221+
222+
- N/A
223+
224+
### Graduation Criteria
225+
226+
#### Alpha
227+
228+
- The `LegacyNodeRunOnceMode` feature gate is introduced and is enabled by default. Disabling this feature gate makes the kubelet fail on startup when RunOnce mode is enabled.
229+
- Mark the `RunOnce` field in `KubeletConfiguration` as deprecated.
230+
231+
#### Beta
232+
233+
- The `LegacyNodeRunOnceMode` feature gate is disabled by default.
234+
- By default, starting the kubelet in RunOnce mode fails.
235+
236+
#### GA
237+
238+
- The `LegacyNodeRunOnceMode` feature gate is disabled by default and can no longer be enabled.
239+
- Document the `RunOnce` field in `KubeletConfiguration` as 'no longer has any effect', and remove the kubelet's `--runonce` flag.
240+
- Remove kubelet RunOnce mode.
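
The gate defaults across the three stages above can be summarized in code. The following is a sketch under stated assumptions: `gateSpec`, `stageSpecs`, and `effectiveGate` are hypothetical local names for illustration and do not use the real Kubernetes feature gate library; the `Default` and `LockToDefault` fields only model "enabled/disabled by default" and "cannot be enabled" as described in the criteria.

```go
package main

import "fmt"

// gateSpec is a hypothetical, simplified model of a feature gate's settings.
type gateSpec struct {
	Default       bool // value used when the operator sets nothing
	LockToDefault bool // if true, the operator cannot change the value
}

// stageSpecs encodes the LegacyNodeRunOnceMode defaults per graduation stage,
// as listed in the graduation criteria above.
var stageSpecs = map[string]gateSpec{
	"alpha": {Default: true},
	"beta":  {Default: false},
	"ga":    {Default: false, LockToDefault: true},
}

// effectiveGate returns the gate value for a stage. A nil override means the
// operator did not set the gate explicitly.
func effectiveGate(stage string, override *bool) (bool, error) {
	spec, ok := stageSpecs[stage]
	if !ok {
		return false, fmt.Errorf("unknown stage %q", stage)
	}
	if override == nil {
		return spec.Default, nil
	}
	if spec.LockToDefault && *override != spec.Default {
		return false, fmt.Errorf("gate is locked to %t in stage %q", spec.Default, stage)
	}
	return *override, nil
}

func main() {
	enable := true
	for _, stage := range []string{"alpha", "beta", "ga"} {
		def, _ := effectiveGate(stage, nil)
		_, err := effectiveGate(stage, &enable)
		fmt.Printf("%s: default=%t, can be enabled=%t\n", stage, def, err == nil)
	}
}
```

In this model, an operator who still needs RunOnce mode can opt back in during beta, while at GA the attempt to enable the gate is rejected.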
241+
242+
### Upgrade / Downgrade Strategy
243+
244+
### Version Skew Strategy
245+
246+
- N/A
247+
248+
## Production Readiness Review Questionnaire
249+
250+
### Feature Enablement and Rollback
251+
252+
- [x] Feature gate (also fill in values in `kep.yaml`)
253+
- Feature gate name: LegacyNodeRunOnceMode
254+
- Components depending on the feature gate: kubelet
255+
- Will enabling / disabling the feature require downtime of the control
256+
plane? Yes. The feature gate must be set when the kubelet starts, so changing it requires a kubelet restart. Hence, there would be a brief control component downtime on a given node.
257+
- Will enabling / disabling the feature require downtime or reprovisioning
258+
of a node? Yes. See above; disabling would require brief node downtime.
259+
260+
###### Does enabling the feature change any default behavior?
261+
262+
No.
263+
264+
###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
265+
266+
Yes. Using the feature gate is the only way to enable/disable this feature.
267+
268+
###### What happens if we reenable the feature if it was previously rolled back?
269+
270+
Re-enabling the feature will make the RunOnce functionality available in the kubelet.
271+
272+
###### Are there any tests for feature enablement/disablement?
273+
274+
N/A
275+
276+
### Rollout, Upgrade and Rollback Planning
277+
278+
###### How can a rollout or rollback fail? Can it impact already running workloads?
279+
280+
In the alpha stage, this feature gate is enabled by default.
281+
282+
Cluster operators can test the upcoming behavior by disabling the feature gate.
283+
284+
In the beta stage, this feature gate is disabled by default. With the gate disabled, the kubelet will refuse to start if it is still using RunOnce mode.
285+
286+
Cluster operators can reinstate the mode by explicitly enabling the feature gate.
287+
288+
###### What specific metrics should inform a rollback?
289+
290+
N/A
291+
292+
###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
293+
294+
N/A
295+
296+
###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
297+
298+
We will deprecate and remove the `--runonce` flag of kubelet and the `RunOnce` field in `KubeletConfiguration`.
299+
300+
### Monitoring Requirements
301+
302+
* N/A
303+
304+
### Dependencies
305+
306+
###### Does this feature depend on any specific services running in the cluster?
307+
308+
No
309+
310+
### Scalability
311+
312+
###### Will enabling / using this feature result in any new API calls?
313+
314+
No
315+
316+
###### Will enabling / using this feature result in introducing new API types?
317+
318+
No
319+
320+
###### Will enabling / using this feature result in any new calls to the cloud provider?
321+
322+
No
323+
324+
###### Will enabling / using this feature result in increasing size or count of the existing API objects?
325+
326+
No
327+
328+
###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
329+
330+
No
331+
332+
###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?
333+
334+
No
335+
336+
###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?
337+
338+
No
339+
340+
### Troubleshooting
341+
342+
###### How does this feature react if the API server and/or etcd is unavailable?
343+
344+
###### What are other known failure modes?
345+
346+
###### What steps should be taken if SLOs are not being met to determine the problem?
347+
348+
## Implementation History
349+
350+
- 2024-04-17: Initial draft KEP
351+
352+
## Drawbacks
353+
354+
## Alternatives
355+
356+
* Fix RunOnce mode and add e2e tests and integration tests.
357+
* Make RunOnce mode work in systemd mode and support volumes.
358+
359+
## Infrastructure Needed (Optional)
