Commit 7cce4a9

Merge pull request kubernetes#2740 from mrunalp/cgroups_v2_fixes
Cgroups v2: Update latest milestone
2 parents e744ede + cf61166 commit 7cce4a9

3 files changed

+198
-18
lines changed

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+kep-number: 2254
+beta:
+  approver: "@johnbelamaric"

keps/sig-node/2254-cgroup-v2/README.md

Lines changed: 192 additions & 15 deletions
@@ -9,6 +9,18 @@
 - [Non-Goals](#non-goals)
 - [User Stories](#user-stories)
 - [Implementation Details](#implementation-details)
+- [Design](#design)
+- [Test Plan](#test-plan)
+- [Needed Tests](#needed-tests)
+- [Graduation Criteria](#graduation-criteria)
+- [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
+- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
+- [Feature Enablement and Rollback](#feature-enablement-and-rollback)
+- [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning)
+- [Monitoring Requirements](#monitoring-requirements)
+- [Dependencies](#dependencies)
+- [Scalability](#scalability)
+- [Troubleshooting](#troubleshooting)
 - [Proposal](#proposal)
 - [Dependencies on OCI and container runtimes](#dependencies-on-oci-and-container-runtimes)
 - [Current status of dependencies](#current-status-of-dependencies)
@@ -17,7 +29,6 @@
 - [Phase 1: Convert from cgroups v1 settings to v2](#phase-1-convert-from-cgroups-v1-settings-to-v2)
 - [Phase 2: Use cgroups v2 throughout the stack](#phase-2-use-cgroups-v2-throughout-the-stack)
 - [Risk and Mitigations](#risk-and-mitigations)
-- [Graduation Criteria](#graduation-criteria)
 <!-- /toc -->
 
 ## Summary
@@ -52,6 +63,186 @@ This proposal aims to:
 
 ## Implementation Details
 
+## Design
+
+### Test Plan
+
+#### Needed Tests
+
+- Run E2E tests on a cgroup v2 enabled host.
+
+### Graduation Criteria
+
+- Alpha: Phase 1 completed and basic support for running Kubernetes on
+a cgroups v2 host, e2e tests coverage or have a plan for the
+failing tests.
+A good candidate for running cgroup v2 test is Fedora 31 that has
+already switched to default to cgroup v2.
+
+- Beta: e2e tests coverage and performance testing. Verify that both
+the CPU and Memory Manager work.
+
+- GA: Assuming no negative user feedback based on production
+experience, promote after 2 releases in beta.
+*TBD* whether phase 2 must be implemented for GA.
+
+### Upgrade / Downgrade Strategy
+
+<!--
+If applicable, how will the component be upgraded and downgraded? Make sure
+this is in the test plan.
+
+Consider the following in developing an upgrade/downgrade strategy for this
+enhancement:
+- What changes (in invocations, configurations, API use, etc.) is an existing
+cluster required to make on upgrade, in order to maintain previous behavior?
+- What changes (in invocations, configurations, API use, etc.) is an existing
+cluster required to make on upgrade, in order to make use of the enhancement?
+-->
+
+N/A. Not relevant to upgrades. If the host is running with cgroup v2 then
+it will be automatically detected and used.
+
+## Production Readiness Review Questionnaire
+
+### Feature Enablement and Rollback
+
+###### How can this feature be enabled / disabled in a live cluster?
+
+- [ ] Feature gate (also fill in values in `kep.yaml`)
+- Feature gate name:
+- Components depending on the feature gate:
+- [X] Other
+- Describe the mechanism:
+configure the hosts to use cgroup v2
+- Will enabling / disabling the feature require downtime of the control
+plane?
+No, each host can be restarted to cgroup v2 separately
+- Will enabling / disabling the feature require downtime or reprovisioning
+of a node? (Do not assume `Dynamic Kubelet Config` feature is enabled).
+It requires downtime of a node since it needs to be rebooted
+
+###### Does enabling the feature change any default behavior?
+
+N/A. It must work in the same way as on cgroup v1
+
+###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
+
+yes, it is enough to restart the node on cgroup v1
+
+###### What happens if we reenable the feature if it was previously rolled back?
+
+It should work seamlessly without any difference
+
+###### Are there any tests for feature enablement/disablement?
+
+The same E2E tests that work on cgroup v1 should work on cgroup v2
+
+### Rollout, Upgrade and Rollback Planning
+
+N/A. Each node can be configured separately.
+
+###### How can a rollout or rollback fail? Can it impact already running workloads?
+
+N/A. It requires a reboot to be enabled. If the workload accesses directly the
+cgroup file system, then also the workload must be enabled for cgroup v2.
+
+###### What specific metrics should inform a rollback?
+
+Pods not being healthy. One could inspect if the pods are getting the cgroups
+set correctly referencing the conversion table in this KEP.
+
+###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
+
+N/A. It depends on the node configuration and it is stateless.
+
+###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
+
+The cgroup file system inside of the containers will use cgroup v2 instead of cgroup v1.
+
+### Monitoring Requirements
+
+###### How can an operator determine if the feature is in use by workloads?
+
+An operator could run `cat /proc/self/cgroup` on a node to check if it is running in cgroups v2 mode.
+If the node is using cgroup v2, then also the pods running on that node are using it.
+
+###### How can someone using this feature know that it is working for their instance?
+
+
+- [ ] Events
+- Event Reason:
+- [ ] API .status
+- Condition name:
+- Other field:
+- [X] Other (treat as last resort)
+- Details: pods are healthy.
+
+###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?
+
+N/A. Same as when running on cgroup v1.
+
+###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
+
+- [ ] Metrics
+- Metric name:
+- [Optional] Aggregation method:
+- Components exposing the metric:
+- [X] Other (treat as last resort)
+- Details: not a service
+
+###### Are there any missing metrics that would be useful to have to improve observability of this feature?
+
+No
+
+### Dependencies
+
+The container runtime must also support cgroup v2
+
+###### Does this feature depend on any specific services running in the cluster?
+
+No
+
+### Scalability
+
+###### Will enabling / using this feature result in any new API calls?
+
+No
+
+###### Will enabling / using this feature result in introducing new API types?
+
+No
+
+###### Will enabling / using this feature result in any new calls to the cloud provider?
+
+No
+
+###### Will enabling / using this feature result in increasing size or count of the existing API objects?
+
+No
+
+###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
+
+No
+
+###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?
+
+No
+
+### Troubleshooting
+
+###### How does this feature react if the API server and/or etcd is unavailable?
+
+N/A
+
+###### What are other known failure modes?
+
+N/A
+
+###### What steps should be taken if SLOs are not being met to determine the problem?
+
+If SLOs are not being met, reboot the node in cgroup v1 to disable this feature.
+
 ## Proposal
 
 The proposal is to implement cgroups v2 in two different phases.
@@ -201,17 +392,3 @@ Some cgroups v1 features are not available with cgroups v2:
 Some cgroups v1 controllers such as _device_ and _net_cls_,
 _net_prio_ are not available with the new version. The alternative to
 these controllers is to use eBPF.
-
-## Graduation Criteria
-
-- Alpha: Phase 1 completed and basic support for running Kubernetes on
-a cgroups v2 host, e2e tests coverage or have a plan for the
-failing tests.
-A good candidate for running cgroup v2 test is Fedora 31 that has
-already switched to default to cgroup v2.
-
-- Beta: e2e tests coverage and performance testing.
-
-- GA: Assuming no negative user feedback based on production
-experience, promote after 2 releases in beta.
-*TBD* whether phase 2 must be implemented for GA.
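
Reader's note on the Monitoring Requirements answer in the README diff above, which suggests running `cat /proc/self/cgroup` on a node. The following is a minimal sketch of how that output could be interpreted; it is not part of this commit, and the function names `is_cgroup_v2` and `host_uses_cgroup_v2` are hypothetical. It assumes that on a pure cgroup v2 (unified) host the file holds a single entry of the form `0::<path>`, whereas cgroup v1 and hybrid hosts list additional numbered hierarchies.

```python
def is_cgroup_v2(proc_cgroup_text: str) -> bool:
    """Return True if the given /proc/self/cgroup text looks like a pure cgroup v2 host.

    Each entry has the form "hierarchy-ID:controller-list:cgroup-path".
    Assumption: a unified (v2-only) host shows exactly one entry, with
    hierarchy ID 0 and an empty controller list, i.e. "0::<path>".
    """
    entries = [line for line in proc_cgroup_text.splitlines() if line.strip()]
    return len(entries) == 1 and entries[0].startswith("0::")


def host_uses_cgroup_v2(path: str = "/proc/self/cgroup") -> bool:
    """Read the cgroup membership file of the current process and classify it."""
    with open(path, encoding="utf-8") as handle:
        return is_cgroup_v2(handle.read())
```

Calling `host_uses_cgroup_v2()` on a node would then return `True` only when the single `0::` entry is present. A hybrid host, which mounts v1 controllers alongside a v2 hierarchy, is deliberately reported as not v2 in this sketch.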

keps/sig-node/2254-cgroup-v2/kep.yaml

Lines changed: 3 additions & 3 deletions
@@ -20,6 +20,6 @@ status: implementable
 see-also:
 replaces:
 superseded-by:
-
-latest-milestone: "0.0"
-stage: "alpha"
+latest-milestone: "v1.22"
+stage: "beta"
+disable-supported: true
