@@ -132,7 +132,7 @@ yes, it is enough to restart the node on cgroup v1
###### What happens if we reenable the feature if it was previously rolled back?
- it should work seamlessly without any difference
+ It should work seamlessly without any difference.
###### Are there any tests for feature enablement/disablement?
@@ -149,7 +149,8 @@ cgroup file system, then also the workload must be enabled for cgroup v2.
###### What specific metrics should inform a rollback?
- Pods not being healthy.
+ Pods not being healthy. One could check whether the pods' cgroups are set
+ correctly by referring to the conversion table in this KEP.
###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
@@ -163,8 +164,8 @@ The cgroup file system inside of the containers will use cgroup v2 instead of cg
###### How can an operator determine if the feature is in use by workloads?
- Looking at the node configuration. If the node is using cgroup v2, then also the pods
- running on that node are using it.
+ An operator could run `cat /proc/self/cgroup` on a node to check whether it is running in cgroup v2 mode.
+ If the node is using cgroup v2, the pods running on that node are using it as well.
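
The `cat /proc/self/cgroup` check can also be scripted. The following is a minimal sketch, not part of the KEP: the function names are hypothetical, and it assumes the behavior documented in cgroups(7), where a process on a pure cgroup v2 (unified) node has only entries of the form `0::/<path>`, while cgroup v1 and hybrid setups list entries with non-zero hierarchy IDs and named controllers.

```python
def is_cgroup_v2_only(proc_cgroup_text: str) -> bool:
    """Return True if every entry belongs to the unified (v2) hierarchy.

    Each line of /proc/<pid>/cgroup has the form
    "hierarchy-ID:controller-list:cgroup-path". On cgroup v2 the
    hierarchy ID is 0 and the controller list is empty.
    """
    entries = [line for line in proc_cgroup_text.splitlines() if line.strip()]
    if not entries:
        return False
    for line in entries:
        parts = line.split(":", 2)
        if len(parts) != 3:
            return False  # unexpected format, do not guess
        hierarchy_id, controllers, _path = parts
        if hierarchy_id != "0" or controllers != "":
            return False  # a v1 (or hybrid) hierarchy is present
    return True


def node_uses_cgroup_v2(path: str = "/proc/self/cgroup") -> bool:
    """Hypothetical helper: read the file and apply the check above."""
    with open(path, encoding="utf-8") as f:
        return is_cgroup_v2_only(f.read())
```

Running `node_uses_cgroup_v2()` on the node itself mirrors the manual check; a hybrid node (v1 hierarchies plus a `0::` entry) is deliberately reported as not v2-only.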
###### How can someone using this feature know that it is working for their instance?
@@ -192,7 +193,7 @@ N/A. Same as when running on cgroup v1.
###### Are there any missing metrics that would be useful to have to improve observability of this feature?
- N.
+ No.
### Dependencies
###### What steps should be taken if SLOs are not being met to determine the problem?
- Reboot the node on cgroup v1
+ If SLOs are not being met, reboot the node into cgroup v1 mode to disable this feature.
## Proposal