Commit 27ef0d9

Merge pull request kubernetes#4360 from samuelkarp/kep-4205
KEP-4025: several formatting fixes
2 parents f8489a1 + 0260e05 commit 27ef0d9

1 file changed: +31 -28 lines changed

keps/sig-node/4205-psi-metric/README.md

Lines changed: 31 additions & 28 deletions
@@ -121,7 +121,7 @@ PSI metric will be available for users in the Kubernetes metrics API.

 #### Story 2

-Kubernetes users want to prevent new pods to be scheduled on the nodes that have resource starvation. By using PSI metric, the kubelet will set Node Condition to avoid pods being scheduled on nodes under high resource pressure. The node controller could then set a (taint on the node based on these new Node Conditions)[https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/#taint-nodes-by-condition].
+Kubernetes users want to prevent new pods to be scheduled on the nodes that have resource starvation. By using PSI metric, the kubelet will set Node Condition to avoid pods being scheduled on nodes under high resource pressure. The node controller could then set a [taint on the node based on these new Node Conditions](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/#taint-nodes-by-condition).

 ### Risks and Mitigations

@@ -137,20 +137,23 @@ default threshold to be used for reporting the nodes under heavy resource pressu

 #### Phase 1
 1. Add new Data structures PSIData and PSIStats corresponding to the PSI metric output format as following:
+
+```
 some avg10=0.00 avg60=0.00 avg300=0.00 total=0
 full avg10=0.00 avg60=0.00 avg300=0.00 total=0
+```

 ```go
 type PSIData struct {
-    Avg10 *float64 `json:avg10`
-    Avg60 *float64 `json:avg60`
-    Avg300 *float64 `json:avg300`
-    Total *float64 `json:total`
+    Avg10 *float64 `json:"avg10"`
+    Avg60 *float64 `json:"avg60"`
+    Avg300 *float64 `json:"avg300"`
+    Total *float64 `json:"total"`
 }

 type PSIStats struct {
-    Some *PSIData `json:some,omitempty`
-    Full *PSIData `json:full,omitempty`
+    Some *PSIData `json:"some,omitempty"`
+    Full *PSIData `json:"full,omitempty"`
 }
 ```
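Editorial note: the PSI output format quoted in this hunk (`some avg10=... avg60=... avg300=... total=...`) can be turned into a struct with a few lines of Go. The sketch below is only an illustration and is not code from the KEP or from cAdvisor: the `psiLine` type and the `parsePSILine` function are hypothetical names, and plain `float64` fields are used instead of the `*float64` fields shown in the diff to keep the example short.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// psiLine is a hypothetical, simplified stand-in for the KEP's PSIData,
// holding one parsed line of PSI output.
type psiLine struct {
	Kind   string // "some" or "full"
	Avg10  float64
	Avg60  float64
	Avg300 float64
	Total  float64
}

// parsePSILine parses a line such as
// "some avg10=0.00 avg60=0.00 avg300=0.00 total=0".
func parsePSILine(line string) (psiLine, error) {
	fields := strings.Fields(line)
	if len(fields) != 5 {
		return psiLine{}, fmt.Errorf("expected 5 fields, got %d", len(fields))
	}
	out := psiLine{Kind: fields[0]}
	if out.Kind != "some" && out.Kind != "full" {
		return psiLine{}, fmt.Errorf("unknown PSI line kind %q", out.Kind)
	}
	for _, kv := range fields[1:] {
		key, val, ok := strings.Cut(kv, "=")
		if !ok {
			return psiLine{}, fmt.Errorf("malformed key=value pair %q", kv)
		}
		f, err := strconv.ParseFloat(val, 64)
		if err != nil {
			return psiLine{}, fmt.Errorf("parsing %q: %w", kv, err)
		}
		switch key {
		case "avg10":
			out.Avg10 = f
		case "avg60":
			out.Avg60 = f
		case "avg300":
			out.Avg300 = f
		case "total":
			out.Total = f
		default:
			return psiLine{}, fmt.Errorf("unknown key %q", key)
		}
	}
	return out, nil
}

func main() {
	parsed, err := parsePSILine("some avg10=1.50 avg60=0.25 avg300=0.00 total=12345")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", parsed)
}
```

The parser rejects unknown keys and line kinds rather than ignoring them, which is a design choice of this sketch and not something the KEP specifies.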

@@ -161,16 +164,16 @@ metric data will be available through CRI instead.
 ##### CPU
 ```go
 type CPUStats struct {
-    // PSI stats of the overall node
-    PSI cadvisorapi.PSIStats `json:psi,omitempty`
+    // PSI stats of the overall node
+    PSI cadvisorapi.PSIStats `json:"psi,omitempty"`
 }
 ```

 ##### Memory
 ```go
 type MemoryStats struct {
     // PSI stats of the overall node
-    PSI cadvisorapi.PSIStats `json:psi,omitempty`
+    PSI cadvisorapi.PSIStats `json:"psi,omitempty"`
 }
 ```
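Editorial note: most of this commit adds double quotes around struct tag values. The reason this matters can be sketched with the standard `reflect` package: a tag such as `json:psi,omitempty` does not follow the conventional `key:"value"` format, so a lookup of the `json` key finds nothing and `encoding/json`, which reads the tag through `reflect.StructTag`, falls back to the Go field name. The helper name `jsonKey` below is hypothetical and exists only for this illustration.

```go
package main

import (
	"fmt"
	"reflect"
)

// jsonKey reports the value stored under the "json" key of a raw struct tag,
// using the conventional-format parser in reflect.StructTag.
func jsonKey(rawTag string) (string, bool) {
	return reflect.StructTag(rawTag).Lookup("json")
}

func main() {
	tags := []string{
		`json:psi,omitempty`,   // unquoted value, as on the removed lines
		`json:”io,omitempty”`,  // typographic quotes, as on one removed line
		`json:"psi,omitempty"`, // straight double quotes, as on the added lines
	}
	for _, tag := range tags {
		value, found := jsonKey(tag)
		fmt.Printf("tag=%s found=%v value=%q\n", tag, found, value)
	}
}
```

Only the last tag yields a value (`psi,omitempty`); the first two are reported as not found, which is why the pre-fix snippets would not have produced the intended JSON field names if compiled as written.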

@@ -179,23 +182,22 @@ type MemoryStats struct {
 // IOStats contains data about IO usage.
 type IOStats struct {
     // The time at which these stats were updated.
-    Time metav1.Time `json:time`
+    Time metav1.Time `json:"time"`

-    // PSI stats of the overall node
-    PSI cadvisorapi.PSIStats `json:psi,omitempty`
+    // PSI stats of the overall node
+    PSI cadvisorapi.PSIStats `json:"psi,omitempty"`
 }

 type NodeStats struct {
     // Stats about the IO pressure of the node
-    IO *IOStats `json:”io,omitempty”`
-
+    IO *IOStats `json:"io,omitempty"`
 }
 ```

 #### Phase 2 to add PSI based actions.
 **Note:** These actions are tentative, and will depend on different the outcome from testing and discussions with sig-node members, users, and other folks.

-1. Introduce a new kubelet config parameter, pressure threshold to let users specify the pressure percentage beyond which the kubelet would report the node condition to disallow workloads to be scheduled on it.
+1. Introduce a new kubelet config parameter, pressure threshold, to let users specify the pressure percentage beyond which the kubelet would report the node condition to disallow workloads to be scheduled on it.

 2. Add new node conditions corresponding to high PSI (beyond threshold levels) on CPU, Memory and IO.

@@ -205,14 +207,14 @@ type NodeStats struct {
 const (

     // Conditions based on pressure at system level cgroup.
-    NodeSystemCPUContentionPressure NodeConditionType = SystemCPUContentionPressure
-    NodeSystemMemoryContentionPressure NodeConditionType = SystemMemoryContentionPressure
-    NodeSystemDiskContentionPressure NodeConditionType = SystemDiskContentionPressure
+    NodeSystemCPUContentionPressure NodeConditionType = "SystemCPUContentionPressure"
+    NodeSystemMemoryContentionPressure NodeConditionType = "SystemMemoryContentionPressure"
+    NodeSystemDiskContentionPressure NodeConditionType = "SystemDiskContentionPressure"

     // Conditions based on pressure at kubepods level cgroup.
-    NodeKubepodsCPUContentionPressure NodeConditionType = KubepodsCPUContentionPressure
-    NodeKubepodsMemoryContentionPressure NodeConditionType = KubepodsMemoryContentionPressure
-    NodeKubepodsDiskContentionPressure NodeConditionType = KubepodsDiskContentionPressure
+    NodeKubepodsCPUContentionPressure NodeConditionType = "KubepodsCPUContentionPressure"
+    NodeKubepodsMemoryContentionPressure NodeConditionType = "KubepodsMemoryContentionPressure"
+    NodeKubepodsDiskContentionPressure NodeConditionType = "KubepodsDiskContentionPressure"
 )
 ```

@@ -226,13 +228,14 @@ In theory, 10s interval might be rapid to taint a node with NoSchedule effect. T
 * If avg60 < threshold for a node tainted with NoSchedule effect, remove the NodeCondition.

 4. Collaborate with sig-scheduling to modify TaintNodesByCondition feature to integrate new taints for the new Node Conditions introduced in this enhancement.
-node.kubernetes.io/memory-contention-pressure=:NoSchedule
-node.kubernetes.io/cpu-contention-pressure=:NoSchedule
-node.kubernetes.io/disk-contention-pressure=:NoSchedule
+
+* `node.kubernetes.io/memory-contention-pressure=:NoSchedule`
+* `node.kubernetes.io/cpu-contention-pressure=:NoSchedule`
+* `node.kubernetes.io/disk-contention-pressure=:NoSchedule`

 5. Perform experiments to finalize the default optimal pressure threshold value.

-6. Add a new feature gate PSINodeCondition, and guard the node condition related logic behind the feature gate. Set --feature-gates=PSINodeCondition=true to enable the feature.
+6. Add a new feature gate PSINodeCondition, and guard the node condition related logic behind the feature gate. Set `--feature-gates=PSINodeCondition=true` to enable the feature.

 ### Test Plan

@@ -511,7 +514,7 @@ checking if there are objects with field X set) may be a last resort. Avoid
 logs or events for this purpose.
 -->
 For Phase 1:
-Use `kubectl get --raw "/api/v1/nodes/{$nodeName}/proxy/stats/summary"`` to call Summary API. If the PSIStats field is seen in the API response,
+Use `kubectl get --raw "/api/v1/nodes/{$nodeName}/proxy/stats/summary"` to call Summary API. If the PSIStats field is seen in the API response,
 the feature is available to be used by workloads.

 For Phase 2:
@@ -664,4 +667,4 @@ additional dependencies

 ## Infrastructure Needed (Optional)

-No new infrastructure is needed.
+No new infrastructure is needed.
