keps/sig-scheduling/4832-async-preemption/README.md
+8 -6 (8 additions & 6 deletions)
@@ -241,15 +241,15 @@ We'll add test cases that multiple pods are trigger preemption.
**Upgrade**

During the alpha period, users have to enable the feature gate `SchedulerAsyncPreemption` to opt in this feature.
-This is purely internal feature for kube-scheduler, so no other special actions are required outside the scheduler.
+This is purely in-memory feature for kube-scheduler, so no other special actions are required outside the scheduler.

**Downgrade**

Users need to disable the feature gate.

### Version Skew Strategy

-This is purely internal feature for kube-scheduler, and hence no version skew strategy.
+This is purely in-memory feature for kube-scheduler, and hence no version skew strategy.

## Production Readiness Review Questionnaire
@@ -269,7 +269,7 @@ This is purely internal feature for kube-scheduler, and hence no version skew st
###### Does enabling the feature change any default behavior?

-No.
+No. The feature is a performance optimization that affects every Pod that needs preemption, but there are no functional changes: the result of the preemption is the same.
But, like mentioned in [When kube-apiserver is unstable](#when-kube-apiserver-is-unstable), scheduling results could be different.

###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
@@ -284,7 +284,7 @@ The scheduler again starts to run PostFilter asynchronously.
###### Are there any tests for feature enablement/disablement?

-Given it's purely internal feature and enablement/disablement requires restarting the component (to change the value of feature flag),
+Given it's purely in-memory feature and enablement/disablement requires restarting the component (to change the value of feature flag),
having feature tests is enough.

### Rollout, Upgrade and Rollback Planning
@@ -319,6 +319,8 @@ No.
###### How can an operator determine if the feature is in use by workloads?

This feature is used during all Pods' preemption if the feature gate is enabled.
+You can see if the scheduler triggers any preemptions via `preemption_attempts_total` metric.
+
You can find Pods that have triggered the preemption by referring to `.Status.NominatedNodeName`,
and Pods that have been preempted by referring to their condition with `type: DisruptionTarget` and `reason: PreemptionByScheduler`.
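
As a reviewer-side illustration of the two checks described in this hunk, here is a minimal Go sketch. The `Pod` and `PodCondition` types are simplified, hypothetical stand-ins for the real Kubernetes API types (only the fields named in the text are kept), and the helper names `triggeredPreemption` and `wasPreempted` are invented for this example; it is not code from the KEP or from kube-scheduler.

```go
package main

import "fmt"

// PodCondition is a simplified, hypothetical stand-in for the Kubernetes
// Pod condition type; only the fields mentioned in the KEP text are kept.
type PodCondition struct {
	Type   string
	Reason string
}

// Pod is a simplified, hypothetical stand-in for a Kubernetes Pod.
// NominatedNodeName mirrors `.Status.NominatedNodeName`.
type Pod struct {
	Name              string
	NominatedNodeName string
	Conditions        []PodCondition
}

// triggeredPreemption reports whether the Pod has a nominated node, which
// the KEP text uses as the signal that the Pod triggered a preemption.
func triggeredPreemption(p Pod) bool {
	return p.NominatedNodeName != ""
}

// wasPreempted reports whether the Pod carries a condition with
// type DisruptionTarget and reason PreemptionByScheduler.
func wasPreempted(p Pod) bool {
	for _, c := range p.Conditions {
		if c.Type == "DisruptionTarget" && c.Reason == "PreemptionByScheduler" {
			return true
		}
	}
	return false
}

func main() {
	pods := []Pod{
		{Name: "preemptor", NominatedNodeName: "node-a"},
		{Name: "victim", Conditions: []PodCondition{
			{Type: "DisruptionTarget", Reason: "PreemptionByScheduler"},
		}},
		{Name: "bystander"},
	}
	for _, p := range pods {
		fmt.Printf("%s: triggered=%t preempted=%t\n",
			p.Name, triggeredPreemption(p), wasPreempted(p))
	}
}
```

In a real cluster the same checks would run against Pod objects fetched from the API server; the structs above only exist to keep the sketch self-contained.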
@@ -339,8 +341,8 @@ and Pods that have been preempted by referring to their condition with `type: Di
###### Are there any missing metrics that would be useful to have to improve observability of this feature?

-- `goroutines_duration_seconds` (w/ label: `operation`): to observe how many preemption goroutines have failed.
-- `goroutines_execution_total` (w/ labels: `operation`, `result`): to observe how long each preemption goroutine takes to complete.
+- `goroutines_duration_seconds` (w/ label: `operation`): to observe how long each preemption goroutine takes to complete.
+- `goroutines_execution_total` (w/ labels: `operation`, `result`): to observe how many preemption goroutines have failed.
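
To make the corrected pairing concrete (duration answers "how long", the execution counter answers "how many failed"), here is a rough Go sketch of how one wrapper could feed both. Everything in it is an assumption for illustration: the `goroutineMetrics` recorder, the `observe` helper, the `"preemption"` operation value, and the `"success"`/`"error"` result values are hypothetical and do not reflect the scheduler's actual metrics code or label values.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// errSimulated is a hypothetical failure used only by this sketch.
var errSimulated = errors.New("simulated preemption failure")

// executionKey mirrors the `operation` and `result` labels of the
// proposed goroutines_execution_total metric.
type executionKey struct {
	operation string
	result    string
}

// goroutineMetrics is a hypothetical in-process recorder, not the
// scheduler's real metrics implementation.
type goroutineMetrics struct {
	mu         sync.Mutex
	executions map[executionKey]int       // counts, like goroutines_execution_total
	durations  map[string][]time.Duration // samples, like goroutines_duration_seconds
}

func newGoroutineMetrics() *goroutineMetrics {
	return &goroutineMetrics{
		executions: make(map[executionKey]int),
		durations:  make(map[string][]time.Duration),
	}
}

// observe runs fn, then records how long it took (per operation) and
// whether it succeeded or failed (per operation and result).
func (m *goroutineMetrics) observe(operation string, fn func() error) error {
	start := time.Now()
	err := fn()
	elapsed := time.Since(start)

	result := "success" // assumed label value
	if err != nil {
		result = "error" // assumed label value
	}

	m.mu.Lock()
	defer m.mu.Unlock()
	m.executions[executionKey{operation, result}]++
	m.durations[operation] = append(m.durations[operation], elapsed)
	return err
}

func main() {
	m := newGoroutineMetrics()
	_ = m.observe("preemption", func() error { return nil })
	_ = m.observe("preemption", func() error { return errSimulated })

	fmt.Println("failed executions:", m.executions[executionKey{"preemption", "error"}])
	fmt.Println("duration samples:", len(m.durations["preemption"]))
}
```

The point of the sketch is only the division of labor: failures are read from the counter's `result` label, while latency is read from the duration samples.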