@@ -528,7 +528,7 @@ This can inform certain test coverage improvements that we want to do before
extending the production code to implement this enhancement.
-->
- - `k8s.io/kubernetes/pkg/scheduler/internal/queue`: `10-01 20:28 JST ` - `88.4 `
+ - `k8s.io/kubernetes/pkg/scheduler/internal/queue`: `2024-09-26 ` - `92.8 `
##### Integration tests
@@ -594,15 +594,16 @@ n/a
- The feature gate is implemented. (disabled by default)
- QueueingHint implementation in all plugins.
- The integration tests are implemented for requeueing scenarios in all plugins.
- - `PreCheck` feature in the scheduling queue is completely removed.
- - No significant degradation in memory comsumption.
- - No performance degradation is confirmed via scheduler_perf.
+ - `PreCheck` feature in the scheduling queue is disabled when SchedulerQueueingHints is enabled.
+ - No significant degradation in memory consumption based on the `scheduler_inflight_events` metric.
+ - scheduler_perf covers the performance of most QueueingHintFn for in-tree plugins.
+ - scheduler_perf runs all test cases with QueueingHint both enabled and disabled, and throughput when enabled is better or at least comparable.
+ - Event handling duration is monitored using scheduler_perf.
- The feature gate is enabled by default.
- - No bug report for a while after enabling it by default.
#### GA
- - No bug report for a while after reaching Beta.
+ - No bug report for a while after reaching Beta and enabling it by default.
### Upgrade / Downgrade Strategy
@@ -777,6 +778,10 @@ that might indicate a serious problem?
Maybe something goes wrong with QueueingHint and Pods are stuck in the queue if
- `scheduler_pending_pods` metric with `queue: unschedulable` label grows and keeps high number abnormally
- `pod_scheduling_sli_duration_seconds` metric grows abnormally
+ The inFlightEvents list is probably not being cleaned up properly if
+ - `scheduler_inflight_events` metric grows abnormally and does not return close to 0 when no scheduling is happening
+ There could be a problem with QueueingHint performance if
+ - `scheduler_queueing_hint_execution_duration_seconds` and `scheduler_event_handling_duration_seconds` metrics are unexpectedly high
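The two checks added above can be sketched as a small function over already-scraped metric values. This is only an illustration: the `Snapshot` and `Thresholds` types, the `Diagnose` function, and every threshold value are hypothetical and do not come from the KEP or from kube-scheduler. How the values are collected is left out, and only the idle-time part of the `scheduler_inflight_events` check is modelled.

```go
package main

import "fmt"

// Snapshot holds hypothetical readings of the metrics named in the KEP text.
// The field names are assumptions made for this sketch.
type Snapshot struct {
	InflightEvents       float64 // current value of scheduler_inflight_events
	SchedulingActive     bool    // true while Pods are still being scheduled
	HintP99Seconds       float64 // p99 of scheduler_queueing_hint_execution_duration_seconds
	EventHandlingP99Secs float64 // p99 of scheduler_event_handling_duration_seconds
}

// Thresholds are illustrative limits; real values depend on the cluster.
type Thresholds struct {
	IdleInflightMax     float64
	HintP99Max          float64
	EventHandlingP99Max float64
}

// Diagnose returns one finding per rollback signal that appears to be firing.
func Diagnose(s Snapshot, th Thresholds) []string {
	var findings []string
	// Signal 1: in-flight events stay well above 0 although nothing is being scheduled.
	if !s.SchedulingActive && s.InflightEvents > th.IdleInflightMax {
		findings = append(findings, "inFlightEvents may not be cleaned up properly")
	}
	// Signal 2: hint execution or event handling latency is unexpectedly high.
	if s.HintP99Seconds > th.HintP99Max || s.EventHandlingP99Secs > th.EventHandlingP99Max {
		findings = append(findings, "QueueingHint performance may be degraded")
	}
	return findings
}

func main() {
	th := Thresholds{IdleInflightMax: 5, HintP99Max: 0.01, EventHandlingP99Max: 0.1}
	s := Snapshot{InflightEvents: 1200, SchedulingActive: false, HintP99Seconds: 0.002, EventHandlingP99Secs: 0.02}
	for _, f := range Diagnose(s, th) {
		fmt.Println(f)
	}
}
```

Keeping the thresholds in a separate struct makes it explicit that "abnormally" and "unexpectedly high" are operator judgments rather than fixed numbers.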
###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
@@ -864,6 +869,9 @@ Pick one more of these and delete the rest.
- `schedule_attempts_total`
- `scheduling_algorithm_duration_seconds`
- `scheduler_pending_pods` with `queue: unschedulable`
+ - `scheduler_inflight_events`
+ - `scheduler_queueing_hint_execution_duration_seconds`
+ - `scheduler_event_handling_duration_seconds`
- Components exposing the metric: kube-scheduler
###### Are there any missing metrics that would be useful to have to improve observability of this feature?
@@ -1063,6 +1071,7 @@ Major milestones might include:
- Oct 01, 2023: The initial KEP is submitted.
- Dec 13, 2023: The feature gate is changed to be disabled by default.
- Dec 31, 2023: The KEP is updated based on the situation as of v1.30 release cycle. The beta/GA criteria is sorted.
+ - Sep 26, 2024: The KEP is updated because QueueingHint is targeted to be enabled by default in the v1.32 release.
## Drawbacks