+- [implement MatchLabelKeys in only either the scheduler plugin or kube-apiserver](#implement-matchlabelkeys-in-only-either-the-scheduler-plugin-or-kube-apiserver)
+- `k8s.io/kubernetes/pkg/scheduler/framework/plugins/podtopologyspread`: `2025-01-14 JST (The commit hash: ccd2b4e8a719dabe8605b1e6b2e74bb5352696e1)` - `87.5%`
+- `k8s.io/kubernetes/pkg/scheduler/framework/plugins/podtopologyspread/plugin.go`: `2025-01-14 JST (The commit hash: ccd2b4e8a719dabe8605b1e6b2e74bb5352696e1)` - `84.8%`
+- `k8s.io/kubernetes/pkg/registry/core/pod/strategy.go`: `2025-01-14 JST (The commit hash: ccd2b4e8a719dabe8605b1e6b2e74bb5352696e1)` - `65%`

 ##### Integration tests

@@ -532,7 +598,9 @@ enhancement:

 In the event of an upgrade, kube-apiserver will start to accept and store the field `MatchLabelKeys`.

-In the event of a downgrade, kube-scheduler will ignore `MatchLabelKeys` even if it was set.
+In the event of a downgrade, kube-apiserver will reject pod creation with `matchLabelKeys` in `TopologySpreadConstraint`.
+However, for existing pods, `matchLabelKeys` and the generated `LabelSelector` are left in place even after the downgrade.
+kube-scheduler will ignore `MatchLabelKeys` if it was set in the cluster-level default constraints configuration.

 ### Version Skew Strategy

@@ -548,7 +616,11 @@ enhancement:
 - Will any other components on the node change? For example, changes to CSI,
 CRI or CNI may require updating that component before the kubelet.
 -->
-N/A
+
+There's no version skew issue.
+
+We changed the implementation design between v1.33 and v1.34, but we designed the change so that it does not introduce any version skew issue,
+as described in [[v1.33] design change and a safe upgrade path](#v133-design-change-and-a-safe-upgrade-path).

 ## Production Readiness Review Questionnaire

@@ -619,8 +691,10 @@ NOTE: Also set `disable-supported` to `true` or `false` in `kep.yaml`.
 The feature can be disabled in Alpha and Beta versions by restarting
 kube-apiserver and kube-scheduler with feature-gate off.
 One caveat is that pods that used the feature will continue to have the
-MatchLabelKeys field set even after disabling the feature gate,
-however kube-scheduler will not take the field into account.
+MatchLabelKeys field and the corresponding LabelSelector set even after
+disabling the feature gate.
+In terms of Stable versions, users can choose to opt out by not setting
+the matchLabelKeys field.

 ###### What happens if we reenable the feature if it was previously rolled back?
 Newly created pods need to follow this policy when scheduling. Old pods will
@@ -659,7 +733,8 @@ feature flags will be enabled on some API servers and not others during the
 rollout. Similarly, consider large clusters and how enablement/disablement
 will rollout across nodes.
 -->
-It won't impact already running workloads because it is an opt-in feature in scheduler.
+It won't impact already running workloads because it is an opt-in feature in kube-apiserver
+and kube-scheduler.
 But during a rolling upgrade, if some apiservers have not enabled the feature, they will not
 be able to accept and store the field "MatchLabelKeys" and the pods associated with these
 apiservers will not be able to use this feature. As a result, pods belonging to the
@@ -765,7 +840,7 @@ Recall that end users cannot usually observe component logs or access metrics.
 -->

 - [x] Other (treat as last resort)
-- Details: We can determine if this feature is being used by checking deployments that have only `MatchLabelKeys` set in `TopologySpreadConstraint` and no `LabelSelector`. These Deployments will strictly adhere to TopologySpread after both deployment and rolling upgrades if the feature is being used.
+- Details: We can determine if this feature is being used by checking pods that have only `MatchLabelKeys` set in `TopologySpreadConstraint`.
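
Editorial illustration (not part of the diff): the check described in the Details item could be scripted roughly as follows. This is a minimal Python sketch, assuming the pods have already been loaded as plain dictionaries, for example from the `items` array of `kubectl get pods -A -o json`. The function name `pods_using_match_label_keys` and the sample pods are hypothetical and do not come from the KEP.

```python
def pods_using_match_label_keys(pods):
    """Return the names of pods that set matchLabelKeys in any
    topologySpreadConstraints entry."""
    names = []
    for pod in pods:
        spec = pod.get("spec") or {}
        constraints = spec.get("topologySpreadConstraints") or []
        # A constraint counts only if matchLabelKeys is present and non-empty.
        if any(c.get("matchLabelKeys") for c in constraints):
            names.append((pod.get("metadata") or {}).get("name", "<unnamed>"))
    return names


# Hypothetical sample data shaped like the `items` of a pod list.
pods = [
    {
        "metadata": {"name": "web-6d4cf56db6-abcde"},
        "spec": {
            "topologySpreadConstraints": [
                {
                    "maxSkew": 1,
                    "topologyKey": "kubernetes.io/hostname",
                    "whenUnsatisfiable": "DoNotSchedule",
                    "matchLabelKeys": ["pod-template-hash"],
                }
            ]
        },
    },
    {"metadata": {"name": "batch-job-xyz"}, "spec": {}},
]

print(pods_using_match_label_keys(pods))
```

Only the first sample pod is reported, because the second has no topology spread constraints at all.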

 ###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?

@@ -896,8 +971,12 @@ Think about adding additional work or introducing new steps in between
-Yes, there is additional work: the scheduler will use the keys in `matchLabelKeys` to look up label values from the pod and AND with `LabelSelector`.
-This may result in a very small impact on scheduling latency, which directly contributes to the pod-startup-latency SLO.
+Yes, there is additional work:
+kube-apiserver uses the keys in `matchLabelKeys` to look up label values from the pod,
+and changes `LabelSelector` according to them.
+kube-scheduler also handles `matchLabelKeys` if the cluster-level default constraints have it.
+The impact on the latency of pod creation requests in kube-apiserver and on the scheduling latency
+should be negligible.

 ###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?

@@ -937,7 +1016,7 @@ details). For now, we leave it here.

 ###### How does this feature react if the API server and/or etcd is unavailable?
 If the API server and/or etcd is not available, this feature will not be available.
-This is because the scheduler needs to update the scheduling results to the pod via the API server/etcd.
+This is because the kube-scheduler needs to update the scheduling results to the pod via the API server/etcd.

 ###### What are other known failure modes?

@@ -963,7 +1042,7 @@ N/A
 - Check the metric `schedule_attempts_total{result="error|unschedulable"}` to determine if the number
 of attempts increased. If increased, you need to determine the cause of the failure by the event of
 the pod. If it's caused by plugin `PodTopologySpread`, you can further analyze this problem by looking
-at the scheduler log.
+at the kube-scheduler log.


 ## Implementation History
@@ -981,6 +1060,7 @@ Major milestones might include:
 - 2022-03-17: Initial KEP
 - 2022-06-08: KEP merged
 - 2023-01-16: Graduate to Beta
+- 2025-01-23: Change the implementation design to be aligned with PodAffinity's `matchLabelKeys`

 ## Drawbacks

@@ -996,11 +1076,28 @@ not need to be as detailed as the proposal, but should include enough
 information to express the idea and why it was not acceptable.
 -->

+### use pod generateName
 Use `pod.generateName` to distinguish new/old pods that belong to the
 revisions of the same workload in scheduler plugin. It's decided not to
 support because of the following reason: scheduler needs to ensure universal
 and scheduler plugin shouldn't have special treatment for any labels/fields.

+### implement MatchLabelKeys in only either the scheduler plugin or kube-apiserver
+Technically, we can implement this feature within the PodTopologySpread plugin only,
+merging the key-value labels corresponding to `MatchLabelKeys` into `LabelSelector` internally
+within the plugin before calculating the scheduling results.
+This is the actual implementation up to 1.32.
+However, it may confuse users because this behavior would be different from PodAffinity's `MatchLabelKeys`.
+
+Also, we cannot implement this feature only within kube-apiserver because it'd make it
+impossible to handle `MatchLabelKeys` within the cluster-level default constraints
+in the scheduler configuration in the future (see https://github.com/kubernetes/kubernetes/issues/129198).
+
+So we decided to go with the design that implements this feature within both
+the PodTopologySpread plugin and kube-apiserver.
+Although the final design has the downside of requiring us to maintain two implementations handling `MatchLabelKeys`,
+each implementation is simple and we regard the risk of increased maintenance overhead as fairly low.
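
Editorial illustration (not part of the diff): the merge step discussed above, where the key-value labels corresponding to `MatchLabelKeys` are folded into `LabelSelector`, can be sketched as below. This is a minimal Python sketch over plain dictionaries, not the actual Go code in kube-apiserver or the PodTopologySpread plugin. It assumes that each key is merged as an `In` match expression with the pod's own value and that keys absent from the pod's labels are skipped. The function name `merge_match_label_keys` and the sample values are hypothetical.

```python
import copy


def merge_match_label_keys(pod_labels, constraint):
    """Return a copy of `constraint` whose labelSelector additionally
    requires the pod's own value for every key in matchLabelKeys."""
    merged = copy.deepcopy(constraint)
    additions = [
        # Assumption: each key becomes a `key In (value)` expression.
        {"key": key, "operator": "In", "values": [pod_labels[key]]}
        for key in merged.get("matchLabelKeys") or []
        # Assumption: keys the pod does not carry are ignored.
        if key in pod_labels
    ]
    if additions:
        selector = merged.get("labelSelector") or {}
        existing = selector.get("matchExpressions") or []
        selector["matchExpressions"] = existing + additions
        merged["labelSelector"] = selector
    return merged


# Hypothetical constraint and pod labels for illustration.
constraint = {
    "maxSkew": 1,
    "topologyKey": "kubernetes.io/hostname",
    "whenUnsatisfiable": "DoNotSchedule",
    "labelSelector": {"matchLabels": {"app": "web"}},
    "matchLabelKeys": ["pod-template-hash"],
}
pod_labels = {"app": "web", "pod-template-hash": "6d4cf56db6"}

merged = merge_match_label_keys(pod_labels, constraint)
print(merged["labelSelector"])
```

The original `matchLabelKeys` list is kept alongside the generated selector, which mirrors the statement earlier in the diff that existing pods retain both fields.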