@@ -492,14 +492,10 @@ https://storage.googleapis.com/k8s-triage/index.html
- `matchLabelKeys`/`mismatchLabelKeys` with the feature gate enabled/disabled.

**Filter**

- - `k8s.io/kubernetes/test/integration/scheduler/filters/filters_test.go` : https://storage.googleapis.com/k8s-triage/index.html?test=TestPodTopologySpreadFilter
+ - [`k8s.io/kubernetes/test/integration/scheduler/filters/filters_test.go`](https://github.com/kubernetes/kubernetes/blob/3a4c35cc89c0ce132f8f5962ce4b9a48fae77873/test/integration/scheduler/filters/filters_test.go#L807-L1069): https://storage.googleapis.com/k8s-triage/index.html?test=TestPodTopologySpreadFilter

**Score**

- - `k8s.io/kubernetes/test/integration/scheduler/scoring/priorities_test.go` : https://storage.googleapis.com/k8s-triage/index.html?test=TestPodTopologySpreadScoring
-
- Also, we should make sure this feature brings no significant performance degradation in both Filter and Score.
-
- - `k8s.io/kubernetes/test/integration/scheduler_perf/scheduler_perf_test.go` : https://storage.googleapis.com/k8s-triage/index.html?test=BenchmarkPerfScheduling
+ - [`k8s.io/kubernetes/test/integration/scheduler/scoring/priorities_test.go`](https://github.com/kubernetes/kubernetes/blob/master/test/integration/scheduler/scoring/priorities_test.go#L383-L643): https://storage.googleapis.com/k8s-triage/index.html?test=TestPodTopologySpreadScoring

##### e2e tests
@@ -738,7 +734,8 @@ You can take a look at one potential example of such test in:
https://github.com/kubernetes/kubernetes/pull/97058/files#diff-7826f7adbc1996a05ab52e3f5f02429e94b68ce6bce0dc534d1be636154fded3R246-R282
-->

- No. But, the tests to confirm the behavior on switching the feature gate will be added.
+ No. But tests to confirm the behavior when switching the feature gate will be added:
+ https://github.com/kubernetes/kubernetes/issues/123156.

### Rollout, Upgrade and Rollback Planning
@@ -758,13 +755,29 @@ rollout. Similarly, consider large clusters and how enablement/disablement
will rollout across nodes.
-->

+ It shouldn't impact already running workloads. It's an opt-in feature:
+ users need to set the `matchLabelKeys` or `mismatchLabelKeys` field in PodAffinity or PodAntiAffinity to use it.
+
+ When this feature is disabled by the feature flag,
+ the `matchLabelKeys`/`mismatchLabelKeys` of already created Pods are kept and their `labelSelector` is not reverted.
+ But the `matchLabelKeys` or `mismatchLabelKeys` field of newly created Pods is ignored and silently dropped.
+
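+ As a rough illustration of the opt-in (the Pod name, the label key `tenant`, its value, and the `app: web` selector
+ below are hypothetical, not taken from this KEP), a user submits a Pod like the following; when the feature gate is
+ disabled, the `matchLabelKeys` field is silently dropped at creation time:
+
+ ```yaml
+ # Illustrative only: a Pod opting in to matchLabelKeys.
+ apiVersion: v1
+ kind: Pod
+ metadata:
+   name: app-pod
+   labels:
+     tenant: tenant-a            # the value matchLabelKeys reads at creation time
+ spec:
+   affinity:
+     podAffinity:
+       requiredDuringSchedulingIgnoredDuringExecution:
+       - topologyKey: kubernetes.io/hostname
+         labelSelector:
+           matchLabels:
+             app: web            # the user's own selector, kept as-is
+         matchLabelKeys:         # opt-in field; ignored and dropped when the gate is disabled
+         - tenant
+   containers:
+   - name: app
+     image: registry.k8s.io/pause:3.9
+ ```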
###### What specific metrics should inform a rollback?

<!--
What signals should users be paying attention to when the feature is young
that might indicate a serious problem?
-->

+ - A spike in the metric `schedule_attempts_total{result="error|unschedulable"}` when Pods using this feature are added.
+
+ The only place a bug in this feature could be introduced is the Pod creation path in kube-apiserver, and such a bug would result in unintended scheduling.
+
+ Also, the scheduler's latency may increase slightly
+ because of the additional label selector computed from `matchLabelKeys`/`mismatchLabelKeys`.
+ But the increase should be tiny: the scheduler itself is not changed at all for this feature,
+ and using `matchLabelKeys`/`mismatchLabelKeys` is equivalent to adding Pods with additional label selectors to the cluster.
+
###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?

<!--
@@ -773,12 +786,32 @@ Longer term, we may want to require automated upgrade/rollback tests, but we
are missing a bunch of machinery and tooling and can't do that now.
-->

+ We'll test it via the following scenario (a sketch of the expected translation follows the list):
+
+ 1. Start kube-apiserver with this feature disabled.
+ 2. Create one Pod with `matchLabelKeys` in PodAffinity.
+ 3. No change should be observed in `labelSelector` in PodAffinity because the feature is disabled.
+ 4. Restart kube-apiserver with this feature enabled. (**First enablement**)
+ 5. No change in the Pod created at step 2.
+ 6. Create one Pod with `matchLabelKeys` in PodAffinity.
+ 7. `labelSelector` in PodAffinity should be changed based on `matchLabelKeys` and the label values in the Pod because the feature is enabled.
+ 8. Restart kube-apiserver with this feature disabled. (**First disablement**)
+ 9. No change in the Pods created before.
+ 10. Create one Pod with `matchLabelKeys` in PodAffinity.
+ 11. No change should be observed in `labelSelector` in PodAffinity because the feature is disabled.
+ 12. Restart kube-apiserver with this feature enabled. (**Second enablement**)
+ 13. No change in the Pods created before.
+ 14. Create one Pod with `matchLabelKeys` in PodAffinity.
+ 15. `labelSelector` in PodAffinity should be changed based on `matchLabelKeys` and the label values in the Pod because the feature is enabled.
+
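+ As a sketch of what steps 7 and 15 expect (assuming the created Pod carries a label such as `tenant: tenant-a`,
+ lists `tenant` in `matchLabelKeys`, and already has an `app: web` selector; these names are hypothetical),
+ the persisted affinity term would look roughly like:
+
+ ```yaml
+ # Illustrative only: the affinity term as stored when the feature gate is enabled.
+ podAffinity:
+   requiredDuringSchedulingIgnoredDuringExecution:
+   - topologyKey: kubernetes.io/hostname
+     matchLabelKeys:             # the field itself is kept on the stored Pod
+     - tenant
+     labelSelector:
+       matchLabels:
+         app: web                # whatever the user originally set is untouched
+       matchExpressions:
+       - key: tenant
+         operator: In
+         values:
+         - tenant-a              # merged in by kube-apiserver from the Pod's own label
+ ```
+
+ In steps 3 and 11 (feature disabled), no such `matchExpressions` entry appears and the `matchLabelKeys` field itself is dropped.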
###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?

<!--
Even if applying deprecation policies, they may still surprise some users.
-->

+ No.
+

### Monitoring Requirements

<!--
@@ -796,6 +829,8 @@ checking if there are objects with field X set) may be a last resort. Avoid
logs or events for this purpose.
-->

+ The operator can query Pods that have the `matchLabelKeys` or `mismatchLabelKeys` field set in PodAffinity or PodAntiAffinity.
+
###### How can someone using this feature know that it is working for their instance?

<!--
@@ -807,13 +842,13 @@ and operation of this feature.
Recall that end users cannot usually observe component logs or access metrics.
-->

- - [ ] Events
-   - Event Reason :
- - [ ] API .status
-   - Condition name :
-   - Other field :
- - [ ] Other (treat as last resort)
+ - [x] Other (treat as last resort)
  - Details :
+ This feature doesn't produce any logs, events, or Pod status updates.
+ But people can determine that it's being evaluated by looking at the `labelSelector` in PodAffinity or PodAntiAffinity
+ in which they set `matchLabelKeys` or `mismatchLabelKeys`.
+ If the `labelSelector` is modified after the Pod's creation, this feature is working correctly (an illustrative example follows).
+
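+ As a hypothetical example of what to look for (the key `tenant` and its value are made up), a Pod created with
+ `mismatchLabelKeys: ["tenant"]` and the label `tenant: tenant-a` should, once persisted, show a merged requirement like:
+
+ ```yaml
+ # Illustrative only: the requirement a user should find on the stored Pod.
+ labelSelector:
+   matchExpressions:
+   - key: tenant
+     operator: NotIn             # mismatchLabelKeys merges a NotIn requirement
+     values:
+     - tenant-a
+ ```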
###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?

@@ -832,18 +867,20 @@ These goals will help you determine what you need to measure (SLIs) in the next
question.
-->

+ Metric `plugin_execution_duration_seconds{plugin="InterPodAffinity"}` <= 100ms at the 90th percentile.
+
+ This feature shouldn't change the latency of the InterPodAffinity plugin at all.
+

###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?

<!--
Pick one more of these and delete the rest.
-->

- - [ ] Metrics
-   - Metric name :
-   - [Optional] Aggregation method :
-   - Components exposing the metric :
- - [ ] Other (treat as last resort)
-   - Details :
+ - [x] Metrics
+   - Metric name : `schedule_attempts_total{result="error|unschedulable"}`
+   - Components exposing the metric : kube-scheduler
###### Are there any missing metrics that would be useful to have to improve observability of this feature?

@@ -852,6 +889,8 @@ Describe the metrics themselves and the reasons why they weren't added (e.g., co
implementation difficulties, etc.).
-->

+ No.
+
### Dependencies

<!--
@@ -875,6 +914,8 @@ and creating new ones, as well as about cluster-level services (e.g. DNS):
- Impact of its degraded performance or high-error rates on the feature :
-->

+ No.
+
### Scalability

<!--
@@ -902,6 +943,8 @@ Focusing mostly on:
heartbeats, leader election, etc.)
-->

+ No.
+
###### Will enabling / using this feature result in introducing new API types?

<!--
@@ -911,6 +954,8 @@ Describe them, providing:
- Supported number of objects per namespace (for namespace-scoped objects)
-->

+ No.
+
###### Will enabling / using this feature result in any new calls to the cloud provider?

<!--
@@ -919,6 +964,8 @@ Describe them, providing:
- Estimated increase :
-->

+ No.
+
###### Will enabling / using this feature result in increasing size or count of the existing API objects?

<!--
@@ -928,6 +975,9 @@ Describe them, providing:
- Estimated amount of new objects : (e.g., new Object X for every existing Pod)
-->

+ No.
+
+
###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?

<!--
@@ -939,6 +989,10 @@ Think about adding additional work or introducing new steps in between
[existing SLIs/SLOs] : https://git.k8s.io/community/sig-scalability/slos/slos.md#kubernetes-slisslos
-->

+ Yes. There is additional work: kube-apiserver uses the keys in `matchLabelKeys` or `mismatchLabelKeys` to look up label values from the Pod,
+ and changes the `labelSelector` accordingly.
+ The impact on the latency of Pod creation requests in kube-apiserver should be negligible.
+
###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?

<!--
@@ -951,6 +1005,35 @@ This through this both in small and large cases, again with respect to the
[supported limits] : https://git.k8s.io/community//sig-scalability/configs-and-limits/thresholds.md
-->

+ No.
+
+
+ ###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?
+
+ <!--
+ Focus not just on happy cases, but primarily on more pathological cases
+ (e.g. probes taking a minute instead of milliseconds, failed pods consuming resources, etc.).
+ If any of the resources can be exhausted, how this is mitigated with the existing limits
+ (e.g. pods per node) or new limits added by this KEP?
+
+ Are there any tests that were run/should be run to understand performance characteristics better
+ and validate the declared limits?
+ -->
+ No.
+
### Troubleshooting

<!--
@@ -966,6 +1049,9 @@ details). For now, we leave it here.

###### How does this feature react if the API server and/or etcd is unavailable?

+ If the API server and/or etcd is not available, this feature will not be available.
+ This is because this feature depends on Pod creation.
+
###### What are other known failure modes?

<!--
@@ -981,8 +1067,17 @@ For each of them, fill in the following information by copying the below templat
- Testing : Are there any tests for failure mode? If not, describe why.
-->

+ N/A.
+
###### What steps should be taken if SLOs are not being met to determine the problem?

+ - Check the metric `schedule_attempts_total{result="error|unschedulable"}` to determine whether the number
+ of such attempts increased.
+ If it increased, determine the cause of the failure from the Pod's events.
+ If it's caused by the `InterPodAffinity` plugin, you can analyze the problem further by looking
+ at the `labelSelector` in PodAffinity/PodAntiAffinity terms that use `matchLabelKeys`/`mismatchLabelKeys`.
+ If the `labelSelector` appears to have been updated incorrectly, the problem lies in this feature.
+
## Implementation History

<!--
@@ -998,6 +1093,7 @@ Major milestones might include:

- 2022-11-09 : Initial KEP PR is submitted.
- 2023-05-14 / 2023-06-08 : PRs to change it from MatchLabelKeys to MatchLabelSelector are submitted. (to satisfy the user story 2)
+ - 2024-01-28 : The PR to update the KEP for beta is submitted.

## Drawbacks