
Commit 57f8ebd

Merge pull request #4450 from sanposhiho/matchlabelkeys
KEP-3633: graduate matchLabelKeys/mismatchLabelKeys to beta
2 parents a03be2f + 7beaaf9 commit 57f8ebd

3 files changed: +118 −20 lines
Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,5 @@
 kep-number: 3633
 alpha:
   approver: "@wojtek-t"
+beta:
+  approver: "@wojtek-t"

keps/sig-scheduling/3633-matchlabelkeys-to-podaffinity/README.md

Lines changed: 114 additions & 18 deletions
@@ -492,14 +492,10 @@ https://storage.googleapis.com/k8s-triage/index.html
 - `matchLabelKeys`/`mismatchLabelKeys` with the feature gate enabled/disabled.
 
 **Filter**
-- `k8s.io/kubernetes/test/integration/scheduler/filters/filters_test.go`: https://storage.googleapis.com/k8s-triage/index.html?test=TestPodTopologySpreadFilter
+- [`k8s.io/kubernetes/test/integration/scheduler/filters/filters_test.go`](https://github.com/kubernetes/kubernetes/blob/3a4c35cc89c0ce132f8f5962ce4b9a48fae77873/test/integration/scheduler/filters/filters_test.go#L807-L1069): https://storage.googleapis.com/k8s-triage/index.html?test=TestPodTopologySpreadFilter
 
 **Score**
-- `k8s.io/kubernetes/test/integration/scheduler/scoring/priorities_test.go`: https://storage.googleapis.com/k8s-triage/index.html?test=TestPodTopologySpreadScoring
-
-Also, we should make sure this feature brings no significant performance degradation in both Filter and Score.
-
-- `k8s.io/kubernetes/test/integration/scheduler_perf/scheduler_perf_test.go`: https://storage.googleapis.com/k8s-triage/index.html?test=BenchmarkPerfScheduling
+- [`k8s.io/kubernetes/test/integration/scheduler/scoring/priorities_test.go`](https://github.com/kubernetes/kubernetes/blob/master/test/integration/scheduler/scoring/priorities_test.go#L383-L643): https://storage.googleapis.com/k8s-triage/index.html?test=TestPodTopologySpreadScoring
 
 ##### e2e tests
 
@@ -738,7 +734,8 @@ You can take a look at one potential example of such test in:
 https://github.com/kubernetes/kubernetes/pull/97058/files#diff-7826f7adbc1996a05ab52e3f5f02429e94b68ce6bce0dc534d1be636154fded3R246-R282
 -->
 
-No. But, the tests to confirm the behavior on switching the feature gate will be added.
+No. But tests to confirm the behavior when switching the feature gate will be added;
+see https://github.com/kubernetes/kubernetes/issues/123156.
 
 ### Rollout, Upgrade and Rollback Planning
 
@@ -758,13 +755,29 @@ rollout. Similarly, consider large clusters and how enablement/disablement
 will rollout across nodes.
 -->
 
+It shouldn't impact already running workloads. It's an opt-in feature:
+users need to set the `matchLabelKeys` or `mismatchLabelKeys` field in PodAffinity or PodAntiAffinity to use it.
+
+When this feature is disabled by the feature flag,
+an already created Pod's `matchLabelKeys`/`mismatchLabelKeys` is kept and its `labelSelector` is not modified back.
+But a newly created Pod's `matchLabelKeys` or `mismatchLabelKeys` field is ignored and silently dropped.
+
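For illustration, here is a minimal sketch of the admission-time rewrite described above (pod name, image, and label values are hypothetical; assumes a cluster with the `MatchLabelKeysInPodAffinity` feature gate enabled):

```bash
# Create a Pod that uses matchLabelKeys (hypothetical names/values).
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: demo
  labels:
    tenant: team-a
spec:
  containers:
  - name: pause
    image: registry.k8s.io/pause:3.9
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - topologyKey: kubernetes.io/hostname
        labelSelector:
          matchLabels:
            app: demo
        matchLabelKeys:
        - tenant
EOF

# With the gate enabled, kube-apiserver merges the pod's own `tenant` value
# into the persisted labelSelector as an extra In requirement, roughly:
#   matchExpressions:
#   - {key: tenant, operator: In, values: ["team-a"]}
kubectl get pod demo -o jsonpath='{.spec.affinity.podAffinity.requiredDuringSchedulingIgnoredDuringExecution[0].labelSelector}'
```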
 ###### What specific metrics should inform a rollback?
 
 <!--
 What signals should users be paying attention to when the feature is young
 that might indicate a serious problem?
 -->
 
+- A spike in the metric `schedule_attempts_total{result="error|unschedulable"}` when pods using this feature are added.
+
+The only place a bug could hide is the Pod creation process in kube-apiserver, and it would result in unintended scheduling.
+
+Also, the scheduler's latency may increase
+because of the additional calculation for the label selector built from `matchLabelKeys`/`mismatchLabelKeys`.
+But the increase should be tiny because the scheduler itself doesn't change for this feature;
+using `matchLabelKeys`/`mismatchLabelKeys` is equivalent to adding some Pods with additional label selectors to the cluster.
+
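As a hedged sketch, one way to eyeball this counter directly (assumes a kube-scheduler pod named `kube-scheduler-node1` in `kube-system` whose secure metrics port is reachable through the apiserver pod proxy and that RBAC permits it; most setups would instead watch this metric in Prometheus):

```bash
# Dump the scheduler's schedule_attempts_total series and look for a growing
# count of error/unschedulable results after pods using the feature appear.
kubectl get --raw \
  "/api/v1/namespaces/kube-system/pods/https:kube-scheduler-node1:10259/proxy/metrics" \
  | grep '^schedule_attempts_total'
```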
 ###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
 
 <!--
@@ -773,12 +786,32 @@ Longer term, we may want to require automated upgrade/rollback tests, but we
 are missing a bunch of machinery and tooling and can't do that now.
 -->
 
+We'll test it via this scenario (a sketch of the check used in each step follows the list):
+
+1. Start kube-apiserver with this feature disabled.
+2. Create one Pod with `matchLabelKeys` in PodAffinity.
+3. No change should be observed in `labelSelector` in PodAffinity because the feature is disabled.
+4. Restart kube-apiserver with this feature enabled. (**First enablement**)
+5. No change to the Pod created in step 2.
+6. Create one Pod with `matchLabelKeys` in PodAffinity.
+7. `labelSelector` in PodAffinity should be changed based on `matchLabelKeys` and the label values on the Pod because the feature is enabled.
+8. Restart kube-apiserver with this feature disabled. (**First disablement**)
+9. No change to Pods created before.
+10. Create one Pod with `matchLabelKeys` in PodAffinity.
+11. No change should be observed in `labelSelector` in PodAffinity because the feature is disabled.
+12. Restart kube-apiserver with this feature enabled. (**Second enablement**)
+13. No change to Pods created before.
+14. Create one Pod with `matchLabelKeys` in PodAffinity.
+15. `labelSelector` in PodAffinity should be changed based on `matchLabelKeys` and the label values on the Pod because the feature is enabled.
+
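A minimal sketch of that per-step check (hypothetical pod name; the same inspection serves steps 3, 7, 11, and 15):

```bash
# With the gate disabled, matchLabelKeys is silently dropped on creation and
# labelSelector is exactly what was submitted; with it enabled, the field is
# persisted and labelSelector carries the extra generated requirement.
kubectl get pod demo \
  -o jsonpath='{.spec.affinity.podAffinity.requiredDuringSchedulingIgnoredDuringExecution[0]}'
```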
 ###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
 
 <!--
 Even if applying deprecation policies, they may still surprise some users.
 -->
 
+No.
+
 ### Monitoring Requirements
 
 <!--
@@ -796,6 +829,8 @@ checking if there are objects with field X set) may be a last resort. Avoid
 logs or events for this purpose.
 -->
 
+The operator can query for pods that have the `matchLabelKeys` or `mismatchLabelKeys` field set in PodAffinity or PodAntiAffinity, as in the sketch below.
+
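A hedged sketch of such a query (assumes `jq` is available; it matches the field names defined by this KEP anywhere inside the pods' affinity terms):

```bash
# Print namespace/name of every pod that sets matchLabelKeys or
# mismatchLabelKeys in any of its (anti-)affinity terms.
kubectl get pods --all-namespaces -o json \
  | jq -r '.items[]
      | select([.. | objects | select(has("matchLabelKeys") or has("mismatchLabelKeys"))] | length > 0)
      | "\(.metadata.namespace)/\(.metadata.name)"'
```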
 ###### How can someone using this feature know that it is working for their instance?
 
 <!--
@@ -807,13 +842,13 @@ and operation of this feature.
 Recall that end users cannot usually observe component logs or access metrics.
 -->
 
-- [ ] Events
-  - Event Reason:
-- [ ] API .status
-  - Condition name:
-  - Other field:
-- [ ] Other (treat as last resort)
+- [x] Other (treat as last resort)
   - Details:
+    This feature doesn't produce any logs, events, or pod status updates.
+    But people can determine that it's being evaluated by looking at the `labelSelector` in the PodAffinity or PodAntiAffinity
+    in which they set `matchLabelKeys` or `mismatchLabelKeys`.
+    If the `labelSelector` is modified after the Pod's creation, the feature is working correctly (see the sketch below).
+
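For instance, a minimal sketch of that check for `mismatchLabelKeys` (hypothetical pod name; per this KEP, `mismatchLabelKeys` entries are merged as `NotIn` requirements and `matchLabelKeys` entries as `In`):

```bash
# Inspect the persisted selector of a pod created with
# mismatchLabelKeys: ["tenant"] and its own label tenant=team-a.
# Expected shape: [{"key":"tenant","operator":"NotIn","values":["team-a"]}]
kubectl get pod demo-mismatch \
  -o jsonpath='{.spec.affinity.podAntiAffinity.requiredDuringSchedulingIgnoredDuringExecution[0].labelSelector.matchExpressions}'
```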
817852

818853
###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?
819854

@@ -832,18 +867,20 @@ These goals will help you determine what you need to measure (SLIs) in the next
 question.
 -->
 
+Metric `plugin_execution_duration_seconds{plugin="InterPodAffinity"}` <= 100ms at the 90th percentile.
+
+This feature shouldn't change the latency of the InterPodAffinity plugin at all.
+
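As a hedged sketch, the objective could be checked with PromQL (assumes a Prometheus server scraping kube-scheduler; the server URL is hypothetical, and the exposed metric name carries the `scheduler_` subsystem prefix):

```bash
# 90th-percentile InterPodAffinity plugin execution time over 5m windows;
# the objective is that this stays at or below 0.1 seconds.
promtool query instant http://prometheus.example:9090 \
  'histogram_quantile(0.90, sum by (le) (rate(scheduler_plugin_execution_duration_seconds_bucket{plugin="InterPodAffinity"}[5m])))'
```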
 ###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
 
 <!--
 Pick one more of these and delete the rest.
 -->
 
-- [ ] Metrics
-  - Metric name:
-  - [Optional] Aggregation method:
-  - Components exposing the metric:
-- [ ] Other (treat as last resort)
-  - Details:
+- [x] Metrics
+  - Metric name: `schedule_attempts_total{result="error|unschedulable"}`
+  - Components exposing the metric: kube-scheduler
 
 ###### Are there any missing metrics that would be useful to have to improve observability of this feature?
 
@@ -852,6 +889,8 @@ Describe the metrics themselves and the reasons why they weren't added (e.g., co
 implementation difficulties, etc.).
 -->
 
+No.
+
 ### Dependencies
 
 <!--
@@ -875,6 +914,8 @@ and creating new ones, as well as about cluster-level services (e.g. DNS):
 - Impact of its degraded performance or high-error rates on the feature:
 -->
 
+No.
+
 ### Scalability
 
 <!--
@@ -902,6 +943,8 @@ Focusing mostly on:
 heartbeats, leader election, etc.)
 -->
 
+No.
+
 ###### Will enabling / using this feature result in introducing new API types?
 
 <!--
@@ -911,6 +954,8 @@ Describe them, providing:
 - Supported number of objects per namespace (for namespace-scoped objects)
 -->
 
+No.
+
 ###### Will enabling / using this feature result in any new calls to the cloud provider?
 
 <!--
@@ -919,6 +964,8 @@ Describe them, providing:
 - Estimated increase:
 -->
 
+No.
+
 ###### Will enabling / using this feature result in increasing size or count of the existing API objects?
 
 <!--
@@ -928,6 +975,9 @@ Describe them, providing:
 - Estimated amount of new objects: (e.g., new Object X for every existing Pod)
 -->
 
+No.
+
 ###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
 
 <!--
<!--
@@ -939,6 +989,10 @@ Think about adding additional work or introducing new steps in between
 [existing SLIs/SLOs]: https://git.k8s.io/community/sig-scalability/slos/slos.md#kubernetes-slisslos
 -->
 
+Yes, there is additional work: kube-apiserver uses the keys in `matchLabelKeys` or `mismatchLabelKeys` to look up label values on the pod,
+and changes the `labelSelector` according to them.
+The impact on the latency of pod creation requests in kube-apiserver should be negligible.
+
 ###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?
 
 <!--
<!--
@@ -951,6 +1005,35 @@ Think through this both in small and large cases, again with respect to the
 [supported limits]: https://git.k8s.io/community//sig-scalability/configs-and-limits/thresholds.md
 -->
 
+No.
+
+###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?
+
+<!--
+Focus not just on happy cases, but primarily on more pathological cases
+(e.g. probes taking a minute instead of milliseconds, failed pods consuming resources, etc.).
+If any of the resources can be exhausted, how this is mitigated with the existing limits
+(e.g. pods per node) or new limits added by this KEP?
+
+Are there any tests that were run/should be run to understand performance characteristics better
+and validate the declared limits?
+-->
+
+No.
+
 ### Troubleshooting
 
 <!--
@@ -966,6 +1049,9 @@ details). For now, we leave it here.
 
 ###### How does this feature react if the API server and/or etcd is unavailable?
 
+If the API server and/or etcd is not available, this feature will not be available,
+because this feature depends on Pod creation.
+
 ###### What are other known failure modes?
 
 <!--
@@ -981,8 +1067,17 @@ For each of them, fill in the following information by copying the below templat
 - Testing: Are there any tests for failure mode? If not, describe why.
 -->
 
+N/A.
+
 ###### What steps should be taken if SLOs are not being met to determine the problem?
 
+- Check the metric `schedule_attempts_total{result="error|unschedulable"}` to determine whether the number
+  of failed attempts increased.
+  If it increased, you need to determine the cause of the failure from the Pod's events.
+  If it's caused by the plugin `InterPodAffinity`, you can analyze the problem further by looking
+  at the `labelSelector` in the PodAffinity/PodAntiAffinity with `matchLabelKeys`/`mismatchLabelKeys`.
+  If the `labelSelector` appears to have been updated incorrectly, the problem is in this feature.
+
 ## Implementation History
 
 <!--
@@ -998,6 +1093,7 @@ Major milestones might include:
 
 - 2022-11-09: Initial KEP PR is submitted.
 - 2023-05-14 / 2023-06-08: PRs to change it from MatchLabelKeys to MatchLabelSelector are submitted. (to satisfy user story 2)
+- 2024-01-28: The PR to update the KEP for beta is submitted.
 
 ## Drawbacks
 
keps/sig-scheduling/3633-matchlabelkeys-to-podaffinity/kep.yaml

Lines changed: 2 additions & 2 deletions
@@ -13,9 +13,9 @@ approvers:
 see-also:
   - "/keps/sig-scheduling/3243-respect-pod-topology-spread-after-rolling-upgrades"
 
-stage: alpha
+stage: beta
 
-latest-milestone: "v1.29"
+latest-milestone: "v1.30"
 
 milestone:
   alpha: "v1.29"