Commit 4906e22

fix2
1 parent c6e0a76 commit 4906e22

1 file changed: +32 -26 lines changed

keps/sig-scheduling/3243-respect-pod-topology-spread-after-rolling-upgrades/README.md

Lines changed: 32 additions & 26 deletions
@@ -88,6 +88,7 @@ tags, and then generate with `hack/update-toc.sh`.
 - [Notes/Constraints/Caveats (Optional)](#notesconstraintscaveats-optional)
 - [Risks and Mitigations](#risks-and-mitigations)
 - [Design Details](#design-details)
+- [[v1.33] design change and a safe upgrade path](#v133-design-change-and-a-safe-upgrade-path)
 - [Test Plan](#test-plan)
 - [Prerequisite testing updates](#prerequisite-testing-updates)
 - [Unit tests](#unit-tests)
@@ -110,7 +111,7 @@ tags, and then generate with `hack/update-toc.sh`.
 - [Drawbacks](#drawbacks)
 - [Alternatives](#alternatives)
 - [use pod generateName](#use-pod-generatename)
-- [remove MatchLabelKeys implementation from the scheduler plugin](#remove-matchlabelkeys-implementation-from-the-scheduler-plugin)
+- [implement MatchLabelKeys in only either the scheduler plugin or kube-apiserver](#implement-matchlabelkeys-in-only-either-the-scheduler-plugin-or-kube-apiserver)
 - [Infrastructure Needed (Optional)](#infrastructure-needed-optional)
 <!-- /toc -->

@@ -183,11 +184,9 @@ know the exact label key and value when defining the pod spec.
 This KEP proposes a complementary field to LabelSelector named `MatchLabelKeys` in
 `TopologySpreadConstraint` which represent a set of label keys only.
 kube-apiserver will use those keys to look up label values from the incoming pod
-and those labels are merged to `LabelSelector`.
-kube-scheduler will also look up the label values from the pod and check if those
-labels are included in `LabelSelector`. If not, kube-scheduler will take those labels
-and AND with `LabelSelector` to identify the group of existing pods over which the
-spreading skew will be calculated.
+and those key-value labels are ANDed with `LabelSelector` to identify the group of existing pods over
+which the spreading skew will be calculated.
+kube-scheduler will also handle `MatchLabelKeys` if the cluster-level default constraints include a constraint with `MatchLabelKeys`.

 The main case that this new way for identifying pods will enable is constraining
 skew spreading calculation to happen at the revision level in Deployments during
@@ -334,7 +333,7 @@ type TopologySpreadConstraint struct {
 ```

 When a Pod is created, kube-apiserver will obtain the labels from the pod
-by the keys in `matchLabelKeys` and `key in (value)` is merged to `LabelSelector`
+by the keys in `matchLabelKeys` and the key-value labels are merged into `LabelSelector`
 of `TopologySpreadConstraint`.

 For example, when this sample Pod is created,
@@ -373,15 +372,23 @@ kube-apiserver modifies the `labelSelector` like the following:
 - app
 ```

-kube-scheduler will also be aware of `matchLabelKeys` and gracefully handle the same labels.
-This is for the Cluster-level default constraints by
-`matchLabelKeys: ["pod-template-hash"]`.([#129198](https://github.com/kubernetes/kubernetes/issues/129198))
+In addition, kube-scheduler handles `matchLabelKeys` if the cluster-level default constraints are configured with `matchLabelKeys`.

 Finally, the feature will be guarded by a new feature flag. If the feature is
-disabled, the field `matchLabelKeys` and corresponding`labelSelector` are preserved
+disabled, the field `matchLabelKeys` and corresponding `labelSelector` are preserved
 if it was already set in the persisted Pod object, otherwise new Pod with the field
-creation will be rejected by kube-apiserver; moreover, kube-scheduler will ignore the
-field and continue to behave as before.
+creation will be rejected by kube-apiserver.
+Also, kube-scheduler will ignore `matchLabelKeys` in the cluster-level default constraints configuration.
+
+### [v1.33] design change and a safe upgrade path
+kube-apiserver will only merge key-value labels corresponding to `matchLabelKeys` into `labelSelector`
+at pod creation, and it will not affect existing pods.
+So if there are unscheduled pods with `matchLabelKeys` that were already created when the cluster is upgraded,
+they may not be scheduled correctly after the upgrade.
+For a safe upgrade path from v1.32 to v1.33, kube-scheduler will handle not only `matchLabelKeys`
+from the default constraints, but also that of all incoming pods during v1.33.
+We'll change kube-scheduler to handle `matchLabelKeys` only from the default constraints in v1.34 for efficiency,
+assuming `matchLabelKeys` of all incoming pods are handled by kube-apiserver.

 ### Test Plan

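As an illustration of the kube-apiserver merge described in the hunk above, here is a minimal Go sketch. It is not the actual kube-apiserver code: `Requirement`, `LabelSelector`, `Constraint`, and `mergeMatchLabelKeys` are hypothetical, simplified stand-ins for the real API types, and skipping keys that the pod does not carry is an assumption of this sketch.

```go
package main

// Requirement is a simplified, hypothetical stand-in for a label selector
// requirement. Only the "In" operator is modeled here.
type Requirement struct {
	Key      string
	Operator string
	Values   []string
}

// LabelSelector is a simplified, hypothetical stand-in for the API selector.
type LabelSelector struct {
	MatchLabels      map[string]string
	MatchExpressions []Requirement
}

// Constraint is a simplified, hypothetical stand-in for TopologySpreadConstraint.
type Constraint struct {
	TopologyKey    string
	LabelSelector  *LabelSelector
	MatchLabelKeys []string
}

// mergeMatchLabelKeys appends one `key In (value)` requirement for every key
// in MatchLabelKeys that is present in podLabels.
// Assumption: keys missing from the pod's labels are skipped.
func mergeMatchLabelKeys(podLabels map[string]string, c *Constraint) {
	if len(c.MatchLabelKeys) == 0 {
		return
	}
	if c.LabelSelector == nil {
		c.LabelSelector = &LabelSelector{}
	}
	for _, key := range c.MatchLabelKeys {
		value, ok := podLabels[key]
		if !ok {
			continue
		}
		c.LabelSelector.MatchExpressions = append(c.LabelSelector.MatchExpressions,
			Requirement{Key: key, Operator: "In", Values: []string{value}})
	}
}

// Matches reports whether labels satisfy every term of the selector.
// All terms are ANDed together.
func (s *LabelSelector) Matches(labels map[string]string) bool {
	for k, want := range s.MatchLabels {
		if got, ok := labels[k]; !ok || got != want {
			return false
		}
	}
	for _, r := range s.MatchExpressions {
		got, ok := labels[r.Key]
		if !ok {
			return false
		}
		found := false
		for _, v := range r.Values {
			if v == got {
				found = true
				break
			}
		}
		if !found {
			return false
		}
	}
	return true
}
```

With a pod labeled `app=foo, pod-template-hash=abc` and `matchLabelKeys: ["pod-template-hash"]`, calling `mergeMatchLabelKeys` leaves a selector that matches only pods of the same revision, which is the group over which the skew is calculated.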
@@ -563,7 +570,7 @@ In the event of an upgrade, kube-apiserver will start to accept and store the fi

 In the event of a downgrade, kube-apiserver will reject pod creation with `matchLabelKeys` in `TopologySpreadConstraint`.
 But, regarding existing pods, we leave `matchLabelKeys` and generated `LabelSelector` even after downgraded.
-kube-scheduler will ignore `MatchLabelKeys` even if it was set.
+kube-scheduler will ignore `MatchLabelKeys` if it was set in the cluster-level default constraints configuration.

 ### Version Skew Strategy

@@ -651,8 +658,7 @@ The feature can be disabled in Alpha and Beta versions by restarting
 kube-apiserver and kube-scheduler with feature-gate off.
 One caveat is that pods that used the feature will continue to have the
 MatchLabelKeys field set and the corresponding LabelSelector even after
-disabling the feature gate, however kube-scheduler will not take the MatchLabelKeys
-field into account.
+disabling the feature gate.
 In terms of Stable versions, users can choose to opt-out by not setting
 the matchLabelKeys field.

@@ -934,9 +940,7 @@ Think about adding additional work or introducing new steps in between
 Yes. there is an additional work:
 kube-apiserver uses the keys in `matchLabelKeys` to look up label values from the pod,
 and change `LabelSelector` according to them.
-kube-scheduler also looks up the label values from the pod and checks if those labels
-are included in `LabelSelector`. If not, kube-scheduler will take those labels and AND
-with `LabelSelector`.
+kube-scheduler also handles `matchLabelKeys` if the cluster-level default constraints have it.
 The impact in the latency of pod creation request in kube-apiserver and kube-scheduler
 should be negligible.

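The scheduler-side handling of cluster-level default constraints mentioned in the hunk above could look roughly like the following Go sketch. It is only an illustration under stated assumptions: `Pair`, `DefaultConstraint`, `effectiveRequirements`, and `matchesAll` are hypothetical names, the base selector is assumed to be given, and the `featureEnabled` flag stands in for the feature gate. When the flag is off, `matchLabelKeys` in the default constraints is ignored, as the KEP text states.

```go
package main

// Pair is a hypothetical equality requirement: the label Key must equal Value.
type Pair struct {
	Key   string
	Value string
}

// DefaultConstraint is a simplified, hypothetical stand-in for a cluster-level
// default constraint taken from the scheduler configuration.
type DefaultConstraint struct {
	TopologyKey    string
	MatchLabelKeys []string
}

// effectiveRequirements returns the requirements used to select the pods over
// which skew is calculated. The base requirements are copied, so neither the
// caller's slice nor the pod object is modified.
func effectiveRequirements(base []Pair, podLabels map[string]string, c DefaultConstraint, featureEnabled bool) []Pair {
	out := append([]Pair(nil), base...)
	if !featureEnabled {
		// Feature gate off: matchLabelKeys in the default constraints is ignored.
		return out
	}
	for _, key := range c.MatchLabelKeys {
		if value, ok := podLabels[key]; ok {
			out = append(out, Pair{Key: key, Value: value})
		}
	}
	return out
}

// matchesAll reports whether labels satisfy every requirement (logical AND).
func matchesAll(reqs []Pair, labels map[string]string) bool {
	for _, r := range reqs {
		if got, ok := labels[r.Key]; !ok || got != r.Value {
			return false
		}
	}
	return true
}
```

The sketch computes the extra requirements at scheduling time instead of writing them into the pod, because kube-apiserver cannot see the scheduler configuration and therefore cannot do this merge for default constraints.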
@@ -1043,13 +1047,15 @@ revisions of the same workload in scheduler plugin. It's decided not to
 support because of the following reason: scheduler needs to ensure universal
 and scheduler plugin shouldn't have special treatment for any labels/fields.

-### remove MatchLabelKeys implementation from the scheduler plugin
-Remove this implementation related to `MatchLabelKeys` from the scheduler plugin
-and only kube-apiserver handles `MatchLabelKeys` and updates `LabelSelector`.
-
-However, this idea is rejected because:
-- This approach prevents the achievement of the Cluster-level default constraints by `matchLabelKeys: ["pod-template-hash"]`.([#129198](https://github.com/kubernetes/kubernetes/issues/129198)) because kube-apiserver can't be aware of the kube-scheduler configuration.
-- The current implementation of the scheduler plugin is simple, and the risk of increased maintenance overhead is low.
+### implement MatchLabelKeys in only either the scheduler plugin or kube-apiserver
+If the implementation for handling `MatchLabelKeys` exists only in the scheduler plugin, merging the key-value labels
+corresponding to `MatchLabelKeys` into `LabelSelector` becomes impossible due to its immutability.
+It may confuse users because this behavior is different from PodAffinity's `MatchLabelKeys`.
+On the other hand, if this implementation exists only in kube-apiserver, it prevents the achievement of the Cluster-level
+default constraints by `matchLabelKeys: ["pod-template-hash"]` ([#129198](https://github.com/kubernetes/kubernetes/issues/129198))
+because kube-apiserver can't be aware of the scheduler configuration.
+In addition, each implementation is simple, and the risk of increased maintenance overhead is low.
+As a result, these ideas are rejected, and we have decided to implement `MatchLabelKeys` in both the scheduler plugin and kube-apiserver.

 ## Infrastructure Needed (Optional)
