Commit 69be606

explaining the interactions of topology spread constraints and node affinity/selector (#29632)
* explaining the interactions of topology spread constraints and node affinity/selector
  Signed-off-by: RinkiyaKeDad <[email protected]>
* udpates from code review
  Signed-off-by: RinkiyaKeDad <[email protected]>
* more updated from code reviews
  Signed-off-by: RinkiyaKeDad <[email protected]>
1 parent c2f0ae3 commit 69be606

1 file changed: +17 -13 lines changed

content/en/docs/concepts/workloads/pods/pod-topology-spread-constraints.md

Lines changed: 17 additions & 13 deletions
@@ -230,20 +230,9 @@ If you apply "two-constraints.yaml" to this cluster, you will notice "mypod" sta

 To overcome this situation, you can either increase the `maxSkew` or modify one of the constraints to use `whenUnsatisfiable: ScheduleAnyway`.

-### Conventions
+### Interaction With Node Affinity and Node Selectors

-There are some implicit conventions worth noting here:
-
-- Only the Pods holding the same namespace as the incoming Pod can be matching candidates.
-
-- Nodes without `topologySpreadConstraints[*].topologyKey` present will be bypassed. It implies that:
-
-1. the Pods located on those nodes do not impact `maxSkew` calculation - in the above example, suppose "node1" does not have label "zone", then the 2 Pods will be disregarded, hence the incoming Pod will be scheduled into "zoneA".
-2. the incoming Pod has no chances to be scheduled onto this kind of nodes - in the above example, suppose a "node5" carrying label `{zone-typo: zoneC}` joins the cluster, it will be bypassed due to the absence of label key "zone".
-
-- Be aware of what will happen if the incomingPod's `topologySpreadConstraints[*].labelSelector` doesn't match its own labels. In the above example, if we remove the incoming Pod's labels, it can still be placed onto "zoneB" since the constraints are still satisfied. However, after the placement, the degree of imbalance of the cluster remains unchanged - it's still zoneA having 2 Pods which hold label {foo:bar}, and zoneB having 1 Pod which holds label {foo:bar}. So if this is not what you expect, we recommend the workload's `topologySpreadConstraints[*].labelSelector` to match its own labels.
-
-- If the incoming Pod has `spec.nodeSelector` or `spec.affinity.nodeAffinity` defined, nodes not matching them will be bypassed.
+The scheduler will skip the non-matching nodes from the skew calculations if the incoming Pod has `spec.nodeSelector` or `spec.affinity.nodeAffinity` defined.

 Suppose you have a 5-node cluster ranging from zoneA to zoneC:

@@ -283,6 +272,21 @@ There are some implicit conventions worth noting here:

 {{< codenew file="pods/topology-spread-constraints/one-constraint-with-nodeaffinity.yaml" >}}

+The scheduler doesn't have prior knowledge of all the zones or other topology domains that a cluster has. They are determined from the existing nodes in the cluster. This could lead to a problem in autoscaled clusters, when a node pool (or node group) is scaled to zero nodes and the user is expecting them to scale up, because, in this case, those topology domains won't be considered until there is at least one node in them.
+
+### Other Noticeable Semantics
+
+There are some implicit conventions worth noting here:
+
+- Only the Pods holding the same namespace as the incoming Pod can be matching candidates.
+
+- The scheduler will bypass the nodes without `topologySpreadConstraints[*].topologyKey` present. This implies that:
+
+1. the Pods located on those nodes do not impact `maxSkew` calculation - in the above example, suppose "node1" does not have label "zone", then the 2 Pods will be disregarded, hence the incoming Pod will be scheduled into "zoneA".
+2. the incoming Pod has no chances to be scheduled onto this kind of nodes - in the above example, suppose a "node5" carrying label `{zone-typo: zoneC}` joins the cluster, it will be bypassed due to the absence of label key "zone".
+
+- Be aware of what will happen if the incomingPod's `topologySpreadConstraints[*].labelSelector` doesn't match its own labels. In the above example, if we remove the incoming Pod's labels, it can still be placed onto "zoneB" since the constraints are still satisfied. However, after the placement, the degree of imbalance of the cluster remains unchanged - it's still zoneA having 2 Pods which hold label {foo:bar}, and zoneB having 1 Pod which holds label {foo:bar}. So if this is not what you expect, we recommend the workload's `topologySpreadConstraints[*].labelSelector` to match its own labels.
+
 ### Cluster-level default constraints

 It is possible to set default topology spread constraints for a cluster. Default
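
The skipping behaviour described in this change can be illustrated with a small model. The following Python sketch is a simplification for reasoning about the text, not the scheduler's actual code: the function names `domain_counts` and `feasible_domains`, the `node_filter` callback (standing in for `spec.nodeSelector` or `spec.affinity.nodeAffinity`), and the Pod counts per node are all assumptions made for this illustration. It treats skew as the number of matching Pods in a domain after placement minus the minimum count across the domains that are still considered.

```python
def domain_counts(nodes, pods_per_node, topology_key, node_filter=None):
    """Count matching Pods per topology domain over the nodes that are considered.

    nodes:          {node_name: {label_key: label_value}}
    pods_per_node:  {node_name: number of Pods matching the labelSelector}
    node_filter:    hypothetical stand-in for nodeSelector / nodeAffinity;
                    a callable taking the node's labels and returning a bool.
    """
    counts = {}
    for name, labels in nodes.items():
        if topology_key not in labels:
            continue  # node lacks the topology key, so it is bypassed
        if node_filter is not None and not node_filter(labels):
            continue  # node does not match the Pod's selector/affinity
        domain = labels[topology_key]
        counts[domain] = counts.get(domain, 0) + pods_per_node.get(name, 0)
    return counts


def feasible_domains(counts, max_skew, self_match=True):
    """Return the domains where placing the incoming Pod keeps skew <= max_skew.

    self_match is False when the incoming Pod's own labels do not match
    its labelSelector, in which case placing it does not change the counts.
    """
    if not counts:
        return []
    global_min = min(counts.values())
    added = 1 if self_match else 0
    return sorted(d for d, c in counts.items() if c + added - global_min <= max_skew)


# Assumed 5-node layout from zoneA to zoneC; the Pod counts are made up.
nodes = {
    "node1": {"zone": "zoneA"},
    "node2": {"zone": "zoneA"},
    "node3": {"zone": "zoneB"},
    "node4": {"zone": "zoneB"},
    "node5": {"zone": "zoneC"},
}
pods_per_node = {"node1": 1, "node2": 2, "node3": 1, "node4": 1}

# No selector or affinity: zoneC (0 Pods) sets the global minimum.
all_zones = domain_counts(nodes, pods_per_node, "zone")

# zoneC excluded, as a NotIn node affinity on the "zone" label would do.
without_zone_c = domain_counts(
    nodes, pods_per_node, "zone",
    node_filter=lambda labels: labels["zone"] != "zoneC",
)

print(feasible_domains(all_zones, max_skew=1))
print(feasible_domains(without_zone_c, max_skew=1))
```

Under these assumed counts, the first call leaves only zoneC as a candidate, while excluding zoneC raises the minimum and makes zoneB the only candidate. Deleting `node5` from `nodes` gives the same counts as the filtered case, which mirrors the autoscaling caveat added in this commit: a domain with no nodes is not part of the calculation at all.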
