Commit c872c41

Merge pull request kubernetes#1993 from alculquicondor/default-spread
Implementation details for Default Pod Topology Spread
2 parents 37b65e1 + 2e67bb8 commit c872c41

2 files changed

+40
-43
lines changed

keps/sig-scheduling/1258-default-even-pod-spreading/README.md renamed to keps/sig-scheduling/1258-default-pod-topology-spread/README.md

Lines changed: 37 additions & 40 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
# Default Even Pod Spreading
1+
# Default Pod Topology Spread
22

33
## Table of Contents
44

@@ -14,15 +14,13 @@
1414
- [Story 2](#story-2)
1515
- [Implementation Details/Notes/Constraints](#implementation-detailsnotesconstraints)
1616
- [Feature gate](#feature-gate)
17-
- [Relationship with "DefaultPodTopologySpread" plugin](#relationship-with-defaultpodtopologyspread-plugin)
17+
- [Relationship with "SelectorSpread" plugin](#relationship-with-selectorspread-plugin)
1818
- [Risks and Mitigations](#risks-and-mitigations)
1919
- [Design Details](#design-details)
2020
- [API](#api)
2121
- [Default constraints](#default-constraints)
2222
- [How user stories are addressed](#how-user-stories-are-addressed)
2323
- [Implementation Details](#implementation-details)
24-
- [In the metadata/predicates/priorities flow](#in-the-metadatapredicatespriorities-flow)
25-
- [In the scheduler framework](#in-the-scheduler-framework)
2624
- [Test Plan](#test-plan)
2725
- [Graduation Criteria](#graduation-criteria)
2826
- [Alpha (v1.19):](#alpha-v119)
@@ -38,12 +36,12 @@
3836
- [ ] Test plan is in place, giving consideration to SIG Architecture and SIG Testing input
3937
- [ ] Graduation criteria is in place
4038
- [ ] "Implementation History" section is up-to-date for milestone
41-
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
42-
- [ ] Supporting documentation e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
39+
- [x] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
40+
- [x] Supporting documentation e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
4341

4442
## Summary
4543

46-
With [Even Pods Spreading](/keps/sig-scheduling/20190221-even-pods-spreading.md),
44+
With [Pod Topology Spread](/keps/sig-scheduling/895-pod-topology-spread),
4745
workload authors can define spreading rules for their loads based on the topology of the clusters.
4846
The spreading rules are defined in the `PodSpec`, thus they are tied to the pod.
4947

@@ -76,7 +74,7 @@ them suitable to provide default spreading constraints for all workloads in thei
7674
### Non-Goals
7775

7876
- Set defaults for specific namespaces or according to other selectors.
79-
- Removal of `DefaultPodTopologySpread` plugin.
77+
- Removal of `SelectorSpread` plugin.
8078

8179
## Proposal
8280

@@ -85,7 +83,7 @@ them suitable to provide default spreading constraints for all workloads in thei
8583
#### Story 1
8684

8785
As a cluster operator, I want to set default spreading constraints for workloads in the cluster.
88-
Currently, `SelectorSpreadPriority` provides a canned priority that spreads across nodes
86+
Currently, `SelectorSpread` plugin provides a canned scoring that spreads across nodes
8987
and zones (`topology.kubernetes.io/zone`). However, the nodes in my cluster have custom topology
9088
keys (for physical host, rack, etc.).
9189

@@ -101,45 +99,45 @@ As a workload author, I want to spread the workload in the cluster, but:
10199
#### Feature gate
102100

103101
Setting a default for `PodTopologySpread` will be guarded with the feature gate
104-
`DefaultEvenPodsSpread`. This feature gate will depend on `EvenPodsSpread` to also be enabled.
102+
`DefaultPodTopologySpread`.
105103

106-
#### Relationship with "DefaultPodTopologySpread" plugin
104+
#### Relationship with "SelectorSpread" plugin
107105

108-
Note that Default `topologySpreadConstraints` has a similar effect to `DefaultPodTopologySpread`
106+
Note that Default `topologySpreadConstraints` has a similar effect to `SelectorSpread`
109107
plugin (`SelectorSpreadingPriority` when using the Policy API).
110108
Given that the latter is not configurable, they could return conflicting priorities, which
111109
may not be the intention of the cluster operator or workload author. On the other hand, a proper
112-
default for `topologySpreadConstraints` could provide the same score as
113-
`DefaultPodTopologySpread`. Thus, there's no need for the features to co-exist.
110+
default for `topologySpreadConstraints` can provide the same score as
111+
`SelectorSpread`. Thus, there's no need for the features to co-exist.
114112

115113
When the feature gate is enabled:
116114

117-
- K8s will set Default `topologySpreadConstraints` and remove `DefaultPodTopologySpread` from the
115+
- K8s will set Default `topologySpreadConstraints` and remove `SelectorSpread` from the
118116
k8s providers (`DefaultProvider` and `ClusterAutoscalerProvider`). The
119-
[Default](#default-constraints) will have a similar effect.
120-
- When a policy is used, `SelectorSpreadingPriority` will map to `PodTopologySpread`.
121-
- When setting plugins in the Component Config API, plugins are added as requested. Since an
122-
operator is manually enabling the plugins, we assume they are aware of their intentions.
117+
[Default constraints](#default-constraints) will produce a similar score.
118+
- When setting plugins in the Component Config API, operators can specify plugins they want to enable.
119+
Since this is a manual operation, if an operator decides to enable both plugins, this is respected.
120+
- [Beta] When using the Policy API, `SelectorSpreadingPriority` will map to `PodTopologySpread`.
123121

124122
### Risks and Mitigations
125123

126124
The `PodTopologySpread` plugin has some overhead compared to other plugins. We currently ensure that
127125
pods that don't use the feature get minimally affected. After Default `topologySpreadConstraints`
128126
is rolled out, all pods will run through the plugin.
129127
We should ensure that the running overhead is not significantly higher than
130-
`DefaultPodTopologySpread` with the k8s Default.
128+
`SelectorSpread` with the k8s Default.
131129

132130
## Design Details
133131

134132
### API
135133

136-
A new structure `Args` is introduced in `pkg/scheduler/framework/plugins/podtopologyspread`.
134+
A new structure `PodTopologySpreadArgs` is introduced in `pkg/scheduler/apis/config/`.
137135
Values are decoded from the `pluginConfig` slice in the kube-scheduler Component Config and used in
138136
`podtopologyspread.New`.
139137

140138
```go
141-
// pkg/scheduler/framework/plugins/podtopologyspread/plugin.go
142-
type Args struct {
139+
// pkg/scheduler/apis/config/types_pluginargs.go
140+
type PodTopologySpreadArgs struct {
143141
// DefaultConstraints defines topology spread constraints to be applied to pods
144142
// that don't define any in `pod.spec.topologySpreadConstraints`. Pod selectors must
145143
// be empty, as they are deduced from the resources that the pod belongs to
@@ -168,6 +166,12 @@ defaultConstraints:
168166
whenUnsatisfiable: ScheduleAnyway
169167
```
170168
169+
An operator can choose to disable the default constraints using:
170+
171+
```yaml
172+
defaultConstraints: []
173+
```
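
As a rough illustration of why `defaultConstraints: []` can act as an off switch, the sketch below assumes the decoder keeps an explicit empty list distinct from an omitted field (a nil slice in Go). The `Constraint` and `Args` types, the `applyDefaults` helper, and the `example.com/rack` key are hypothetical stand-ins for this example, not the scheduler's actual code or built-in values.

```go
package main

import "fmt"

// Constraint is a simplified, hypothetical stand-in for a topology spread constraint.
type Constraint struct {
	MaxSkew           int
	TopologyKey       string
	WhenUnsatisfiable string
}

// Args mirrors only the DefaultConstraints field of PodTopologySpreadArgs.
type Args struct {
	DefaultConstraints []Constraint
}

// builtinDefaults returns placeholder values; the real built-in defaults
// are not reproduced here.
func builtinDefaults() []Constraint {
	return []Constraint{
		{MaxSkew: 1, TopologyKey: "example.com/rack", WhenUnsatisfiable: "ScheduleAnyway"},
	}
}

// applyDefaults fills in the built-in constraints only when the field was
// left unset (nil). An explicit empty list is kept as is, which disables
// default spreading.
func applyDefaults(a *Args) {
	if a.DefaultConstraints == nil {
		a.DefaultConstraints = builtinDefaults()
	}
}

func main() {
	unset := Args{}                                      // field omitted from the config
	disabled := Args{DefaultConstraints: []Constraint{}} // defaultConstraints: []

	applyDefaults(&unset)
	applyDefaults(&disabled)

	fmt.Println("unset:", len(unset.DefaultConstraints), "constraint(s)")
	fmt.Println("disabled:", len(disabled.DefaultConstraints), "constraint(s)")
}
```

Under this assumption, omitting the field and setting it to an empty list lead to different behavior, so operators who want no default spreading would need to write the empty list explicitly.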
174+
171175
### How user stories are addressed
172176
173177
Let's say we have a cluster that has a topology based on physical hosts and racks.
@@ -225,21 +229,15 @@ topologySpreadConstraints:
225229
app: demo
226230
```
227231

228-
Please note that these constraints are honored internally in the scheduler, but they are NOT
232+
Please note that these constraints get applied internally in the scheduler, but they are NOT
229233
persisted in the PodSpec via API Server.
230234

231235
### Implementation Details
232236

233-
#### In the metadata/predicates/priorities flow
234-
235-
1. Calculate the spreading constraints for the pod as part of the metadata calculation.
236-
Use the constraints provided by the pod or calculate the default ones if they don't provide any.
237-
1. When running the predicates or priorities, use the constraints stored in the metadata.
238-
239-
#### In the scheduler framework
240-
241237
1. Calculate spreading constraints for the pod in the `PreFilter` extension point. Store them
242-
in the `PluginContext`.
238+
in the `PluginContext`. The constraints are obtained from `.spec.topologySpreadConstraints`. If
239+
they are not defined, a default is calculated from the plugin's default constraints, using the
240+
selectors of the Services, ReplicaSets, StatefulSets or ReplicationControllers the pod belongs to.
243241
1. In the `Filter` and `Score` extension points, use the stored spreading constraints instead of
244242
the ones defined by the pod.
245243

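
The two steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the plugin's implementation: the `Pod` and `Constraint` types, the `OwnerSelectors` field, and the `mergeSelectors` and `resolveConstraints` helpers are hypothetical, and label selectors are reduced to exact-match label maps.

```go
package main

import "fmt"

// Constraint is a simplified stand-in for a topology spread constraint.
type Constraint struct {
	MaxSkew     int
	TopologyKey string
	Selector    map[string]string // exact-match labels only (simplification)
}

// Pod carries only the fields this sketch needs.
type Pod struct {
	// Constraints stands in for .spec.topologySpreadConstraints.
	Constraints []Constraint
	// OwnerSelectors is hypothetical: the selectors of the Services,
	// ReplicaSets, StatefulSets or ReplicationControllers the pod belongs to.
	OwnerSelectors []map[string]string
}

// mergeSelectors combines all owner selectors into one label map.
// On conflicting keys the later selector wins (a simplification).
func mergeSelectors(sels []map[string]string) map[string]string {
	merged := map[string]string{}
	for _, s := range sels {
		for k, v := range s {
			merged[k] = v
		}
	}
	return merged
}

// resolveConstraints returns the constraints that Filter and Score would use:
// the pod's own constraints if present, otherwise the defaults with a
// selector deduced from the owning resources.
func resolveConstraints(pod Pod, defaults []Constraint) []Constraint {
	if len(pod.Constraints) > 0 {
		return pod.Constraints
	}
	sel := mergeSelectors(pod.OwnerSelectors)
	if len(sel) == 0 {
		return nil // no owning resource, so there is no selector to deduce
	}
	out := make([]Constraint, 0, len(defaults))
	for _, d := range defaults {
		d.Selector = sel // d is a copy; the configured defaults stay untouched
		out = append(out, d)
	}
	return out
}

func main() {
	defaults := []Constraint{{MaxSkew: 1, TopologyKey: "example.com/rack"}}
	pod := Pod{OwnerSelectors: []map[string]string{{"app": "demo"}}}
	for _, c := range resolveConstraints(pod, defaults) {
		fmt.Printf("key=%s maxSkew=%d selector=%v\n", c.TopologyKey, c.MaxSkew, c.Selector)
	}
}
```

Computing the result once and storing it (here, simply returning it) is what lets the later extension points read the same constraints without consulting the pod spec again.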
@@ -257,11 +255,11 @@ To ensure this feature to be rolled out in high quality. Following tests are man
257255
#### Alpha (v1.19):
258256

259257
- [x] Args struct for `podtopologyspread.New`.
260-
- [ ] Defaults and validation.
261-
- [ ] Score extension point implementation. Add support for `maxSkew`.
258+
- [x] Defaults and validation.
259+
- [x] Score extension point implementation. Add support for `maxSkew`.
262260
- [x] Filter extension point implementation.
263-
- [ ] Disabling `DefaultPodTopologySpread` when the feature is enabled.
264-
- [ ] Test cases mentioned in the [Test Plan](#test-plan).
261+
- [x] Disabling `SelectorSpread` when the feature is enabled.
262+
- [x] Unit, Integration and benchmark test cases mentioned in the [Test Plan](#test-plan).
265263

266264
## Implementation History
267265

@@ -271,7 +269,7 @@ To ensure this feature to be rolled out in high quality. Following tests are man
271269

272270
## Alternatives
273271

274-
- Make the topology keys used in `SelectorSpreadingPriority` configurable.
272+
- Make the topology keys used in `SelectorSpread` configurable.
275273

276274
While this moves the scheduler in the right direction, there are two problems:
277275

@@ -282,5 +280,4 @@ To ensure this feature to be rolled out in high quality. Following tests are man
282280

283281
This approach would likely allow us to provide a more flexible interface that
284282
can set defaults for specific namespaces or with other selectors. However, that
285-
wouldn't allow us to replace `SelectorSpreadingPriority` with
286-
`EvenPodsSpreading`.
283+
wouldn't allow us to replace `SelectorSpread` with `PodTopologySpread`.

keps/sig-scheduling/1258-default-even-pod-spreading/kep.yaml renamed to keps/sig-scheduling/1258-default-pod-topology-spread/kep.yaml

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
title: Default Even Pod Spreading
1+
title: Default Pod Topology Spread
22
kep-number: 1258
33
authors:
44
- "@alculquicondor"
@@ -12,15 +12,15 @@ approvers:
1212
- "@ahg-g"
1313
- "@Huang-Wei"
1414
see-also:
15-
- "/keps/sig-scheduling/20190221-even-pods-spreading.md"
15+
- "/keps/sig-scheduling/895-pod-topology-spread"
1616
stage: alpha
1717
latest-milestone: "v1.19"
1818
milestone:
1919
alpha: "v1.19"
2020
beta: "v1.20"
2121
stable: "v1.22"
2222
feature-gates:
23-
- name: DefaultEvenPodsSpread
23+
- name: DefaultPodTopologySpread
2424
components:
2525
- kube-scheduler
2626
disable-supported: true
