@@ -31,17 +31,19 @@ see-also:
31
31
- [ Story 1] ( #story-1 )
32
32
- [ Story 2] ( #story-2 )
33
33
- [ Implementation Details/Notes/Constraints] ( #implementation-detailsnotesconstraints )
34
- - [ Relationship with SelectorSpreadingPriority] ( #relationship-with-selectorspreadingpriority )
34
+ - [ Feature gate] ( #feature-gate )
35
+ - [ Relationship with " ; DefaultPodTopologySpread" ; plugin] ( #relationship-with-defaultpodtopologyspread-plugin )
35
36
- [ Risks and Mitigations] ( #risks-and-mitigations )
36
37
- [ Design Details] ( #design-details )
37
38
- [ API] ( #api )
38
- - [ Default rules ] ( #default-rules )
39
+ - [ Default constraints ] ( #default-constraints )
39
40
- [ How user stories are addressed] ( #how-user-stories-are-addressed )
40
41
- [ Implementation Details] ( #implementation-details )
41
42
- [ In the metadata/predicates/priorities flow] ( #in-the-metadatapredicatespriorities-flow )
42
43
- [ In the scheduler framework] ( #in-the-scheduler-framework )
43
44
- [ Test Plan] ( #test-plan )
44
45
- [ Graduation Criteria] ( #graduation-criteria )
46
+ - [ Alpha (v1.18):] ( #alpha-v118 )
45
47
- [ Implementation History] ( #implementation-history )
46
48
- [ Alternatives] ( #alternatives )
47
49
<!-- /toc -->
@@ -71,7 +73,8 @@ But if they do, they can still set their own spreading constraints if they have
71
73
72
74
## Motivation
73
75
74
- In order for a workload (pod) to use ` EvenPodsSpreadPriority ` :
76
+ In order for a workload (pod) to use ` .spec.topologySpreadConstraints ` (known as` PodTopologySpread `
77
+ plugin or ` EvenPodsSpreadPriority ` in the old Policy API):
75
78
76
79
1 . Authors have to have an idea of the underlying topology.
77
80
1 . PodSpecs become less portable if their spreading constraints are tailored to a specific topology.
@@ -86,12 +89,12 @@ them suitable to provide default spreading constraints for all workloads in thei
86
89
- Workloads are spread with the default constraints if they belong to the same service, replication controller,
87
90
replica set or stateful set, and if they don't define ` pod.spec.topologySpreadConstraints ` .
88
91
- Provide a k8s default for ` topologySpreadConstraints ` that produces a priority equivalent to
89
- ` SelectorSpreadPriority ` , so that this algorithm can be removed from the default algorithms' provider .
92
+ ` DefaultPodTopologySpread ` , so that this plugin can be deprecated in the future .
90
93
91
94
### Non-Goals
92
95
93
96
- Set defaults for specific namespaces or according to other selectors.
94
- - Removal of ` ServiceSpreadingPriority ` or ` ServiceAntiAffinity ` priorities .
97
+ - Removal of ` DefaultPodTopologySpread ` plugin .
95
98
96
99
## Proposal
97
100
@@ -101,8 +104,8 @@ them suitable to provide default spreading constraints for all workloads in thei
101
104
102
105
As a cluster operator, I want to set default spreading constraints for workloads in the cluster.
103
106
Currently, ` SelectorSpreadPriority ` provides a canned priority that spreads across nodes
104
- and zones (` failure-domain.beta. kubernetes.io/zone` ). However, the nodes in my cluster have
105
- custom topology keys (for physical host, rack, etc.).
107
+ and zones (` topology. kubernetes.io/zone` ). However, the nodes in my cluster have custom topology
108
+ keys (for physical host, rack, etc.).
106
109
107
110
#### Story 2
108
111
@@ -113,87 +116,83 @@ As a workload author, I want to spread the workload in the cluster, but:
113
116
### Implementation Details/Notes/Constraints
114
117
115
118
116
- #### Relationship with SelectorSpreadingPriority
119
+ #### Feature gate
117
120
118
- Note that Default ` topologySpreadConstraints ` has a similar effect to ` SelectorSpreadingPriority ` .
119
- Given that the latter is not configurable, they could return conflicting priorities, which
120
- may not be the intention of an operator. On the other hand, a proper Default for ` topologySpreadConstraints `
121
- could provide the same priority as ` SelectorSpreadingPriority ` . Thus, there's no need for the
122
- features to co-exist.
121
+ Setting a default for ` PodTopologySpread ` will be guarded with the feature gate
122
+ ` DefaultEvenPodsSpread ` . This feature gate will depend on ` EvenPodsSpread ` to also be enabled.
123
123
124
- K8s will set Default ` topologySpreadConstraints ` and remove ` SelectorSpreadingPriority `
125
- from the k8s providers (` DefaultProvider ` and ` ClusterAutoscalerProvider ` ). The set
126
- [ default] ( #default-rules ) will have a similar effect.
124
+ #### Relationship with "DefaultPodTopologySpread" plugin
127
125
128
- If an operator sets a Policy, these are the semantics of the presence of ` SelectorSpreadingPriority `
129
- and/or ` EvenPodsSpreadPriority ` :
126
+ Note that Default ` topologySpreadConstraints ` has a similar effect to ` DefaultPodTopologySpread `
127
+ plugin (` SelectorSpreadingPriority ` when using the Policy API).
128
+ Given that the latter is not configurable, they could return conflicting priorities, which
129
+ may not be the intention of the cluster operator or workload author. On the other hand, a proper
130
+ default for ` topologySpreadConstraints ` could provide the same score as
131
+ ` DefaultPodTopologySpread ` . Thus, there's no need for the features to co-exist.
130
132
131
- | SelectorSpreading | EvenPodsSpread | Operator constraints | Valid | Pod spread constraints |
132
- | :---------------: | :------------: | :------------------: | :---: | :---------------------------: |
133
- | N | Y | Not provided | Yes | [ k8s default] ( #default-rules ) |
134
- | N | Y | Provided | Yes | provided by operator |
135
- | Y | Y | Not provided | Yes | [ k8s default] ( #default-rules ) |
136
- | Y | Y | Provided | No | - |
137
- | N | N | - | Yes | None |
138
- | Y | N | - | No | - |
133
+ When the feature gate is enabled:
139
134
140
- Selecting ` SelectorSpreadingPriority ` but not ` EvenPodsSpreadPriority ` in a policy is an invalid
141
- configuration, because the latter is a requirement for the former.
135
+ - K8s will set Default ` topologySpreadConstraints ` and remove ` DefaultPodTopologySpread ` from the
136
+ k8s providers (` DefaultProvider ` and ` ClusterAutoscalerProvider ` ). The
137
+ [ Default] ( #default-constraints ) will have a similar effect.
138
+ - When a policy is used, ` SelectorSpreadingPriority ` will map to ` PodTopologySpread ` .
139
+ - When setting plugins in the Component Config API, plugins are added as requested. Since an
140
+ operator is manually enabling the plugins, we assume they are aware of their intentions.
142
141
143
142
### Risks and Mitigations
144
143
145
- ` EvenPodsSpreadPriority ` has some overhead and we currently ensure that pods that don't use the
146
- feature get minimally affected. After Default ` topologySpreadConstraints ` is rolled out,
147
- all pods will run through the algorithms .
148
- However, we should ensure that the running overhead is not significantly higher than
149
- ` SelectorSpreadingPriority ` with k8s Default ` topologySpreadConstraints ` .
144
+ The ` PodTopologySpread ` plugin has some overhead compared to other plugins. We currently ensure that
145
+ pods that don't use the feature get minimally affected. After Default ` topologySpreadConstraints `
146
+ is rolled out, all pods will run through the plugin .
147
+ We should ensure that the running overhead is not significantly higher than
148
+ ` DefaultPodTopologySpread ` with the k8s Default.
150
149
151
150
## Design Details
152
151
153
152
### API
154
153
155
- A new structure called ` TopologySpreadConstraint ` is introduced to ` KubeSchedulerConfiguration ` .
154
+ A new structure ` Args ` is introduced in ` pkg/scheduler/framework/plugins/podtopologyspread ` .
155
+ Values are decoded from the ` pluginConfig ` slice in the kube-scheduler Component Config and used in
156
+ ` podtopologyspread.New ` .
156
157
157
158
``` go
158
- type KubeSchedulerConfiguration struct {
159
- ....
160
- // TopologySpreadConstraints defines topology spread constraints to be applied to pods
159
+ // pkg/scheduler/framework/plugins/podtopologyspread/plugin.go
160
+ type Args struct {
161
+ // DefaultConstraints defines topology spread constraints to be applied to pods
161
162
// that don't define any in `pod.spec.topologySpreadConstraints`. Pod selectors must
162
163
// be empty, as they are deduced from the resources that the pod belongs to
163
164
// (includes services, replication controllers, replica sets and stateful sets).
164
165
// If not specified, the scheduler applies the following default constraints:
165
166
// <default rules go here. See next section>
166
167
// +optional
167
- TopologySpreadConstraints []corev1.TopologySpreadConstraint
168
- ....
168
+ DefaultConstraints []corev1.TopologySpreadConstraint
169
169
}
170
170
```
171
171
172
- Note the use of ` k8s.io/api/core/v1.TopologySpreadConstraint ` . During validation,
173
- we verify that selectors are not defined.
172
+ Note the use of ` k8s.io/api/core/v1.TopologySpreadConstraint ` . During validation, we verify that
173
+ selectors are not defined.
174
174
175
- ### Default rules
175
+ ### Default constraints
176
176
177
177
These will be the default constraints for the cluster when the operator doesn't provide any:
178
178
179
179
``` yaml
180
- defaultTopologySpreadConstraints :
180
+ defaultConstraints :
181
181
- maxSkew : 1
182
182
topologyKey : " kubernetes.io/hostname"
183
183
whenUnsatisfiable : ScheduleAnyway
184
184
- maxSkew : 1
185
- topologyKey : " failure-domain.beta .kubernetes.io/zone"
185
+ topologyKey : " topology .kubernetes.io/zone"
186
186
whenUnsatisfiable : ScheduleAnyway
187
187
` ` `
188
188
189
189
### How user stories are addressed
190
190
191
191
Let's say we have a cluster that has a topology based on physical hosts and racks.
192
- Then, an operator can set the following scheduler configuration:
192
+ Then, an operator can set the following configuration for the plugin :
193
193
194
194
` ` ` yaml
195
- apiVersion : componentconfig/v1alpha1
196
- defaultTopologySpreadConstraints :
195
+ defaultConstraints :
197
196
- maxSkew : 5
198
197
topologyKey : " example.com/topology/physical_host"
199
198
whenUnsatisfiable : ScheduleAnyway
@@ -231,14 +230,14 @@ and will be applied at runtime:
231
230
` ` ` yaml
232
231
topologySpreadConstraints:
233
232
- maxSkew: 5
234
- TopologyKey : "example.com/topology/physical_host"
235
- WhenUnsatisfiable : ScheduleAnyway
233
+ topologyKey : "example.com/topology/physical_host"
234
+ whenUnsatisfiable : ScheduleAnyway
236
235
selector:
237
236
matchLabels:
238
237
app: demo
239
238
- maxSkew: 15
240
- TopologyKey : "example.com/topology/rack"
241
- WhenUnsatisfiable : DoNotSchedule
239
+ topologyKey : "example.com/topology/rack"
240
+ whenUnsatisfiable : DoNotSchedule
242
241
selector:
243
242
matchLabels:
244
243
app: demo
@@ -273,21 +272,19 @@ To ensure this feature to be rolled out in high quality. Following tests are man
273
272
274
273
# ## Graduation Criteria
275
274
276
- Alpha (v1.17 ) :
275
+ # ### Alpha (v1.18 ):
277
276
278
- - [ ] Scheduler Component Config API changes.
279
- - [ ] Default, validation and generated code.
280
- - [ ] Priority Implementation.
281
- - [ ] Predicate implementation.
277
+ - [ ] Args struct for `podtopologyspread.New`.
278
+ - [ ] Defaults and validation.
279
+ - [ ] Score extension point implementation.
280
+ - [ ] Filter extension point implementation.
281
+ - [ ] Disabling `DefaultPodTopologySpread` when the feature is enabled.
282
282
- [ ] Test cases mentioned in the [Test Plan](#test-plan).
283
283
284
- Default `topologySpreadConstaints` will graduate from Alpha along with
285
- [Even Pods Spreading](/keps/sig-scheduling/20190221-even-pods-spreading.md).
286
- They will share the same feature gate.
287
-
288
284
# # Implementation History
289
285
290
286
- 2019-09-26 : Initial KEP sent out for review.
287
+ - 2020-01-20 : KEP updated to make use of framework's PluginConfig.
291
288
292
289
# # Alternatives
293
290
0 commit comments