- Implementation history

## Summary

This proposal introduces a plugin that allows users to specify the priority of different resources and the maximum resource consumption for workloads on different resources.

## Motivation

The machines in a Kubernetes cluster are typically heterogeneous, with varying CPU, memory, GPU, and pricing. To efficiently utilize the different resources available in the cluster, users can set priorities for machines of different types and configure resource allocations for different workloads. Additionally, they may choose to delete pods running on low-priority nodes instead of high-priority ones.

### Use Cases

1. As a user of cloud services, there are some stable but expensive ECS instances and some unstable but cheaper Spot instances in my cluster. I hope that my workload is deployed first on the stable ECS instances and that, during business peak periods, the Pods that are scaled out are placed on Spot instances. At the end of the business peak, the Pods on Spot instances are prioritized to be scaled in.

### Goals

### Non-Goals

1. Modify the workload controller to support deletion costs. If the workload doesn't support deletion costs, the scale-in sequence will be random.
2. When creating a ResourcePolicy, if the number of Pods has already violated the quantity constraint of the ResourcePolicy, we will not attempt to delete the excess Pods.

## Proposal

```yaml
metadata:
  name: xxx
  namespace: xxx
spec:
  matchLabelKeys:
  - pod-template-hash
  matchPolicy:
    ignoreTerminatingPod: true
    ignorePreviousPod: false
  forceMaxNum: false
  podSelector:
    matchExpressions:
    - key: key1
```

If the strategy is `prefer`, the pod can be scheduled on all nodes; nodes that do not match any unit are considered only after the nodes that match the units. If the strategy is `required`, we will return `unschedulable` for the nodes that do not match any unit.
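
For illustration, a sketch of the `strategy` and `units` part of the spec is shown below. Only `strategy`, `priority`, and `max` are named in this KEP's text; the per-unit schema used here (a `nodeSelector` per unit, and the example label values) is an assumption for the sketch, not the definitive API:

```yaml
strategy: prefer          # with `required`, non-matching nodes are unschedulable
units:
- priority: 2             # higher-priority unit is tried first (assumed layout)
  max: 10                 # the unit is "full" once 10 matched pods are placed here
  nodeSelector:
    node.kubernetes.io/instance-type: ecs    # illustrative label value
- priority: 1             # overflow unit: cheaper Spot nodes
  max: 20
  nodeSelector:
    node.kubernetes.io/instance-type: spot   # illustrative label value
```
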
`matchLabelKeys` indicates how we group the pods matched by `podSelector` and `matchPolicy`; its behavior is like `matchLabelKeys` in `PodTopologySpread`.

`matchPolicy` indicates whether certain kinds of pods, such as terminating pods, should be ignored when counting the pods in a unit.

If `forceMaxNum` is set to `true`, we will not try the next unit while the current one is not full; this property has no effect when `max` is not set on the units.
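
For example (an illustrative scenario, not from the KEP text): with two units whose `max` is 3 and `forceMaxNum: true`, a new pod must land in the first unit until that unit holds 3 pods; only then does scheduling move on to the second unit.
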
### Implementation Details

#### Scheduler Plugins

Besides, Filter will check whether the pods that were scheduled on the unit have already reached the unit's limit. If the number of pods has reached the `maxCount`, all the nodes in the unit will be marked unschedulable.
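
For instance, with a unit whose `maxCount` is 5 (an illustrative value), once five matching pods are bound to the unit's nodes, every node in that unit is rejected for the incoming pod.
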
##### Score

If `priority` is set in the resource policy, we will schedule pods based on `priority`. The default priority is 1, and the minimum priority is 1.

Score calculation details:

1. Calculate the priority score, `scorePriority = (priority - 1) * 20`, to make sure we give nodes without a priority a minimum score.
2. Normalize the score.
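
As a worked example, a unit with `priority: 3` scores `(3 - 1) * 20 = 40` before normalization, `priority: 2` scores 20, and a node whose unit sets no priority (default 1) scores 0.
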
#### Resource Policy Controller

The resource policy controller sets a deletion cost on pods when the related resource policies are updated or added.
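
The mechanism is not spelled out in this excerpt, but deletion costs on pods are conventionally expressed through the standard `controller.kubernetes.io/pod-deletion-cost` annotation, which ReplicaSet-based workloads consult when scaling in. A sketch of a pod annotated this way (the pod name, image, and cost value are hypothetical, and whether this controller uses exactly this annotation is an assumption):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-6d4b9bf5c-xk2lp   # hypothetical pod managed by a ReplicaSet
  annotations:
    # Lower cost = deleted first. A controller could give pods on
    # low-priority (e.g. Spot) units a lower cost so they scale in first.
    controller.kubernetes.io/pod-deletion-cost: "-100"
spec:
  containers:
  - name: web
    image: nginx   # placeholder image
```
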
## Graduation criteria

This plugin will be enabled only when users enable it in the scheduler framework and create a ResourcePolicy for their pods, so it is safe to graduate to beta.

* Beta
  - [ ] Add node E2E tests.
  - [ ] Provide beta-level documentation.

## Feature enablement and rollback

Enable `resourcepolicy` in the `multiPoint` plugins of the scheduler configuration to enable this plugin, like this:
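
The configuration below is a sketch of such a KubeSchedulerConfiguration, using the upstream `multiPoint` plugin expansion; the plugin name `resourcepolicy` is taken from the sentence above and may differ in the actual implementation:

```yaml
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
- schedulerName: default-scheduler
  plugins:
    multiPoint:
      enabled:
      - name: resourcepolicy   # plugin name assumed from the KEP text
```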