|
19 | 19 | - [Risks and Mitigations](#risks-and-mitigations)
|
20 | 20 | - [Design Details](#design-details)
|
21 | 21 | - [Resource States](#resource-states)
|
| 22 | + - [Priority of Resize Requests](#priority-of-resize-requests) |
22 | 23 | - [Kubelet and API Server Interaction](#kubelet-and-api-server-interaction)
|
23 | 24 | - [Kubelet Restart Tolerance](#kubelet-restart-tolerance)
|
24 | 25 | - [Scheduler and API Server Interaction](#scheduler-and-api-server-interaction)
|
@@ -432,6 +433,36 @@ Changes are always propogated through these 4 resource states in order:
|
432 | 433 | Desired --> Allocated --> Actuated --> Actual
|
433 | 434 | ```
|
434 | 435 |
|
| 436 | +### Priority of Resize Requests |
| 437 | + |
| 438 | +Resize requests detected by the kubelet (in `HandlePodUpdates` and `HandlePodAdditions`) |
| 439 | +will be added to a queue of pending resizes. Resize requests will be attempted according to |
| 440 | +the following priority: |
| 441 | + |
| 442 | +1. *Resource requests are not increasing*: Resizes that don't increase requests will be |
| 443 | +prioritized first. These resizes are expected to always succeed and will not be marked as |
| 444 | +pending. |
| 445 | +2. *PriorityClass*: Pods with a higher PriorityClass are attempted before pods with a lower one. |
| 446 | +3. *QoS Class*: Pods with a higher QoS class, where Guaranteed > Burstable. BestEffort pods |
| 447 | +do not have CPU or memory resources, so they are excluded from the discussion here. |
| 448 | +4. *Time since resize request*: If all else is the same, resizes that have been pending |
| 449 | +longer will be retried first (leveraging LastTransitionTime on the PodResizePending condition). |
| 450 | + |
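The ordering above can be read as a four-level comparator. The following is a minimal sketch in Go, not the kubelet's implementation: the `pendingResize` struct, its fields, the `QoSClass` constants, and both function names are hypothetical and exist only for illustration. It assumes the PriorityClass has already been resolved to an integer value and that `PendingSince` mirrors `LastTransitionTime` on the `PodResizePending` condition.

```go
package main

import (
	"sort"
	"time"
)

// QoSClass is a hypothetical stand-in for the pod QoS class.
// BestEffort is omitted because such pods have no CPU or memory resources.
type QoSClass int

const (
	Burstable QoSClass = iota
	Guaranteed
)

// pendingResize is a hypothetical record for one queued resize request.
type pendingResize struct {
	PodName           string
	IncreasesRequests bool      // true if any resource request grows
	Priority          int32     // value resolved from the pod's PriorityClass
	QoS               QoSClass  // Guaranteed > Burstable
	PendingSince      time.Time // assumed to mirror LastTransitionTime
}

// higherPriority reports whether a should be attempted before b.
func higherPriority(a, b pendingResize) bool {
	// 1. Resizes that do not increase requests go first.
	if a.IncreasesRequests != b.IncreasesRequests {
		return !a.IncreasesRequests
	}
	// 2. Higher PriorityClass value first.
	if a.Priority != b.Priority {
		return a.Priority > b.Priority
	}
	// 3. Higher QoS class first.
	if a.QoS != b.QoS {
		return a.QoS > b.QoS
	}
	// 4. The resize that has been pending longer goes first.
	return a.PendingSince.Before(b.PendingSince)
}

// sortPendingResizes orders the queue in place, highest priority first.
func sortPendingResizes(queue []pendingResize) {
	sort.SliceStable(queue, func(i, j int) bool {
		return higherPriority(queue[i], queue[j])
	})
}
```

A stable sort is used here so that requests that compare equal on all four criteria keep their insertion order, which keeps the ordering deterministic.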
| 451 | +These priorities are *only* used to indicate which resize requests will be attempted first. |
| 452 | +Scheduler preemption/eviction to make room for pending resizes is not in scope. |
| 453 | + |
| 454 | +A higher priority resize being marked as pending should not block the remaining pending resizes |
| 455 | +from being attempted, i.e. we will try all remaining resizes in the queue even if one is unsuccessful. |
| 456 | +Resizes that are deferred will be added back to the queue to be re-attempted later. Resizes that are |
| 457 | +infeasible may never be retried. |
| 458 | + |
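The non-blocking retry behaviour described above could look roughly like the sketch below. It is illustrative only: `resizeOutcome`, its constants, and `retryPending` are hypothetical names, and pods are represented by plain strings. Dropping infeasible resizes from the queue is an assumption made for this sketch; the text only says they may never be retried.

```go
package main

// resizeOutcome is a hypothetical result of one allocation attempt.
type resizeOutcome int

const (
	allocated  resizeOutcome = iota // resize fit and was allocated
	deferred                        // does not fit now, may fit later
	infeasible                      // can never fit on this node
)

// retryPending attempts every queued resize in order. An unsuccessful
// attempt never stops the loop, so lower-priority resizes still get a turn.
// It returns the pods whose resize was allocated and the pods to re-queue.
func retryPending(queue []string, try func(pod string) resizeOutcome) (done, remaining []string) {
	for _, pod := range queue {
		switch try(pod) {
		case allocated:
			// A successful allocation would trigger a pod sync.
			done = append(done, pod)
		case deferred:
			// Deferred resizes go back on the queue for a later attempt.
			remaining = append(remaining, pod)
		case infeasible:
			// Assumption: infeasible resizes are dropped from the queue.
		}
	}
	return done, remaining
}
```

In this sketch the caller would pass the queue already sorted by priority and invoke `retryPending` from each of the trigger points listed below.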
| 459 | +Allocation will be attempted on the pods in the queue: |
| 460 | +- At the end of `HandlePodUpdates`, `HandlePodRemoves`, and `HandlePodCleanups` when a change to the queue is detected. |
| 461 | +- Upon completion of another resize request. |
| 462 | +- Periodically, to catch any cases that we may have missed. |
| 463 | + |
| 464 | +A successful allocation will trigger a pod sync, which will actuate the allocated resize and update the |
| 465 | +pod status accordingly. |
435 | 466 |
|
436 | 467 | ### Kubelet and API Server Interaction
|
437 | 468 |
|
@@ -907,7 +938,6 @@ This will be reconsidered post-beta as a future enhancement.
|
907 | 938 | 1. Explore periodic resyncing of resources. That is, periodically issue resize requests to the
|
908 | 939 | runtime even if the allocated resources haven't changed.
|
909 | 940 | 1. Allow resizing containers with swap allocated.
|
910 | | -1. Prioritize resizes when resources are freed, or at least make ordering deterministic. |
911 | 941 |
|
912 | 942 | #### Mutable QOS Class "Shape"
|
913 | 943 |
|
|