Commit 4e1ca58

KEP-1287: AllocatedResources and scheduler changes
1 parent be05551 commit 4e1ca58

keps/sig-node/1287-in-place-update-pod-resources/README.md

Lines changed: 37 additions & 16 deletions
@@ -216,6 +216,8 @@ PodStatus is extended to show the resources applied to the Pod and its Container
 * Pod.Status.ContainerStatuses[i].Resources (new field, type
 v1.ResourceRequirements) shows the **actual** resources held by the Pod and
 its Containers for running containers, and the allocated resources for non-running containers.
+* Pod.Status.AllocatedResources (new field) reports the aggregate pod-level allocated resources,
+computed from the container-level allocated resources.
 * Pod.Status.Resize (new field, type map[string]string) explains what is
 happening for a given resource on a given container.

@@ -232,17 +234,38 @@ Additionally, a new `Pod.Spec.Containers[i].ResizePolicy[]` field (type

 When the Kubelet admits a pod initially or admits a resize, all resource requirements from the spec
 are cached and checkpointed locally. When a container is (re)started, these are the requests and
-limits used.
+limits used. The allocated resources are only reported in the API at the pod-level, through the
+`Pod.Status.AllocatedResources` field.
+
+```
+type PodStatus struct {
+    // ...
+
+    // AllocatedResources is the pod-level allocated resources. Only allocated requests are included.
+    // +optional
+    AllocatedResources *PodAllocatedResources `json:"allocatedResources,omitempty"`
+}
+
+// PodAllocatedResources is used for reporting pod-level allocated resources.
+type PodAllocatedResources struct {
+    // Requests is the pod-level allocated resource requests, either directly
+    // from the pod-level resource requirements if specified, or computed from
+    // the total container allocated requests.
+    // +optional
+    Requests v1.ResourceList
+}
+
+```

 The alpha implementation of In-Place Pod Vertical Scaling included `AllocatedResources` in the
 container status, but only included requests. This field will remain in alpha, guarded by the
 separate `InPlacePodVerticalScalingAllocatedStatus` feature gate, and is a candidate for future
 removal. With the allocated status feature gate enabled, Kubelet will continue to populate the field
 with the allocated requests from the checkpoint.

-The scheduler uses `max(spec...resources, status...resources)` for fit decisions, but since the
-actual resources are only relevant and reported for running containers, the Kubelet sets
-`status...resources` equal to the allocated resources for non-running containers.
+The scheduler uses `max(spec...resources, status.allocatedResources, status...resources)` for fit
+decisions, but since the actual resources are only relevant and reported for running containers, the
+Kubelet sets `status...resources` equal to the allocated resources for non-running containers.

 See [`Alternatives: Allocated Resources`](#allocated-resources-1) for alternative APIs considered.

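The `Requests` comment in the hunk above describes two sources for the pod-level value: the pod-level resource requirements if specified, otherwise the total of the container allocated requests. A minimal Go sketch of that rule follows. It is an illustration only and not the Kubelet's code: `ResourceList` here is a hypothetical stand-in for `v1.ResourceList` that uses plain `int64` quantities, `podAllocatedRequests` is an invented helper name, and the sketch assumes a plain sum over regular containers (init containers and pod overhead are ignored).

```go
package main

import "fmt"

// ResourceList is a simplified, hypothetical stand-in for v1.ResourceList:
// resource name mapped to a quantity in base units (millicores, bytes).
type ResourceList map[string]int64

// podAllocatedRequests returns a copy of the pod-level requests when they are
// specified; otherwise it sums the allocated requests of all containers.
// Assumption: a plain sum, ignoring init containers and pod overhead.
func podAllocatedRequests(podLevel ResourceList, containers []ResourceList) ResourceList {
	out := ResourceList{}
	if len(podLevel) > 0 {
		for name, qty := range podLevel {
			out[name] = qty
		}
		return out
	}
	for _, c := range containers {
		for name, qty := range c {
			out[name] += qty
		}
	}
	return out
}

func main() {
	containers := []ResourceList{
		{"cpu": 500, "memory": 256 << 20},
		{"cpu": 250, "memory": 128 << 20},
	}
	// No pod-level requests are set, so the container requests are summed.
	fmt.Println(podAllocatedRequests(nil, containers))
	// Pod-level requests take precedence when present.
	fmt.Println(podAllocatedRequests(ResourceList{"cpu": 1000}, containers))
}
```

Returning a copy rather than the caller's map keeps the reported status independent of later changes to the input.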
@@ -464,15 +487,12 @@ added. This ensures that resizes don't affect previously admitted existing Pods.

 Scheduler continues to use Pod's Spec.Containers[i].Resources.Requests for
 scheduling new Pods, and continues to watch Pod updates, and updates its cache.
-To compute the Node resources allocated to Pods, it must consider pending
-resizes, as described by Status.Resize.
-
-For containers which have `Status.Resize = "Infeasible"`, it can
-simply use `Status.ContainerStatus[i].Resources`.

-For containers which have a pending resize, it must be pessimistic and use
-the larger of the Pod's `Spec...Resources.Requests` and
-`Status...Resources.Requests` values.
+To compute the Node resources allocated to Pods, pending resizes must be factored in.
+The scheduler will use the maximum of:
+1. Desired resources, computed from container requests in the pod spec, unless the resize is marked as `Infeasible`
+1. Actual resources, computed from the `.status.containerStatuses[i].resources.requests`
+1. Allocated resources, reported in `.status.allocatedResources.requests`

 ### Flow Control

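The three-way maximum introduced in the hunk above can be sketched in Go as follows. This is a hedged illustration, not the scheduler's implementation: `ResourceList` is a hypothetical `int64`-based stand-in for `v1.ResourceList`, and `maxResourceList` and `effectiveRequests` are invented helper names. The sketch assumes that the three inputs have already been aggregated to pod level and that the `Infeasible` state is passed in as a boolean.

```go
package main

import "fmt"

// ResourceList is a simplified, hypothetical stand-in for v1.ResourceList.
type ResourceList map[string]int64

// maxResourceList returns, for each resource name, the largest quantity found
// in any of the given lists.
func maxResourceList(lists ...ResourceList) ResourceList {
	out := ResourceList{}
	for _, l := range lists {
		for name, qty := range l {
			if cur, ok := out[name]; !ok || qty > cur {
				out[name] = qty
			}
		}
	}
	return out
}

// effectiveRequests takes the maximum of desired, actual, and allocated
// requests. Desired requests are left out when the resize is Infeasible.
func effectiveRequests(desired, actual, allocated ResourceList, infeasible bool) ResourceList {
	if infeasible {
		return maxResourceList(actual, allocated)
	}
	return maxResourceList(desired, actual, allocated)
}

func main() {
	desired := ResourceList{"cpu": 1000}  // from the pod spec (pending scale-up)
	actual := ResourceList{"cpu": 500}    // from container statuses
	allocated := ResourceList{"cpu": 500} // from status.allocatedResources

	fmt.Println(effectiveRequests(desired, actual, allocated, false)["cpu"])
	fmt.Println(effectiveRequests(desired, actual, allocated, true)["cpu"])
}
```

With a pending scale-up the larger desired value is used, while an `Infeasible` resize falls back to the larger of the actual and allocated values.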
@@ -772,11 +792,12 @@ resize](#design-sketch-workload-resource-resize).
 ### Resource Quota

 With InPlacePodVerticalScaling enabled, resource quota needs to consider pending resizes. Similarly
-to how this is handled by scheduling, resource quota will use the maximum of
-`.spec.container[*].resources.requests` and `.status.containerStatuses[*].resources` to
-determine the effective request values.
+to how this is handled by scheduling, resource quota will use the maximum of:
+1. Desired resources, computed from container requests in the pod spec, unless the resize is marked as `Infeasible`
+1. Actual resources, computed from the `.status.containerStatuses[i].resources.requests`
+1. Allocated resources, reported in `.status.allocatedResources.requests`

-To properly handle scale-down, this means that the resource quota controller now needs to evaluate
+To properly handle scale-down, resource quota controller now needs to evaluate
 pod updates where `.status...resources` changed.

 ### Affected Components
