Commit 2f5c139

KEP-1287: Don't allow in-place memory limit decreases
1 parent 37ac1ab commit 2f5c139

1 file changed, +10 -16 lines changed
  • keps/sig-node/1287-in-place-update-pod-resources


keps/sig-node/1287-in-place-update-pod-resources/README.md

Lines changed: 10 additions & 16 deletions
@@ -773,22 +773,15 @@ pod status resources, but not otherwise acted upon.
 
 ### Memory Limit Decreases
 
-Setting the memory limit below current memory usage can cause problems. With cgroups v1 the change
-will simply be rejected by the kernel, whereas with cgroups v2 it will trigger an oom-kill.
-
-To avoid this situation, when downsizing container memory limits the Kubelet will first check the
-usage via the CRI `ContainerStats` (or maybe `ListContainerStats`) call. This check will be
-performed both at resource allocation time, and right before actuating the resize. If the
-allocation-time check fails, the resize will be deferred. If the actuation-time check fails, the
-resize will be skipped until the next pod sync, an event will report the error, and the resize
-status will be set to `Error`. Even with these protections, there is still the possibility of a
-time-of-check-time-of-use race, so the possibility of oom-kill will be documented, and caution
-recommended.
-
-If a memory limit decrease fails at actuation time, other resources and containers will continue to
-be resized, but the pod-level memory limit will not be decreased until all container limits have
-been successfully adjusted. For guaranteed pods, in the case the limit decrease fails, the memory
-request will be set to the original limit in the pod status.
+Setting the memory limit below current memory usage can cause problems. If the kernel cannot reclaim
+sufficient memory, the outcome depends on the cgroups version. With cgroups v1 the change will
+simply be rejected by the kernel, whereas with cgroups v2 it will trigger an oom-kill.
+
+In the initial beta release of in-place resize, we will **disallow** `PreferNoRestart` memory limit
+decreases, enforced through API validation. The intent is for this restriction to be relaxed in the
+future, but the design of how limit decreases will be approached is still undecided.
+
+Memory limit decreases with `RestartRequired` are still allowed.
 
 ### Sidecars
 
@@ -857,6 +850,7 @@ This will be reconsidered post-beta as a future enhancement.
 
 ### Future Enhancements
 
+1. Allow memory limits to be decreased, and handle the case where limits are set below usage.
 1. Kubelet (or Scheduler) evicts lower priority Pods from Node to make room for
 resize. Pre-emption by Kubelet may be simpler and offer lower latencies.
 1. Allow ResizePolicy to be set on Pod level, acting as default if (some of)
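
The validation rule added in the "Memory Limit Decreases" section could look roughly like the Go sketch below. This is an illustration only, not the Kubernetes implementation: the `Container` type, the `ValidateMemoryLimitResize` function, and the error variable are hypothetical, memory limits are modeled as plain byte counts, and the policy names are taken verbatim from the KEP text. The sketch also assumes that the resize policy itself does not change during the update and only compares two explicitly set limits.

```go
package main

import (
	"errors"
	"fmt"
)

// ResizeRestartPolicy uses the policy names from the KEP text.
// The type and constants are hypothetical stand-ins, not the real API types.
type ResizeRestartPolicy string

const (
	PreferNoRestart ResizeRestartPolicy = "PreferNoRestart"
	RestartRequired ResizeRestartPolicy = "RestartRequired"
)

// Container is a simplified, hypothetical view of a container spec.
type Container struct {
	Name             string
	MemoryLimitBytes int64 // 0 means no memory limit is set
	MemoryPolicy     ResizeRestartPolicy
}

// ErrMemoryLimitDecrease is a hypothetical sentinel error for rejected resizes.
var ErrMemoryLimitDecrease = errors.New("in-place memory limit decrease is not allowed")

// ValidateMemoryLimitResize rejects a memory limit decrease unless the
// container's memory resize policy is RestartRequired. Adding or removing
// a limit is outside the scope of this sketch and is not checked.
func ValidateMemoryLimitResize(oldC, newC Container) error {
	if newC.MemoryPolicy == RestartRequired {
		// Decreases that restart the container are still allowed.
		return nil
	}
	bothSet := oldC.MemoryLimitBytes > 0 && newC.MemoryLimitBytes > 0
	if bothSet && newC.MemoryLimitBytes < oldC.MemoryLimitBytes {
		return fmt.Errorf("container %q: %w (%d -> %d bytes)",
			newC.Name, ErrMemoryLimitDecrease,
			oldC.MemoryLimitBytes, newC.MemoryLimitBytes)
	}
	return nil
}

func main() {
	oldC := Container{Name: "app", MemoryLimitBytes: 512 << 20, MemoryPolicy: PreferNoRestart}

	// Decrease without restart: rejected by the sketch.
	newC := oldC
	newC.MemoryLimitBytes = 256 << 20
	fmt.Println(ValidateMemoryLimitResize(oldC, newC))

	// Same decrease with RestartRequired: accepted.
	newC.MemoryPolicy = RestartRequired
	fmt.Println(ValidateMemoryLimitResize(oldC, newC))
}
```

Rejecting the change at validation time sidesteps the cgroups v1 versus v2 difference entirely, since the kernel is never asked to lower the limit on a running container.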
