Conversation

@iamzili iamzili commented Sep 10, 2025

What type of PR is this?

/kind documentation
/kind feature

What this PR does / why we need it:

Autoscaling Enhancement Proposal (AEP) to add support for setting a custom request-to-limit ratio at the VPA object level.

I'd love to hear your thoughts on this feature.

#8515

@k8s-ci-robot (Contributor):

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added kind/documentation Categorizes issue or PR as related to documentation. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-area needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Sep 10, 2025
@k8s-ci-robot (Contributor):

Hi @iamzili. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@adrianmoisey (Member) left a comment:

Thanks for the AEP!
I've left 2 small comments, but will definitely need to come back for a more thorough review.

* **Factor**: Multiplies the recommended request by a specified value, and the result is set as the new limit.
* Example: if `factor` is set to `2`, the limit will be set to twice the recommended request.
* **Quantity**: Adds a buffer on top of the resource request. This can be expressed either:
* As a **percentage** (`QuantityPercentage`), or
Member:

Would QuantityPercentage not be the same as factor?
I.e., a QuantityPercentage of 10% is the same as a factor of 1.1? (assuming that factor isn't stored as an int)

Contributor Author:

Good point - I was thinking the same: simplify things by dropping QuantityPercentage and renaming QuantityValue to Quantity.

When I first thought about this feature, I thought it might be useful to give users more options to define the ratio's magnitude. But this will just make VPA more complex.

Contributor Author:

I'm also considering changing the way the sub-fields are defined, based on AEP-7862, to make the VPA CRD more consistent. For example:

from:

```yaml
RequestToLimitRatio:
  cpu:
    Type: Factor
    Value: 2
  memory:
    Type: Quantity
    Value: 100Mi
```

to:

```yaml
RequestToLimitRatio:
  cpu:
    factor: 2
  memory:
    quantity: 100Mi
```

By the way, after reviewing AEP-7862, I think consistency is broken there when startupBoost is defined under containerPolicies:

```yaml
startupBoost:
  cpu:
    type: "Quantity"
    quantity: "4"
```

vs

```yaml
startupBoost:
  cpu:
    factor: 1
```

We should define a common specification to be used for both AEPs.
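For illustration, the two proposed ratio types boil down to a simple computation on the recommended request. A minimal Go sketch, with a hypothetical helper and plain int64 quantities (bytes for memory, milli-CPU for cpu) instead of `resource.Quantity`, just to make the semantics concrete:

```go
package main

import "fmt"

// computeLimit is a hypothetical helper illustrating the two proposed
// RequestToLimitRatio types. Quantities are modeled as int64s to keep
// the sketch self-contained; a real implementation would use
// resource.Quantity.
func computeLimit(request int64, factor float64, quantity int64, useFactor bool) int64 {
	if useFactor {
		// Factor: limit = recommended request * factor.
		return int64(float64(request) * factor)
	}
	// Quantity: limit = recommended request + fixed buffer.
	return request + quantity
}

func main() {
	// factor: 2 applied to a 250m CPU request -> 500m limit.
	fmt.Println(computeLimit(250, 2, 0, true)) // 500
	// quantity: 100Mi buffer on top of a 200Mi memory request -> 300Mi limit.
	fmt.Println(computeLimit(200<<20, 0, 100<<20, false))
}
```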

Member:

Yup, I think this change makes sense. And good call-out on the inconsistency in the startup boost feature.

Member:

Wait, hold on, where is the consistency broken? type defaults to Factor.
So the second example you pasted implies the following:

```yaml
startupBoost:
  cpu:
    type: "Factor"
    factor: 1
```

This is consistent with the other methods, unless I'm missing something.

@iamzili (Contributor Author), Sep 15, 2025:

You're right! I missed the defaulting part from AEP-7862.

But the consistency between the AEP I created and AEP-7862 is still broken because:

from startupBoost:

```yaml
Type: "Factor"
Factor: 1
```

from RequestToLimitRatio:

```yaml
Type: "Factor"
Value: 1 # do I need to rename the key to "Factor"?
```

If you ask me, I think the startupBoost spec is more aligned with the VPA object. Since the startupBoost AEP is already merged into the main branch, I assume we need to follow the same approach in the RequestToLimitRatio AEP.

Member:

Correct, there should be consistency between this AEP and the startupBoost AEP

Contributor Author:

Consistency fixed.

Comment on lines 133 to 136
#### When Enabled

* The admission controller will **accept** new VPA objects that include a configured `RequestToLimitRatio`.
* For containers targeted by a VPA object using `RequestToLimitRatio`, the admission controller and/or the updater will enforce the configured ratio.
Member:

What happens in a scenario like this:

  1. A user has a Pod with in-place mode set, where requests are equal to limits (and the QoS class is Guaranteed)
  2. The user modifies the VPA to add a RequestToLimitRatio, making limits double the requests

What should the behaviour be? Should the VPA kill the Pod to allow a new one to be created with a different QoS class?

Contributor Author:

I totally forgot about QoS classes, thanks for bringing this up!

Since the qosClass field is immutable, we cannot rely on in-place updates; the Pod(s) need to be recreated if the QoS class changes.

To clarify, changing the ratio using this new feature will not evict the Pod(s) immediately. It will rely on the Updater's current behavior, such as evicting when the current resources.requests differ significantly from the new recommendation.
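To make the QoS interaction concrete: a Pod is Guaranteed only when requests equal limits, so applying a factor greater than 1 to such a Pod would flip it to Burstable, and since the QoS class is immutable, that cannot happen in place. A minimal Go sketch with a hypothetical helper and int64 quantities:

```go
package main

import "fmt"

// qosWouldChange is a hypothetical helper showing why RequestToLimitRatio
// interacts with QoS classes: a Pod is Guaranteed only when requests equal
// limits, so applying a factor > 1 makes limits diverge from requests and
// flips the Pod to Burstable. Because a Pod's QoS class is immutable, such
// a flip forces a recreate rather than an in-place resize.
func qosWouldChange(request, currentLimit int64, factor float64) bool {
	guaranteedNow := request == currentLimit
	newLimit := int64(float64(request) * factor)
	guaranteedAfter := request == newLimit
	return guaranteedNow != guaranteedAfter
}

func main() {
	// Guaranteed Pod (request == limit) with a new factor of 2: QoS flips.
	fmt.Println(qosWouldChange(500, 500, 2)) // true
	// A Pod that is already Burstable keeps its class.
	fmt.Println(qosWouldChange(500, 1000, 2)) // false
}
```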

Member:

I guess my concern is actually more about how the VPA will handle the RequestToLimitRatio changing, either from the default to a specified ratio, or from one ratio to another.
I think it's worth calling this out in the AEP.

Contributor Author:

Added more examples.

@tspearconquest left a comment:

Thanks for tagging me in Slack to take a look, and for this AEP!
I added a few questions and ideas.

### Test Plan

* Implement comprehensive unit tests to cover all new functionality.
* e2e tests: TODO


Please test extremely high factors like 1 million in addition to more typical factors like 1, 5, 10.

@iamzili (Contributor Author), Sep 12, 2025:

Why do we need to test something like this?


We don't necessarily need to, given the response above that limits aren't used for scheduling, but it might turn up something interesting in how Kubernetes handles it. You never know what bug might lurk with limits that have never been tested before.

Member:

@tspearconquest, what do you want to test here? If you want to test a pod with a factor of 1 million, that is unrelated to the VPA work. In tests we need to make sure the VPA behaves as planned.

@tspearconquest, Sep 12, 2025:

> what do you want to test here?

Integer overflows on the recommender status fields.

> if you wanna test a pod with 1 million factor this is unrelated to the VPA work.

Okay, no problem.

> In tests we need to make sure the VPA behaves as planned.

No argument here. I'm probably overthinking this, as I don't know Go code very well and don't have the level of understanding you all do with regards to how the recommender status fields and the VPA pods would work with a value exceeding the int64 limit of ~8 exbibytes when converted to an integer.
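A unit test along these lines could exercise a simple overflow guard before multiplying a recommendation by an extreme factor. A minimal Go sketch with a hypothetical helper (not the VPA's actual code), assuming quantities are held as int64 as `resource.Quantity` does for most values:

```go
package main

import (
	"fmt"
	"math"
)

// mulOverflows reports whether request * factor would exceed int64,
// the kind of check an extreme-factor unit test could verify.
// Hypothetical helper with an integer factor for simplicity.
func mulOverflows(request int64, factor int64) bool {
	if factor == 0 {
		return false
	}
	return request > math.MaxInt64/factor
}

func main() {
	const gi = int64(1) << 30 // 1Gi in bytes
	fmt.Println(mulOverflows(8*gi, 1))             // false: a typical factor
	fmt.Println(mulOverflows(8*gi, 2_000_000_000)) // true: 8Gi * 2e9 exceeds int64
}
```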

Member:

Yeah, valid points. In the HPA we do the same tests, but as unit tests.
I guess we can add some unit tests around that here as well.

@omerap12 (Member) left a comment:

I suggest we first agree on this AEP and then we can mark this for an api-review.

@iamzili (Contributor Author) commented Sep 12, 2025:

> I suggest we first agree on this AEP and then we can mark this for an api-review.

Is this step for the core maintainers, or can I help with it to move things forward somehow?

@omerap12 (Member):

> I suggest we first agree on this AEP and then we can mark this for an api-review.

Once the maintainers are OK with it, we will move this to the api-machinery folks for review.

### Test Plan

* Implement comprehensive unit tests to cover all new functionality.
* e2e tests: TODO
Member:

Instead of leaving this as a TODO, please add some coverage here. The E2E test should be straightforward - just verify the limit across a few different scenarios. A couple of simple cases would be enough.

Contributor Author:

Done, added some e2e tests.

@adrianmoisey (Member):

/ok-to-test

@k8s-ci-robot added the ok-to-test label and removed the needs-ok-to-test label on Sep 13, 2025.
@iamzili force-pushed the support-custom-request-to-limit branch from fa84adf to 093542b on September 18, 2025.
Comment on lines 193 to 199
```yaml
RequestToLimitRatio:
  cpu:
    type: Factor # this field is optional, if omitted it defaults to "Factor"
    factor: 2
  memory:
    type: Quantity
    quantity: 200Mi
```
Member:

Why do we need to specify the type if we already have it in the value itself? Meaning, why can't we do:

```yaml
RequestToLimitRatio:
  cpu:
    factor: 2
  memory:
    quantity: 200Mi
```

Why is the type field necessary?

@iamzili (Contributor Author), Sep 21, 2025:

Yeah, I was considering dropping the type field entirely, but I followed the spec defined in AEP-7862 to ensure consistency in the VPA CRD. My original spec for RequestToLimitRatio was different.

In AEP-7862, the type defaults to Factor, as in this AEP. That said, I agree with you that it might make sense to drop the type completely in this AEP.

Member:

So in AEP-7862 we can see the following examples:

```yaml
resourcePolicy:
  containerPolicies:
    - containerName: "boosted-container-name"
      mode: "Auto" # Vanilla VPA mode + Startup Boost
      minAllowed:
        cpu: "250m"
        memory: "100Mi"
      maxAllowed:
        cpu: "500m"
        memory: "600Mi"
      # The CPU boosted resources can go beyond maxAllowed.
      startupBoost:
        cpu:
          value: 4
```

or

```yaml
resourcePolicy:
  containerPolicies:
    - containerName: "boosted-container-name"
      mode: "StartupBoostOnly"
      startupBoost:
        cpu:
          factor: 2.0
```

(This was taken from the AEP proposal.)
So the factor/value field should be changed, and the type field can be dropped.

Member:

Where do you see those examples?
Over in https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler/enhancements/7862-cpu-startup-boost#api-changes, I see:

> If not specified, StartupBoost.CPU.Type defaults to Factor

Something I like about having the Type is that we could one day extend it with `MaxOf` (or something similar) to select the max of the value or the calculated factor.

For example, a factor of 1.1 for a small 100MiB pod only gives you an extra 10MiB, which is very small.
But maybe in the future you want to have a factor of 1.1 and a quantity of 100MiB; that way at the low end you get a 100MiB buffer, but at the high end you get 10%.
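The `MaxOf` idea floated above could look something like the following. A minimal Go sketch with a hypothetical helper and int64 quantities, illustrative only: take whichever of (request * factor) or (request + quantity) is larger, so small workloads get an absolute buffer and large workloads get a proportional one.

```go
package main

import "fmt"

// maxOfLimit sketches a hypothetical future "MaxOf" type: the limit is
// the larger of the factor-based and quantity-based results, giving
// small workloads an absolute buffer and large ones a proportional one.
func maxOfLimit(request int64, factor float64, quantity int64) int64 {
	byFactor := int64(float64(request) * factor)
	byQuantity := request + quantity
	if byFactor > byQuantity {
		return byFactor
	}
	return byQuantity
}

func main() {
	const mi = int64(1) << 20
	// Small 100Mi pod: factor 1.1 adds only 10Mi, so the 100Mi buffer wins.
	fmt.Println(maxOfLimit(100*mi, 1.1, 100*mi) / mi) // 200
	// Large 10Gi pod: 10% is 1024Mi, beating the 100Mi buffer.
	fmt.Println(maxOfLimit(10*1024*mi, 1.1, 100*mi) / mi) // 11264
}
```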

Member:

Oh, I was looking at the wrong branch (sorry about that).
Yeah, I agree, having something like MaxOf can be useful indeed - let's keep it that way then :)

@omerap12 (Member):

Overall seems ok to me.
@adrianmoisey, feel free to add the /api-review tag.

@adrianmoisey (Member):

Agreed!
/api-review

@adrianmoisey (Member):

/api review

@adrianmoisey (Member):

/label api-review

@k8s-ci-robot added the api-review label on Sep 21, 2025.

A new `RequestToLimitRatio` field will be added, with the following sub-fields:

* [Optional] `RequestToLimitRatio.CPU.Type` or `RequestToLimitRatio.Memory.Type` (type `string`): Specifies how to apply limits proportionally to the requests. `Type` can have the following values:
Member:

To be in line with the CPU Startup Boost feature, this needs to be a required field. See #8608

(Sorry for changing it out from under you)

Contributor Author:

No problem, updated.

@k8s-ci-robot (Contributor):

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: iamzili
Once this PR has been reviewed and has the lgtm label, please assign gjtempleton for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


If the request-to-limit ratio needs to be updated (for example, because the application's resource usage has changed), users must modify the `resources.requests` or `resources.limits` fields in the workload's API object. Since these fields are immutable, this change results in terminating and recreating the existing Pods.

This proposal introduces a new mechanism that allows VPA users to adjust the request-to-limit ratio directly at the VPA CRD level for an already running workload. This avoids the need to manually update the workload's resource requests and limits, and prevents unnecessary Pod restarts.
Contributor:

I am curious about approaching this proposal from a slightly different perspective:

* Lean into the custom proportional scaling configuration part (evolve the current "... are scaled automatically" functionality that is an outcome of setting ContainerControlledValues to RequestsAndLimits).
  * In other words, this proposal gives users the option of using ContainerControlledValues=RequestsAndLimits in a non-automatic, user-configurable way.
* Don't worry about the "in-place vs recreate pod" challenge.

My thinking is based on the maturation of in-place pod resizing (scheduled for GA in 1.35). Is there a reason why we can't rely upon IPPR for handling the in-place objectives of this use case? We should be able to give folks who want an automatic ContainerControlledValues=RequestsAndLimits experience the option to trigger automatic proportionality while modifying the inputs at the original workload API object layer (e.g., pod requests/limits directly), and offer the option of in-place via configuring the workload to use IPPR.

@adrianmoisey @omerap12 @natasha41575 @tallclair do my thoughts above make sense?


> while modifying the inputs at the original workload API object layer (e.g., pod requests/limits directly), and offer the option of in-place via configuring the workload to use IPPR

Please correct me if I've misunderstood, but is your suggestion to relax the immutability of pod requests/limits at the workload level and support IPPR for workloads? There's an open issue for that support: kubernetes/kubernetes#132436. I'm a little hesitant to promise this though because of the complexity - if VPA or other top-level controllers can handle the use cases in a simpler way, that would be my preference.

Contributor:

ACK, thank you @natasha41575, that is indeed what I'm suggesting (and I hadn't realized that request/limits immutability was out of scope for IPPR thus far - my bad).

Given the current state of things, I would suggest that this AEP mention that there is an open issue to address workload request/limits immutability, and that this proposal may offer a temporary bridge for folks who want to leverage VPA to actually address that use case.

The tl;dr of my perspective is that I think this would be a more durable description of what we are proposing:

> This proposal introduces a new mechanism that allows VPA users to adjust the request-to-limit ratio directly at the VPA CRD level. This mechanism can also be used to update existing workload + VPA configurations, for example to non-disruptively scale workloads beyond what their requests/limits would predict.

(instead of stating "for an already running workload" as a base case)

Does that make sense?
