Commit 093542b

update API for consistency, add e2e tests, other small updates
1 parent c55f522 commit 093542b

File tree

1 file changed

+49
-35
lines changed
  • vertical-pod-autoscaler/enhancements/8515-support-custom-request-to-limit-ratio

vertical-pod-autoscaler/enhancements/support-custom-request-to-limit-ratio/README.md renamed to vertical-pod-autoscaler/enhancements/8515-support-custom-request-to-limit-ratio/README.md

Lines changed: 49 additions & 35 deletions
@@ -1,4 +1,4 @@
1-
# AEP-XXX: Add support for setting a custom request-to-limit ratio at the VPA object level
1+
# AEP-8515: Add support for setting a custom request-to-limit ratio at the VPA object level
22

33
<!-- toc -->
44
- [Summary](#summary)
@@ -19,6 +19,7 @@
1919
- [When Disabled](#when-disabled)
2020
- [Kubernetes Version Compatibility](#kubernetes-version-compatibility)
2121
- [Test Plan](#test-plan)
22+
- [E2E](#e2e)
2223
- [Examples](#examples)
2324
- [Implementation History](#implementation-history)
2425
<!-- /toc -->
@@ -31,22 +32,22 @@ If the request-to-limit ratio needs to be updated (for example, because the appl
3132

3233
This proposal introduces a new mechanism that allows VPA users to adjust the request-to-limit ratio directly at the VPA CRD level for an already running workload. This avoids the need to manually update the workload's resource requests and limits, and prevents unnecessary Pod restarts.
3334

34-
The feature is gated by a new alpha feature flag, `RequestToLimitRatio`, which is disabled by default.
35+
The feature is controlled by a new feature gate, `RequestToLimitRatio`, which is in alpha and disabled by default.
3536

3637
## Goals
3738

3839
* Provide a feature gate to enable or disable the feature (`RequestToLimitRatio`).
3940
* Allow VPA to update the request-to-limit ratio of a Pod's containers during Pod recreation or in-place updates.
4041
* Introduce a new `RequestToLimitRatio` block that enables users to adjust the request-to-limit ratio in the following ways:
41-
* **Factor**: Multiplies the recommended request by a specified value, and the result is set as the new limit.
42-
* Example: if `factor` is set to `2`, the limit will be set to twice the recommended request.
43-
* **Quantity**: Adds a buffer on top of the resource request. This can be expressed either:
44-
* As a **percentage** (`QuantityPercentage`), or
45-
* As an **absolute value with units** (`QuantityValue`).
42+
* **Factor**: Multiplies the recommended request by a specified value, and the result is set as the new limit, for example:
43+
* If the value for `Factor` is set to `2`, the limit will be twice the recommended request.
44+
* If the value for `Factor` is set to `1.1`, the limit will be 10% higher than the recommended request.
45+
* **Quantity**: Adds a buffer on top of the resource request. This can be expressed as an **absolute value with units** (e.g. `100Mi`, `10m`).
4646

4747
## Non-Goals
4848

4949
* This proposal does not change the core VPA algorithm or its decision-making process for when to apply the recommended values or set limits proportionally.
50+
* This proposal does not change the default request-to-limit behavior when the feature gate is enabled. Pods managed by VPA objects that do not use the new `RequestToLimitRatio` field will continue to follow the existing behavior. For details, see the [Behavior](#behavior) section.
5051

5152
## Proposal
5253

@@ -61,20 +62,23 @@ Some examples of the VPA CRD using the new `RequestToLimitRatio` field are provi
6162

6263
A new `RequestToLimitRatio` field will be added, with the following sub-fields:
6364

64-
* `RequestToLimitRatio.CPU.Type` or `RequestToLimitRatio.Memory.Type` (type: `string`, required): Specifies how to apply limits proportionally to the requests. `Type` can have the following values:
65-
* `Factor` (type: `integer`): Interpreted as a multiplier for the recommended request.
65+
* [Optional] `RequestToLimitRatio.CPU.Type` or `RequestToLimitRatio.Memory.Type` (type `string`): Specifies how to apply limits proportionally to the requests. `Type` can have the following values:
66+
* `Factor`: Interpreted as a multiplier for the recommended request.
6667
* Example: a value of `2` will double the limits.
67-
* `QuantityValue` (type: `string`): Adds an absolute value on top of the requests to determine the new limit.
68-
* Example: for memory, a value of `100Mi` means the new limit will be: calculated Memory request + `100Mi`.
69-
* `QuantityPercentage` (type: `integer`): Increases the limit by the specified percentage of the resource request.
70-
* Example: if the request is 1000m CPU and the percentage is 20, the limit will be 1000m + (20% of 1000m) = 1200m.
68+
* `Quantity`: Adds an absolute value on top of the requests to determine the new limit.
69+
* Example: for memory, a value of `100Mi` means the new limit will be: calculated memory request + `100Mi`.
70+
* If `RequestToLimitRatio.CPU.Type` or `RequestToLimitRatio.Memory.Type` is not specified, the default value is `Factor`.
7171

72-
* `RequestToLimitRatio.CPU.Value` (type: `string`, required): Specifies the magnitude of the ratio between request and limit, interpreted according to `RequestToLimitRatio.CPU.Type`:
73-
* If `Type` is `Factor`: a value of `3` will triple the CPU limits.
74-
* If `Type` is `QuantityValue`: if the value is set to 200m, then the CPU limit will be set to the CPU request plus 200 millicores.
75-
* If `Type` is `QuantityPercentage`: a value of `20` increases the CPU limit by 20% of the calculated request.
72+
* [Optional] `RequestToLimitRatio.CPU.Factor` (type `float`): The factor to apply to the CPU request.
73+
* If `Type` is `Factor`, a value of `3` sets the CPU limit to three times the CPU request.
74+
* If `Type` is `Quantity`, this field is not allowed.
7675

77-
* `RequestToLimitRatio.Memory.Value` (type: `string`): Similar to `CPU.Value`, except that for `QuantityValue` the units are memory-based (e.g., `Mi`, `Gi`) rather than CPU millicores (`m`).
76+
* [Optional] `RequestToLimitRatio.CPU.Quantity` (type `Quantity`): The value specified in this field is added to the request to calculate the new limit.
77+
* If `Type` is `Factor`, this field is not allowed.
78+
* If `Type` is `Quantity`, the value is a CPU resource quantity that is added to the request. For example, if the value is `200m`, the CPU limit will be calculated as the CPU request plus 200 millicores.
79+
80+
* [Optional] `RequestToLimitRatio.Memory.Factor` (type `float`): Same as `CPU.Factor`.
81+
* [Optional] `RequestToLimitRatio.Memory.Quantity` (type `Quantity`): Same as `CPU.Quantity`, except that the units are memory-based (e.g., `Mi`, `Gi`) rather than CPU millicores (`m`).
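
The two calculation modes described above can be sketched in Go as follows. This is a minimal illustration, not the VPA implementation: the `Ratio` struct and `LimitFor` function are hypothetical names, amounts are plain integers in base units (millicores or bytes) rather than `resource.Quantity`, and rounding to the nearest unit is an assumption the proposal does not specify.

```go
package main

import (
	"fmt"
	"math"
)

// RatioType mirrors the proposal's Type values; the Go names are illustrative.
type RatioType string

const (
	TypeFactor   RatioType = "Factor"
	TypeQuantity RatioType = "Quantity"
)

// Ratio is a hypothetical, simplified stand-in for RequestToLimitRatio.CPU
// or RequestToLimitRatio.Memory. Amounts are integers in base units
// (millicores for CPU, bytes for memory) to avoid external dependencies.
type Ratio struct {
	Type     RatioType // empty is treated as Factor, the proposed default
	Factor   float64   // used when Type is Factor
	Quantity int64     // used when Type is Quantity
}

// LimitFor derives a limit from a recommended request.
func LimitFor(request int64, r Ratio) int64 {
	if r.Type == TypeQuantity {
		// Quantity: add a fixed buffer on top of the request.
		return request + r.Quantity
	}
	// Factor (or unset Type): multiply the request.
	// Rounding to the nearest unit is an assumption of this sketch.
	return int64(math.Round(float64(request) * r.Factor))
}

func main() {
	// CPU request of 500m with the default type and a factor of 2.
	fmt.Println(LimitFor(500, Ratio{Factor: 2}))
	// CPU request of 1000m with an explicit Factor type and a factor of 1.2.
	fmt.Println(LimitFor(1000, Ratio{Type: TypeFactor, Factor: 1.2}))
	// Memory request of 256Mi plus a 200Mi buffer, in bytes.
	fmt.Println(LimitFor(256<<20, Ratio{Type: TypeQuantity, Quantity: 200 << 20}))
}
```

Leaving `Type` empty falls through to the `Factor` branch, which matches the proposed default.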
7882

7983
### Behavior
8084

@@ -101,10 +105,8 @@ The behavior after implementing this feature is as follows:
101105
* **Recreate mode**: When a new request-to-limit ratio is set, the ratio is applied only on Pod creation, after the Updater evicts the running Pod. In this mode, updating the request-to-limit ratio on a running Pod will affect the limits only after the Pod is evicted (either by the Updater or manually, e.g. via `kubectl delete pod`) when the current `resources.requests` differ significantly from the new recommendation.
102106
* **InPlaceOrRecreate mode** (alpha in v1.4.0): When a new request-to-limit ratio is set, the VPA Updater will attempt in-place updates using the `/resize` subresource to modify `Pod.Spec.Containers[i].Resources.limits` or `Pod.Spec.Containers[i].Resources.requests` in certain situations. If the in-place update fails, it falls back to evicting the Pod and performing a recreation. For more details, see the [In-Place Updates documentation](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/docs/features.md#in-place-updates-inplaceorrecreate).
103107
* **Initial mode**: VPA updates the request-to-limit ratio only during Pod creation and does not change it later.
104-
2. If the `RequestToLimitRatio` feature gate is disabled, the request-to-limit ratio already set in the workload API (e.g. Deployment API) is used.
105-
106-
> [!IMPORTANT]
107-
> This new feature can be used together with other features in development, such as the [fixed memory-per-CPU ratio feature](https://github.com/kubernetes/autoscaler/pull/8459).
108+
2. If the `RequestToLimitRatio` feature gate is enabled and a user does not specify the sub-field `RequestToLimitRatio` on a VPA object, the request-to-limit ratio already set in the workload API (e.g. Deployment API) is used.
109+
3. If the `RequestToLimitRatio` feature gate is disabled, the request-to-limit ratio already set in the workload API (e.g. Deployment API) is used.
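
The three cases above reduce to one decision about where the ratio comes from. The Go sketch below illustrates that decision under simplifying assumptions: `CustomRatio` and `Limit` are hypothetical names, only the `Factor` type is modelled, amounts are millicores, and the fallback branch approximates the existing behavior by keeping the limit/request proportion from the workload spec.

```go
package main

import (
	"fmt"
	"math"
)

// CustomRatio is a hypothetical, simplified form of the proposed field
// (Factor only, to keep the sketch short).
type CustomRatio struct {
	Factor float64
}

// Limit returns the limit for a recommended request, in millicores.
// workloadRequest and workloadLimit are the values from the workload API
// (e.g. the Deployment's Pod template).
func Limit(gateEnabled bool, custom *CustomRatio, workloadRequest, workloadLimit, recommended int64) int64 {
	// Case 1: gate enabled and a custom ratio set on the VPA object.
	if gateEnabled && custom != nil {
		return int64(math.Round(float64(recommended) * custom.Factor))
	}
	// Cases 2 and 3: fall back to the ratio already set in the workload API.
	if workloadRequest == 0 || workloadLimit == 0 {
		// Sketch simplification: no ratio can be derived, so report no limit.
		return 0
	}
	ratio := float64(workloadLimit) / float64(workloadRequest)
	return int64(math.Round(float64(recommended) * ratio))
}

func main() {
	// Workload spec: request 100m, limit 200m. New recommendation: 300m.
	fmt.Println(Limit(true, &CustomRatio{Factor: 3}, 100, 200, 300))  // custom ratio applies
	fmt.Println(Limit(true, nil, 100, 200, 300))                      // field not set
	fmt.Println(Limit(false, &CustomRatio{Factor: 3}, 100, 200, 300)) // gate disabled
}
```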
108110

109111
### Validation
110112

@@ -114,12 +116,13 @@ The behavior after implementing this feature is as follows:
114116

115117
* The `RequestToLimitRatio` configuration will be validated when VPA CRD objects are created or updated. For example:
116118
* If `Type` is `Factor`, the value must be greater than or equal to 1 (enforced via CRD validation rules).
117-
* If `Type` is `QuantityPercentage`, the value must be greater than or equal to 1 (enforced via CRD validation rules).
118119

119120
#### Dynamic Validation via Admission Controller
120121

121-
* When using the new `RequestToLimitRatio` field, the `controlledValues` field must be set to `RequestsAndLimits`. It does not make sense to specify `RequestToLimitRatio` if VPA is not allowed to update limits. This requirement is enforced by the admission controller.
122-
* If `Type` is set to `QuantityValue`, then its `Value` will be validated.
122+
* When using the new `RequestToLimitRatio` field, the `controlledValues` field must be set to `RequestsAndLimits`. It does not make sense to specify `RequestToLimitRatio` if VPA is not allowed to update limits. This requirement is enforced by the admission controller.
123+
* `RequestToLimitRatio` must not be set for any resource that is not listed in `controlledResources`. For example, to set a custom ratio for CPU, the `controlledResources` field must include `cpu`.
124+
* If `Type` is set to `Quantity`, then its value will be validated using the [ParseQuantity](https://github.com/kubernetes/apimachinery/blob/v0.34.1/pkg/api/resource/quantity.go#L277) function from `apimachinery`.
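
The validation rules above can be illustrated with a small Go sketch. It is not the admission controller's code: `ResourceRatio` and `ValidateRatio` are hypothetical names, the quantity check is reduced to a non-empty test (the proposal uses `ParseQuantity` from `apimachinery` instead), and requiring `factor` when the type is `Factor` is an assumption of this sketch. Defaulting of an unset `controlledResources` list is not modelled.

```go
package main

import (
	"errors"
	"fmt"
)

// ResourceRatio is a hypothetical, simplified view of one resource's
// RequestToLimitRatio entry. Pointers distinguish "unset" from a zero value.
type ResourceRatio struct {
	Type     string   // "", "Factor" or "Quantity"; "" defaults to Factor
	Factor   *float64 // nil when unset
	Quantity *string  // nil when unset
}

// ValidateRatio applies the rules listed in the Validation section to the
// ratio configured for a single resource ("cpu" or "memory").
func ValidateRatio(resource string, r ResourceRatio, controlledValues string, controlledResources []string) error {
	if controlledValues != "RequestsAndLimits" {
		return errors.New("controlledValues must be RequestsAndLimits")
	}
	listed := false
	for _, c := range controlledResources {
		if c == resource {
			listed = true
			break
		}
	}
	if !listed {
		return fmt.Errorf("%s is not listed in controlledResources", resource)
	}
	switch r.Type {
	case "", "Factor":
		if r.Quantity != nil {
			return errors.New("quantity is not allowed when type is Factor")
		}
		if r.Factor == nil {
			// Assumption of this sketch: a Factor ratio needs a factor value.
			return errors.New("factor must be set when type is Factor")
		}
		if *r.Factor < 1 {
			return errors.New("factor must be greater than or equal to 1")
		}
	case "Quantity":
		if r.Factor != nil {
			return errors.New("factor is not allowed when type is Quantity")
		}
		// Stand-in for ParseQuantity: only checks that a value is present.
		if r.Quantity == nil || *r.Quantity == "" {
			return errors.New("quantity must be set when type is Quantity")
		}
	default:
		return fmt.Errorf("unknown type %q", r.Type)
	}
	return nil
}

func main() {
	factor := 2.0
	quantity := "200Mi"
	controlled := []string{"cpu", "memory"}

	fmt.Println(ValidateRatio("cpu", ResourceRatio{Factor: &factor}, "RequestsAndLimits", controlled))
	fmt.Println(ValidateRatio("memory", ResourceRatio{Type: "Quantity", Quantity: &quantity}, "RequestsAndLimits", controlled))
	fmt.Println(ValidateRatio("cpu", ResourceRatio{Factor: &factor}, "RequestsOnly", controlled))
}
```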
125+
123126

124127
### Feature Enablement and Rollback
125128

@@ -133,7 +136,13 @@ The behavior after implementing this feature is as follows:
133136
#### When Enabled
134137

135138
* The admission controller will **accept** new VPA objects that include a configured `RequestToLimitRatio`.
136-
* For containers targeted by a VPA object using `RequestToLimitRatio`, the admission controller and/or the updater will enforce the configured ratio.
139+
* For containers targeted by a VPA object using `RequestToLimitRatio`, the admission controller and/or the updater will enforce the configured ratio. Here are some examples of how this may happen:
140+
* **From default to a specific ratio**: This occurs when we have running Pods targeted by a VPA object that does not define `RequestToLimitRatio`. In this case, the Pods use the default ratio derived from the workload API (e.g. Deployment). Once we specify a custom ratio using the `RequestToLimitRatio` field, the new ratio is not applied immediately, as the updater still relies on its current behavior to decide when to evict Pods or perform in-place updates. With the `InPlaceOrRecreate` mode, two possibilities exist:
141+
1. If the new ratio does **not** change the QoS class, the updater will attempt to apply the new ratio using an in-place update. If the in-place update cannot be completed in time, it will evict the Pod to force the change.
142+
2. If the new ratio **does** change the QoS class, the updater will evict the Pod, since the QoS class field is immutable and in-place updates are not possible.
143+
* **From one ratio to another**:
144+
In this case, the default ratio defined in the workload API is ignored, and the ratio specified in the `RequestToLimitRatio` field is enforced. The same logic from the first example applies (see points 1 and 2 above).
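
The QoS-based choice between an in-place update and an eviction can be sketched as below. This is an illustration under stated assumptions, not the updater's logic: `Resources`, `QoSClass`, and `FirstAction` are hypothetical names, the Pod is reduced to a single container with CPU and memory only, and the fallback to eviction after a failed or timed-out in-place update is not modelled.

```go
package main

import "fmt"

// Resources holds one container's CPU and memory settings in base units.
// A value of 0 means "not set".
type Resources struct {
	CPURequest, CPULimit int64
	MemRequest, MemLimit int64
}

// effectiveRequest mirrors the Kubernetes defaulting rule that an unset
// request takes the value of the limit.
func effectiveRequest(request, limit int64) int64 {
	if request == 0 {
		return limit
	}
	return request
}

// QoSClass classifies a single-container Pod (simplified).
func QoSClass(r Resources) string {
	if r.CPURequest == 0 && r.CPULimit == 0 && r.MemRequest == 0 && r.MemLimit == 0 {
		return "BestEffort"
	}
	guaranteed := r.CPULimit > 0 && r.MemLimit > 0 &&
		effectiveRequest(r.CPURequest, r.CPULimit) == r.CPULimit &&
		effectiveRequest(r.MemRequest, r.MemLimit) == r.MemLimit
	if guaranteed {
		return "Guaranteed"
	}
	return "Burstable"
}

// FirstAction returns what the updater would try first in
// InPlaceOrRecreate mode when moving from current to desired resources.
func FirstAction(current, desired Resources) string {
	if QoSClass(current) != QoSClass(desired) {
		// The QoS class is immutable, so an in-place resize is not possible.
		return "evict"
	}
	return "in-place"
}

func main() {
	// Running Pod with a 1:2 ratio (Burstable).
	current := Resources{CPURequest: 500, CPULimit: 1000, MemRequest: 256, MemLimit: 512}
	// New ratio: factor 3, still Burstable.
	factor3 := Resources{CPURequest: 500, CPULimit: 1500, MemRequest: 256, MemLimit: 768}
	// New ratio: factor 1, limits equal requests, so the class becomes Guaranteed.
	factor1 := Resources{CPURequest: 500, CPULimit: 500, MemRequest: 256, MemLimit: 256}

	fmt.Println(FirstAction(current, factor3))
	fmt.Println(FirstAction(current, factor1))
}
```

A factor of exactly `1` is the case most likely to change the QoS class, because it makes limits equal to requests.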
145+
137146

138147
#### When Disabled
139148

@@ -149,8 +158,12 @@ The behavior after implementing this feature is as follows:
149158
### Test Plan
150159

151160
* Implement comprehensive unit tests to cover all new functionality.
152-
* e2e tests: TODO
153161

162+
#### E2E
163+
164+
* e2e tests with `InPlaceOrRecreate` VPA mode:
165+
1. Add a test case where the QoS class **changes**. In this scenario, we expect the updater to evict the affected Pods, since the QoS class is immutable. The resulting limits are verified.
166+
2. Add a test case where the QoS class **does not change**. In this scenario the updater should apply the new ratio using the in-place update mechanism. The resulting limits are verified.
154167

155168
### Examples
156169

@@ -179,14 +192,14 @@ spec:
179192
controlledValues: RequestsAndLimits
180193
RequestToLimitRatio:
181194
  cpu:
182-
    Type: Factor
183-
    Value: 2
195+
    type: Factor # this field is optional, if omitted it defaults to "Factor"
196+
    factor: 2
184197
  memory:
185-
    Type: QuantityValue
186-
    Value: 200Mi
198+
    type: Quantity
199+
    quantity: 200Mi
187200
```
188201
189-
In the manifest below, we configure VPA to control only the CPU resource's requests and limits for the container named `app`. The CPU limit is calculated by increasing the recommended CPU request by 30%.
202+
In the manifest below, we configure VPA to control only the CPU resource's requests and limits for the container named `app`. The CPU limit is calculated by increasing the recommended CPU request by 20% (i.e. `recommended request × 1.2`).
190203

191204
```yaml
192205
apiVersion: autoscaling.k8s.io/v1
@@ -207,10 +220,11 @@ spec:
207220
controlledValues: RequestsAndLimits
208221
RequestToLimitRatio:
209222
  cpu:
210-
    Type: QuantityPercentage
211-
    Value: 30
223+
    type: Factor # this field is optional, if omitted it defaults to "Factor"
224+
    factor: 1.2
212225
```
213226

214227
## Implementation History
215228

216-
* 2025-09-10: Initial proposal created.
229+
* 2025-09-10: Initial proposal created.
230+
* 2025-09-18: Update API for consistency. Add e2e tests and other small updates.
