Commit 2b33c4c

Merge pull request #8115 from maxcao13/kubernetes-in-place-updates
VPA: Implement in-place updates support
2 parents 2ca7513 + 3039f3c commit 2b33c4c

48 files changed, with 3107 additions and 649 deletions

vertical-pod-autoscaler/deploy/admission-controller-deployment.yaml

Lines changed: 0 additions & 12 deletions
@@ -47,15 +47,3 @@ spec:
       - name: tls-certs
         secret:
           secretName: vpa-tls-certs
----
-apiVersion: v1
-kind: Service
-metadata:
-  name: vpa-webhook
-  namespace: kube-system
-spec:
-  ports:
-    - port: 443
-      targetPort: 8000
-  selector:
-    app: vpa-admission-controller
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+apiVersion: v1
+kind: Service
+metadata:
+  name: vpa-webhook
+  namespace: kube-system
+spec:
+  ports:
+    - port: 443
+      targetPort: 8000
+  selector:
+    app: vpa-admission-controller

vertical-pod-autoscaler/deploy/vpa-rbac.yaml

Lines changed: 26 additions & 0 deletions
@@ -124,6 +124,32 @@ rules:
       - create
 ---
 apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRole
+metadata:
+  name: system:vpa-updater-in-place
+rules:
+  - apiGroups:
+      - ""
+    resources:
+      - pods/resize
+      - pods # required for patching vpaInPlaceUpdated annotations onto the pod
+    verbs:
+      - patch
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRoleBinding
+metadata:
+  name: system:vpa-updater-in-place-binding
+roleRef:
+  apiGroup: rbac.authorization.k8s.io
+  kind: ClusterRole
+  name: system:vpa-updater-in-place
+subjects:
+  - kind: ServiceAccount
+    name: vpa-updater
+    namespace: kube-system
+---
+apiVersion: rbac.authorization.k8s.io/v1
 kind: ClusterRoleBinding
 metadata:
   name: system:metrics-reader

vertical-pod-autoscaler/deploy/vpa-v1-crd-gen.yaml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -458,6 +458,7 @@ spec:
458458
- "Off"
459459
- Initial
460460
- Recreate
461+
- InPlaceOrRecreate
461462
- Auto
462463
type: string
463464
type: object

vertical-pod-autoscaler/docs/api.md

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -155,7 +155,7 @@ _Appears in:_
155155

156156
| Field | Description | Default | Validation |
157157
| --- | --- | --- | --- |
158-
| `updateMode` _[UpdateMode](#updatemode)_ | Controls when autoscaler applies changes to the pod resources.<br />The default is 'Auto'. | | Enum: [Off Initial Recreate Auto] <br /> |
158+
| `updateMode` _[UpdateMode](#updatemode)_ | Controls when autoscaler applies changes to the pod resources.<br />The default is 'Auto'. | | Enum: [Off Initial Recreate InPlaceOrRecreate Auto] <br /> |
159159
| `minReplicas` _integer_ | Minimal number of replicas which need to be alive for Updater to attempt<br />pod eviction (pending other checks like PDB). Only positive values are<br />allowed. Overrides global '--min-replicas' flag. | | |
160160
| `evictionRequirements` _[EvictionRequirement](#evictionrequirement) array_ | EvictionRequirements is a list of EvictionRequirements that need to<br />evaluate to true in order for a Pod to be evicted. If more than one<br />EvictionRequirement is specified, all of them need to be fulfilled to allow eviction. | | |
161161

@@ -208,7 +208,7 @@ _Underlying type:_ _string_
208208
UpdateMode controls when autoscaler applies changes to the pod resources.
209209

210210
_Validation:_
211-
- Enum: [Off Initial Recreate Auto]
211+
- Enum: [Off Initial Recreate InPlaceOrRecreate Auto]
212212

213213
_Appears in:_
214214
- [PodUpdatePolicy](#podupdatepolicy)
@@ -218,7 +218,8 @@ _Appears in:_
218218
| `Off` | UpdateModeOff means that autoscaler never changes Pod resources.<br />The recommender still sets the recommended resources in the<br />VerticalPodAutoscaler object. This can be used for a "dry run".<br /> |
219219
| `Initial` | UpdateModeInitial means that autoscaler only assigns resources on pod<br />creation and does not change them during the lifetime of the pod.<br /> |
220220
| `Recreate` | UpdateModeRecreate means that autoscaler assigns resources on pod<br />creation and additionally can update them during the lifetime of the<br />pod by deleting and recreating the pod.<br /> |
221-
| `Auto` | UpdateModeAuto means that autoscaler assigns resources on pod creation<br />and additionally can update them during the lifetime of the pod,<br />using any available update method. Currently this is equivalent to<br />Recreate, which is the only available update method.<br /> |
221+
| `Auto` | UpdateModeAuto means that autoscaler assigns resources on pod creation<br />and additionally can update them during the lifetime of the pod,<br />using any available update method. Currently this is equivalent to<br />Recreate.<br /> |
222+
| `InPlaceOrRecreate` | UpdateModeInPlaceOrRecreate means that autoscaler tries to assign resources in-place.<br />If this is not possible (e.g., resizing takes too long or is infeasible), it falls back to the<br />"Recreate" update mode.<br />Requires VPA level feature gate "InPlaceOrRecreate" to be enabled<br />on the admission and updater pods.<br />Requires cluster feature gate "InPlacePodVerticalScaling" to be enabled.<br /> |
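
As a rough illustration of the enum documented in this hunk, the Go sketch below mirrors the five `UpdateMode` values and gates `InPlaceOrRecreate` behind a feature flag. The constant names follow the descriptions in the table. `validateUpdateMode` and its `inPlaceGateEnabled` parameter are hypothetical helpers introduced for this example; they are not the admission controller's actual validation code.

```go
package main

import (
	"fmt"
)

// UpdateMode mirrors the enum documented in the table above.
type UpdateMode string

const (
	UpdateModeOff               UpdateMode = "Off"
	UpdateModeInitial           UpdateMode = "Initial"
	UpdateModeRecreate          UpdateMode = "Recreate"
	UpdateModeInPlaceOrRecreate UpdateMode = "InPlaceOrRecreate"
	UpdateModeAuto              UpdateMode = "Auto"
)

// validateUpdateMode is a hypothetical helper (not VPA code). It accepts the
// documented modes and rejects InPlaceOrRecreate unless the feature gate is on.
func validateUpdateMode(mode UpdateMode, inPlaceGateEnabled bool) error {
	switch mode {
	case UpdateModeOff, UpdateModeInitial, UpdateModeRecreate, UpdateModeAuto:
		return nil
	case UpdateModeInPlaceOrRecreate:
		if !inPlaceGateEnabled {
			return fmt.Errorf("update mode %q requires the InPlaceOrRecreate feature gate", mode)
		}
		return nil
	default:
		return fmt.Errorf("unknown update mode %q", mode)
	}
}

func main() {
	// With the gate off the mode is rejected; with the gate on it is accepted.
	fmt.Println(validateUpdateMode(UpdateModeInPlaceOrRecreate, false))
	fmt.Println(validateUpdateMode(UpdateModeInPlaceOrRecreate, true))
}
```

Rejecting the mode when the gate is off is one possible design; it surfaces a misconfiguration when the VPA object is created rather than later, when the updater runs.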
222223

223224

224225
#### VerticalPodAutoscaler

vertical-pod-autoscaler/docs/features.md

Lines changed: 77 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4,6 +4,8 @@
44

55
- [Limits control](#limits-control)
66
- [Memory Value Humanization](#memory-value-humanization)
7+
- [CPU Recommendation Rounding](#cpu-recommendation-rounding)
8+
- [In-Place Updates](#in-place-updates-inplaceorrecreate)
79

810
## Limits control
911

@@ -50,4 +52,78 @@ To enable this feature, set the --round-cpu-millicores flag when running the VPA
5052

5153
```bash
5254
--round-cpu-millicores=50
53-
```
55+
```
56+
57+
## In-Place Updates (`InPlaceOrRecreate`)
58+
59+
> [!WARNING]
60+
> FEATURE STATE: VPA v1.4.0 [alpha]
61+
62+
VPA supports in-place updates to reduce disruption when applying resource recommendations. This feature relies on Kubernetes' in-place pod resize capability (beta as of Kubernetes 1.33) to modify container resources without recreating the pod.
63+
For more information, see [AEP-4016: Support for in place updates in VPA](https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler/enhancements/4016-in-place-updates-support)
64+
65+
### Usage
66+
67+
To use in-place updates, set the VPA's `updateMode` to `InPlaceOrRecreate`:
68+
```yaml
69+
apiVersion: autoscaling.k8s.io/v1
70+
kind: VerticalPodAutoscaler
71+
metadata:
72+
name: my-vpa
73+
spec:
74+
updatePolicy:
75+
updateMode: "InPlaceOrRecreate"
76+
```
77+
78+
### Behavior
79+
80+
In `InPlaceOrRecreate` mode, VPA first attempts to apply updates in-place. If the in-place update fails, VPA falls back to pod recreation.
81+
Updates are attempted when:
82+
* Container requests are outside the recommended bounds
83+
* Quick OOM occurs
84+
* For long-running pods (>12h), when recommendations differ significantly (>10%)
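
The three trigger conditions listed above can be sketched as a single predicate. This is a minimal Go illustration, not the updater's real logic: `containerState` and `shouldAttemptUpdate` are hypothetical names, and measuring the 10% difference relative to the current request is an assumption made for this example.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// Thresholds taken from the documented behavior above.
const (
	longRunningThreshold = 12 * time.Hour
	significantChange    = 0.10
)

// containerState is a hypothetical summary of one container's request and the
// matching VPA recommendation (for example, CPU in millicores).
type containerState struct {
	Request    float64
	LowerBound float64
	UpperBound float64
	Target     float64
	QuickOOM   bool
	PodAge     time.Duration
}

// shouldAttemptUpdate returns true when any documented trigger applies.
func shouldAttemptUpdate(c containerState) bool {
	// 1. The request is outside the recommended bounds.
	if c.Request < c.LowerBound || c.Request > c.UpperBound {
		return true
	}
	// 2. The container was OOM-killed shortly after starting.
	if c.QuickOOM {
		return true
	}
	// 3. Long-running pod whose target differs significantly from its request.
	// Assumption: the difference is measured relative to the current request.
	if c.PodAge > longRunningThreshold && c.Request > 0 {
		relativeDiff := math.Abs(c.Target-c.Request) / c.Request
		return relativeDiff > significantChange
	}
	return false
}

func main() {
	c := containerState{Request: 1000, LowerBound: 800, UpperBound: 1500, Target: 1200, PodAge: 13 * time.Hour}
	fmt.Println(shouldAttemptUpdate(c))
}
```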
85+
86+
### Important Notes
87+
88+
* Disruption Possibility: While in-place updates aim to minimize disruption, they cannot guarantee zero disruption as the container runtime is responsible for the actual resize operation.
89+
90+
* Memory Limit Downscaling: In the beta version of the Kubernetes feature, memory limit downscaling is not supported for pods with `resizePolicy: PreferNoRestart`. In such cases, VPA will fall back to pod recreation.
91+
92+
### Requirements
93+
94+
* Kubernetes 1.33+ with `InPlacePodVerticalScaling` feature gate enabled
95+
* VPA version 1.4.0+ with `InPlaceOrRecreate` feature gate enabled
96+
97+
### Configuration
98+
99+
Enable the feature by setting the following flag on both the updater and the admission-controller:
100+
101+
```bash
102+
--feature-gates=InPlaceOrRecreate=true
103+
```
104+
105+
### Limitations
106+
107+
* All containers in a pod are updated together (partial updates not supported)
108+
* Memory downscaling requires careful consideration to prevent OOMs
109+
* Updates still respect VPA's standard update conditions and timing restrictions
110+
* In-place updates will fail if they would result in a change to the pod's QoS class
111+
112+
### Fallback Behavior
113+
114+
VPA will fall back to pod recreation in the following scenarios:
115+
116+
* In-place update is [infeasible](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/1287-in-place-update-pod-resources/README.md#resize-status) (node resources, etc.)
117+
* Update is [deferred](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/1287-in-place-update-pod-resources/README.md#resize-status) for more than 5 minutes
118+
* Update is in progress for more than 1 hour
119+
* [Pod QoS](https://kubernetes.io/docs/concepts/workloads/pods/pod-qos/) class would change due to the update
120+
* Memory limit downscaling is required with [PreferNoRestart policy](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/1287-in-place-update-pod-resources/README.md#container-resize-policy)
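
The fallback scenarios listed above can be read as a single decision function. The Go sketch below is illustrative only: `resizeAttempt`, `ResizeState`, and `shouldFallBackToRecreate` are hypothetical names, and only the 5-minute and 1-hour timeouts come from the text.

```go
package main

import (
	"fmt"
	"time"
)

// ResizeState is a hypothetical, simplified view of a pod's resize status.
type ResizeState string

const (
	ResizeInfeasible ResizeState = "Infeasible"
	ResizeDeferred   ResizeState = "Deferred"
	ResizeInProgress ResizeState = "InProgress"
)

// Timeouts taken from the documented fallback rules.
const (
	deferredTimeout   = 5 * time.Minute
	inProgressTimeout = 1 * time.Hour
)

// resizeAttempt is an illustrative struct; it is not a VPA type.
type resizeAttempt struct {
	State                 ResizeState
	Elapsed               time.Duration // time since the resize was requested
	QoSClassWouldChange   bool
	MemoryLimitDownscaled bool // memory limit decrease under PreferNoRestart
}

// shouldFallBackToRecreate reports whether the documented rules call for
// recreating the pod instead of waiting for the in-place resize.
func shouldFallBackToRecreate(a resizeAttempt) bool {
	switch {
	case a.QoSClassWouldChange:
		return true
	case a.MemoryLimitDownscaled:
		return true
	case a.State == ResizeInfeasible:
		return true
	case a.State == ResizeDeferred && a.Elapsed > deferredTimeout:
		return true
	case a.State == ResizeInProgress && a.Elapsed > inProgressTimeout:
		return true
	}
	return false
}

func main() {
	a := resizeAttempt{State: ResizeDeferred, Elapsed: 6 * time.Minute}
	fmt.Println(shouldFallBackToRecreate(a))
}
```

Keeping the two timeouts as named constants makes it easy to see that a deferred resize is abandoned much sooner than one that is already in progress.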
121+
122+
### Monitoring
123+
124+
VPA provides metrics to track in-place update operations:
125+
126+
* `vpa_in_place_updatable_pods_total`: Number of pods matching in-place update criteria
127+
* `vpa_in_place_updated_pods_total`: Number of pods successfully updated in-place
128+
* `vpa_vpas_with_in_place_updatable_pods_total`: Number of VPAs with pods eligible for in-place updates
129+
* `vpa_vpas_with_in_place_updated_pods_total`: Number of VPAs with successfully in-place updated pods

vertical-pod-autoscaler/docs/flags.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -14,6 +14,7 @@ This document is auto-generated from the flag definitions in the VPA admission-c
1414
| `--address` | ":8944" | The address to expose Prometheus metrics. |
1515
| `--alsologtostderr` | | log to standard error as well as files (no effect when -logtostderr=true) |
1616
| `--client-ca-file` | "/etc/tls-certs/caCert.pem" | Path to CA PEM file. |
17+
| `--feature-gates` | | A set of key=value pairs that describe feature gates for alpha/experimental features. Options are: |
1718
| `--ignored-vpa-object-namespaces` | | A comma-separated list of namespaces to ignore when searching for VPA objects. Leave empty to avoid ignoring any namespaces. These namespaces will not be cleaned by the garbage collector. |
1819
| `--kube-api-burst` | 10 | QPS burst limit when making requests to Kubernetes apiserver |
1920
| `--kube-api-qps` | 5 | QPS limit when making requests to Kubernetes apiserver |
@@ -67,6 +68,7 @@ This document is auto-generated from the flag definitions in the VPA recommender
6768
| `--cpu-integer-post-processor-enabled` | | Enable the cpu-integer recommendation post processor. The post processor will round up CPU recommendations to a whole CPU for pods which were opted in by setting an appropriate label on VPA object (experimental) |
6869
| `--external-metrics-cpu-metric` | | ALPHA. Metric to use with external metrics provider for CPU usage. |
6970
| `--external-metrics-memory-metric` | | ALPHA. Metric to use with external metrics provider for memory usage. |
71+
| `--feature-gates` | | A set of key=value pairs that describe feature gates for alpha/experimental features. Options are: |
7072
| `--history-length` | "8d" | How much time back prometheus have to be queried to get historical metrics |
7173
| `--history-resolution` | "1h" | Resolution at which Prometheus is queried for historical metrics |
7274
| `--humanize-memory` | | Convert memory values in recommendations to the highest appropriate SI unit with up to 2 decimal places for better readability. |
@@ -137,6 +139,7 @@ This document is auto-generated from the flag definitions in the VPA updater cod
137139
| `--eviction-rate-burst` | 1 | Burst of pods that can be evicted. |
138140
| `--eviction-rate-limit` | | Number of pods that can be evicted per second. A rate limit set to 0 or -1 will disable |
139141
| `--eviction-tolerance` | 0.5 | Fraction of replica count that can be evicted for update, if more than one pod can be evicted. |
142+
| `--feature-gates` | | A set of key=value pairs that describe feature gates for alpha/experimental features. Options are: |
140143
| `--ignored-vpa-object-namespaces` | | A comma-separated list of namespaces to ignore when searching for VPA objects. Leave empty to avoid ignoring any namespaces. These namespaces will not be cleaned by the garbage collector. |
141144
| `--in-recommendation-bounds-eviction-lifetime-threshold` | 12h0m0s | Pods that live for at least that long can be evicted even if their request is within the [MinRecommended...MaxRecommended] range |
142145
| `--kube-api-burst` | 10 | QPS burst limit when making requests to Kubernetes apiserver |

vertical-pod-autoscaler/docs/installation.md

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -138,6 +138,16 @@ To print YAML contents with all resources that would be understood by
138138
The output of that command won't include secret information generated by
139139
[pkg/admission-controller/gencerts.sh](https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler/pkg/admission-controller/gencerts.sh) script.
140140

141+
### Feature gates
142+
143+
To install VPA with feature gates enabled, set the `FEATURE_GATES` environment variable.
144+
145+
For example, to enable the `InPlaceOrRecreate` feature gate:
146+
147+
```console
148+
FEATURE_GATES="InPlaceOrRecreate=true" ./hack/vpa-up.sh
149+
```
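
To make the `key=value` format of `FEATURE_GATES` concrete, here is a minimal Go sketch that parses such a string into a map. `parseFeatureGates` is a hypothetical helper written for this example; the VPA components use their own feature-gate handling, which this does not reproduce.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseFeatureGates is a hypothetical helper (not VPA code) that turns a
// comma-separated list such as "InPlaceOrRecreate=true" into a map.
func parseFeatureGates(s string) (map[string]bool, error) {
	gates := map[string]bool{}
	for _, pair := range strings.Split(s, ",") {
		pair = strings.TrimSpace(pair)
		if pair == "" {
			continue // tolerate empty input and trailing commas
		}
		name, value, ok := strings.Cut(pair, "=")
		if !ok {
			return nil, fmt.Errorf("missing '=' in %q", pair)
		}
		enabled, err := strconv.ParseBool(strings.TrimSpace(value))
		if err != nil {
			return nil, fmt.Errorf("invalid value for %q: %w", name, err)
		}
		gates[strings.TrimSpace(name)] = enabled
	}
	return gates, nil
}

func main() {
	// Reads the same environment variable that the install command sets.
	gates, err := parseFeatureGates(os.Getenv("FEATURE_GATES"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("InPlaceOrRecreate enabled:", gates["InPlaceOrRecreate"])
}
```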
150+
141151
## Tear down
142152

143153
Note that if you stop running VPA in your cluster, the resource requests
