
Commit edfda70

Merge pull request #7560 from ialidzhikov/enh/global-max-allowed-flags-post-processor

vpa-recommender: Add support for configuring global max allowed resources

2 parents b50491c + bab77cd

File tree: 7 files changed, +367 −64 lines

vertical-pod-autoscaler/docs/examples.md (14 additions, 0 deletions)

@@ -11,6 +11,7 @@
 - [Controlling eviction behavior based on scaling direction and resource](#controlling-eviction-behavior-based-on-scaling-direction-and-resource)
 - [Limiting which namespaces are used](#limiting-which-namespaces-are-used)
 - [Setting the webhook failurePolicy](#setting-the-webhook-failurepolicy)
+- [Specifying global maximum allowed resources to prevent pods from being unschedulable](#specifying-global-maximum-allowed-resources-to-prevent-pods-from-being-unschedulable)

 ## Keeping limit proportional to request

@@ -108,3 +109,16 @@ These options cannot be used together and are mutually exclusive.
 It is possible to set the failurePolicy of the webhook to `Fail` by passing `--webhook-failure-policy-fail=true` to the VPA admission controller.
 Please use this option with caution as it may be possible to break Pod creation if there is a failure with the VPA.
 Use it in conjunction with `--ignored-vpa-object-namespaces=kube-system` or `--vpa-object-namespace` to reduce risk.
+
+### Specifying global maximum allowed resources to prevent pods from being unschedulable
+
+The [Known limitations document](./known-limitations.md) outlines that VPA (the vpa-recommender in particular) is not aware of the cluster's maximum allocatable and can recommend resources that do not fit even the largest node in the cluster. This issue occurs even when the cluster uses the [Cluster Autoscaler](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#basics): the vpa-recommender's recommendation can exceed the allocatable of the largest node, so pods become unschedulable (stuck in `Pending`) and would not fit the cluster even if the Cluster Autoscaler adds a new node.
+This issue can be mitigated by specifying the `--container-recommendation-max-allowed-cpu` and `--container-recommendation-max-allowed-memory` flags of the vpa-recommender. These flags set the global maximum amount of CPU/memory that will be recommended **for a container**. If the VerticalPodAutoscaler already defines a max allowed (`.spec.resourcePolicy.containerPolicies.maxAllowed`), it takes precedence over the global max allowed. If the VerticalPodAutoscaler's max allowed specifies only CPU or only memory, the global max allowed is merged in for the missing resource. If the VerticalPodAutoscaler does not specify a max allowed at all, the global max allowed is used.
+
+To compute the `--container-recommendation-max-allowed-cpu` and `--container-recommendation-max-allowed-memory` values for your cluster, take the largest node's allocatable (the `.status.allocatable` field of the node), subtract the resource requests of DaemonSet pods, and subtract a safety margin:
+```
+<max allowed> = <largest node's allocatable> - <resource requests of DaemonSet pods> - <safety margin>
+```
+
+> [!WARNING]
+> Note that `--container-recommendation-max-allowed-cpu` and `--container-recommendation-max-allowed-memory` are **container-level** flags. A pod that enables autoscaling for more than one container can theoretically still become unschedulable if the sum of the containers' resource recommendations exceeds the largest Node's allocatable. In practice this is unlikely: usually a single container in a pod is the main one, and the others are sidecars that either do not need autoscaling or do not have high resource requests.
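To make the sizing formula in the added docs concrete, here is a minimal sketch of the arithmetic using `resource.Quantity` from `k8s.io/apimachinery` (the same quantity type the new flags accept). The node allocatable, DaemonSet requests, and safety margin below are invented example values, not numbers from this commit:

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

// computeMaxAllowed applies the formula from examples.md:
// <max allowed> = <largest node's allocatable> - <DaemonSet requests> - <safety margin>
func computeMaxAllowed(allocatable, daemonSetRequests, safetyMargin resource.Quantity) resource.Quantity {
	result := allocatable.DeepCopy()
	result.Sub(daemonSetRequests)
	result.Sub(safetyMargin)
	return result
}

func main() {
	// Hypothetical largest node: 62Gi allocatable memory, 15890m allocatable CPU.
	maxMemory := computeMaxAllowed(
		resource.MustParse("62Gi"), // largest node's .status.allocatable memory
		resource.MustParse("1Gi"),  // sum of DaemonSet pod memory requests
		resource.MustParse("1Gi"),  // safety margin
	)
	maxCPU := computeMaxAllowed(
		resource.MustParse("15890m"), // largest node's .status.allocatable cpu
		resource.MustParse("500m"),   // sum of DaemonSet pod cpu requests
		resource.MustParse("390m"),   // safety margin
	)
	// Prints the values you would pass to the vpa-recommender flags, i.e.
	// --container-recommendation-max-allowed-cpu=15
	// --container-recommendation-max-allowed-memory=60Gi
	fmt.Println(maxCPU.String(), maxMemory.String())
}
```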

vertical-pod-autoscaler/docs/faq.md (2 additions, 0 deletions)

@@ -215,6 +215,8 @@ Name | Type | Description | Default
 `container-pod-name-label` | String | Label name to look for container pod names | "pod_name"
 `container-name-label` | String | Label name to look for container names | "name"
 `vpa-object-namespace` | String | Namespace to search for VPA objects and pod stats. Empty means all namespaces will be used. | apiv1.NamespaceAll
+`container-recommendation-max-allowed-cpu` | Quantity | Maximum amount of CPU that will be recommended for a container. The VerticalPodAutoscaler-level maximum allowed takes precedence over the global maximum allowed. | Empty (no max allowed CPU by default)
+`container-recommendation-max-allowed-memory` | Quantity | Maximum amount of memory that will be recommended for a container. The VerticalPodAutoscaler-level maximum allowed takes precedence over the global maximum allowed. | Empty (no max allowed memory by default)
 `memory-aggregation-interval` | Duration | The length of a single interval, for which the peak memory usage is computed. Memory usage peaks are aggregated in multiples of this interval. In other words, there is one memory usage sample per interval (the maximum usage over that interval). | model.DefaultMemoryAggregationInterval
 `memory-aggregation-interval-count` | Int64 | The number of consecutive memory-aggregation-intervals which make up the MemoryAggregationWindowLength, which in turn is the period for memory usage aggregation by VPA. In other words, MemoryAggregationWindowLength = memory-aggregation-interval * memory-aggregation-interval-count. | model.DefaultMemoryAggregationIntervalCount
 `memory-histogram-decay-half-life` | Duration | The amount of time it takes a historical memory usage sample to lose half of its weight. In other words, a fresh usage sample is twice as 'important' as one with age equal to the half-life period. | model.DefaultMemoryHistogramDecayHalfLife

vertical-pod-autoscaler/docs/known-limitations.md (3 additions, 1 deletion)

@@ -21,7 +21,9 @@
 - VPA performance has not been tested in large clusters.
 - VPA recommendation might exceed available resources (e.g. Node size, available
   size, available quota) and cause **pods to go pending**. This can be partly
-  addressed by using VPA together with [Cluster Autoscaler](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#basics).
+  addressed by:
+  * using VPA together with [Cluster Autoscaler](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#basics) - the drawback of this approach is that pods can still become unschedulable if the recommendation exceeds the largest Node's allocatable.
+  * specifying the `--container-recommendation-max-allowed-cpu` and `--container-recommendation-max-allowed-memory` flags - the drawback of this approach is that a pod can still become unschedulable if more than one container in the pod is scaled by VPA and the sum of the container recommendations exceeds the largest Node's allocatable.
 - Multiple VPA resources matching the same pod have undefined behavior.
 - Running the vpa-recommender with leader election enabled (`--leader-elect=true`) in a GKE cluster
   causes contention with a lease called `vpa-recommender` held by the GKE system component of the

vertical-pod-autoscaler/pkg/recommender/main.go (22 additions, 1 deletion)

@@ -25,6 +25,7 @@ import (

 	"github.com/spf13/pflag"
 	apiv1 "k8s.io/api/core/v1"
+	"k8s.io/apimachinery/pkg/api/resource"
 	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
 	"k8s.io/apimachinery/pkg/util/uuid"
 	"k8s.io/client-go/informers"
@@ -103,6 +104,8 @@ var (
 var (
 	// CPU as integer to benefit from the CPU Manager static policy (https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/#static-policy)
 	postProcessorCPUasInteger = flag.Bool("cpu-integer-post-processor-enabled", false, "Enable the cpu-integer recommendation post processor. The post processor will round up CPU recommendations to a whole CPU for pods which were opted in by setting an appropriate label on VPA object (experimental)")
+	maxAllowedCPU             = resource.QuantityValue{}
+	maxAllowedMemory          = resource.QuantityValue{}
 )

 const (
@@ -115,6 +118,11 @@ const (
 	defaultResyncPeriod time.Duration = 10 * time.Minute
 )

+func init() {
+	flag.Var(&maxAllowedCPU, "container-recommendation-max-allowed-cpu", "Maximum amount of CPU that will be recommended for a container. VerticalPodAutoscaler-level maximum allowed takes precedence over the global maximum allowed.")
+	flag.Var(&maxAllowedMemory, "container-recommendation-max-allowed-memory", "Maximum amount of memory that will be recommended for a container. VerticalPodAutoscaler-level maximum allowed takes precedence over the global maximum allowed.")
+}
+
 func main() {
 	commonFlags := common.InitCommonFlags()
 	klog.InitFlags(nil)
@@ -221,8 +229,9 @@ func run(healthCheck *metrics.HealthCheck, commonFlag *common.CommonFlags) {
 		postProcessors = append(postProcessors, &routines.IntegerCPUPostProcessor{})
 	}

+	globalMaxAllowed := initGlobalMaxAllowed()
 	// The capping post-processor should always come last in the post-processing chain
-	postProcessors = append(postProcessors, &routines.CappingPostProcessor{})
+	postProcessors = append(postProcessors, routines.NewCappingRecommendationProcessor(globalMaxAllowed))
 	var source input_metrics.PodMetricsLister
 	if *useExternalMetrics {
 		resourceMetrics := map[apiv1.ResourceName]string{}
@@ -316,3 +325,15 @@ func run(healthCheck *metrics.HealthCheck, commonFlag *common.CommonFlags) {
 		healthCheck.UpdateLastActivity()
 	}
 }
+
+func initGlobalMaxAllowed() apiv1.ResourceList {
+	result := make(apiv1.ResourceList)
+	if !maxAllowedCPU.Quantity.IsZero() {
+		result[apiv1.ResourceCPU] = maxAllowedCPU.Quantity
+	}
+	if !maxAllowedMemory.Quantity.IsZero() {
+		result[apiv1.ResourceMemory] = maxAllowedMemory.Quantity
+	}
+
+	return result
+}
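As an aside on the flag wiring in `init()` above, `resource.QuantityValue` satisfies the standard `flag.Value` interface, so both flags accept any canonical Kubernetes quantity string. A minimal, self-contained sketch; the flag set and the parsed values here are hypothetical, only the flag names come from the commit:

```go
package main

import (
	"flag"
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

func main() {
	// Same pattern as the recommender's init(): register QuantityValue vars as flags.
	maxCPU := resource.QuantityValue{}
	maxMemory := resource.QuantityValue{}
	fs := flag.NewFlagSet("vpa-recommender-demo", flag.ExitOnError)
	fs.Var(&maxCPU, "container-recommendation-max-allowed-cpu", "global max allowed CPU")
	fs.Var(&maxMemory, "container-recommendation-max-allowed-memory", "global max allowed memory")

	// Simulated command line; any valid quantity notation works ("15", "15000m", "60Gi", ...).
	if err := fs.Parse([]string{
		"--container-recommendation-max-allowed-cpu=15",
		"--container-recommendation-max-allowed-memory=60Gi",
	}); err != nil {
		panic(err)
	}
	fmt.Println(maxCPU.Quantity.String(), maxMemory.Quantity.String()) // 15 60Gi
}
```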

vertical-pod-autoscaler/pkg/recommender/routines/capping_post_processor.go (16 additions, 6 deletions)

@@ -19,20 +19,30 @@ package routines
 import (
 	"k8s.io/klog/v2"

+	apiv1 "k8s.io/api/core/v1"
+
 	vpa_types "k8s.io/autoscaler/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1"
 	vpa_utils "k8s.io/autoscaler/vertical-pod-autoscaler/pkg/utils/vpa"
 )

-// CappingPostProcessor ensure that the policy is applied to recommendation
-// it applies policy for fields: MinAllowed and MaxAllowed
-type CappingPostProcessor struct{}
+type cappingPostProcessor struct {
+	globalMaxAllowed apiv1.ResourceList
+}
+
+var _ RecommendationPostProcessor = &cappingPostProcessor{}

-var _ RecommendationPostProcessor = &CappingPostProcessor{}
+// NewCappingRecommendationProcessor constructs a new RecommendationPostProcessor that adjusts the recommendation
+// for a given pod to obey the VPA resource policy and the global max allowed configuration.
+func NewCappingRecommendationProcessor(globalMaxAllowed apiv1.ResourceList) RecommendationPostProcessor {
+	return &cappingPostProcessor{
+		globalMaxAllowed: globalMaxAllowed,
+	}
+}

 // Process applies the capping post-processing to the recommendation. (used to be the function getCappedRecommendation)
-func (c CappingPostProcessor) Process(vpa *vpa_types.VerticalPodAutoscaler, recommendation *vpa_types.RecommendedPodResources) *vpa_types.RecommendedPodResources {
+func (c cappingPostProcessor) Process(vpa *vpa_types.VerticalPodAutoscaler, recommendation *vpa_types.RecommendedPodResources) *vpa_types.RecommendedPodResources {
 	// TODO: maybe rename vpa_utils.ApplyVPAPolicy to something that mentions that it is doing capping only
-	cappedRecommendation, err := vpa_utils.ApplyVPAPolicy(recommendation, vpa.Spec.ResourcePolicy)
+	cappedRecommendation, err := vpa_utils.ApplyVPAPolicy(recommendation, vpa.Spec.ResourcePolicy, c.globalMaxAllowed)
 	if err != nil {
 		klog.ErrorS(err, "Failed to apply policy for VPA", "vpa", klog.KObj(vpa))
 		return recommendation
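A note on the API change above: the commit unexports the processor type (`CappingPostProcessor` becomes `cappingPostProcessor`), so the constructor `NewCappingRecommendationProcessor` is now the only way to build it. That is why `main.go` above switches from `&routines.CappingPostProcessor{}` to the constructor call, which also lets the global max allowed configuration be injected.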

vertical-pod-autoscaler/pkg/utils/vpa/capping.go (34 additions, 14 deletions)

@@ -185,23 +185,46 @@ func applyVPAPolicy(recommendation apiv1.ResourceList, policy *vpa_types.Contain

 func applyVPAPolicyForContainer(containerName string,
 	containerRecommendation *vpa_types.RecommendedContainerResources,
-	policy *vpa_types.PodResourcePolicy) (*vpa_types.RecommendedContainerResources, error) {
+	policy *vpa_types.PodResourcePolicy,
+	globalMaxAllowed apiv1.ResourceList) (*vpa_types.RecommendedContainerResources, error) {
 	if containerRecommendation == nil {
 		return nil, fmt.Errorf("no recommendation available for container name %v", containerName)
 	}
 	cappedRecommendations := containerRecommendation.DeepCopy()
-	// containerPolicy can be nil (user does not have to configure it).
 	containerPolicy := GetContainerResourcePolicy(containerName, policy)
-	if containerPolicy == nil {
-		return cappedRecommendations, nil
+
+	var minAllowed apiv1.ResourceList
+	if containerPolicy != nil {
+		minAllowed = containerPolicy.MinAllowed
+	}
+
+	var maxAllowed apiv1.ResourceList
+	if containerPolicy != nil {
+		// Deep copy containerPolicy.MaxAllowed, as maxAllowed can later be merged with globalMaxAllowed.
+		// The deep copy is needed to prevent unwanted modifications to containerPolicy.MaxAllowed.
+		maxAllowed = containerPolicy.MaxAllowed.DeepCopy()
+	}
+	if maxAllowed == nil {
+		maxAllowed = globalMaxAllowed
+	} else {
+		// Set resources from the global max allowed if the VPA max allowed is missing them.
+		for resourceName, quantity := range globalMaxAllowed {
+			if _, ok := maxAllowed[resourceName]; !ok {
+				maxAllowed[resourceName] = quantity
+			}
+		}
 	}

 	process := func(recommendation apiv1.ResourceList) {
-		for resourceName, recommended := range recommendation {
-			cappedToMin, _ := maybeCapToPolicyMin(recommended, resourceName, containerPolicy)
-			recommendation[resourceName] = cappedToMin
-			cappedToMax, _ := maybeCapToPolicyMax(cappedToMin, resourceName, containerPolicy)
-			recommendation[resourceName] = cappedToMax
+		for resourceName := range recommendation {
+			if minAllowed != nil {
+				cappedToMin, _ := maybeCapToMin(recommendation[resourceName], resourceName, minAllowed)
+				recommendation[resourceName] = cappedToMin
+			}
+			if maxAllowed != nil {
+				cappedToMax, _ := maybeCapToMax(recommendation[resourceName], resourceName, maxAllowed)
+				recommendation[resourceName] = cappedToMax
+			}
 		}
 	}

@@ -242,19 +265,16 @@ func maybeCapToMin(recommended resource.Quantity, resourceName apiv1.ResourceNam

 // ApplyVPAPolicy returns a recommendation, adjusted to obey the policy.
 func ApplyVPAPolicy(podRecommendation *vpa_types.RecommendedPodResources,
-	policy *vpa_types.PodResourcePolicy) (*vpa_types.RecommendedPodResources, error) {
+	policy *vpa_types.PodResourcePolicy, globalMaxAllowed apiv1.ResourceList) (*vpa_types.RecommendedPodResources, error) {
 	if podRecommendation == nil {
 		return nil, nil
 	}
-	if policy == nil {
-		return podRecommendation, nil
-	}

 	updatedRecommendations := []vpa_types.RecommendedContainerResources{}
 	for _, containerRecommendation := range podRecommendation.ContainerRecommendations {
 		containerName := containerRecommendation.ContainerName
 		updatedContainerResources, err := applyVPAPolicyForContainer(containerName,
-			&containerRecommendation, policy)
+			&containerRecommendation, policy, globalMaxAllowed)
 		if err != nil {
 			return nil, fmt.Errorf("cannot apply policy on recommendation for container name %v", containerName)
 		}
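To see what the new capping logic does with the merged limits, here is a small self-contained sketch of just the max-allowed merge from `applyVPAPolicyForContainer`: the VPA-level `maxAllowed` wins per resource, and the global values fill in the resources it leaves unset. The helper name and the concrete quantities are invented for illustration:

```go
package main

import (
	"fmt"

	apiv1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// mergeMaxAllowed mirrors the merge in capping.go: start from the VPA-level
// maxAllowed (if any) and fill in the resources it is missing from the
// global max allowed.
func mergeMaxAllowed(vpaMaxAllowed, globalMaxAllowed apiv1.ResourceList) apiv1.ResourceList {
	if vpaMaxAllowed == nil {
		return globalMaxAllowed
	}
	merged := vpaMaxAllowed.DeepCopy() // avoid mutating the VPA object's policy
	for resourceName, quantity := range globalMaxAllowed {
		if _, ok := merged[resourceName]; !ok {
			merged[resourceName] = quantity
		}
	}
	return merged
}

func main() {
	global := apiv1.ResourceList{
		apiv1.ResourceCPU:    resource.MustParse("15"),
		apiv1.ResourceMemory: resource.MustParse("60Gi"),
	}
	// The VerticalPodAutoscaler caps only CPU; memory falls back to the global maximum.
	vpaLevel := apiv1.ResourceList{
		apiv1.ResourceCPU: resource.MustParse("2"),
	}

	merged := mergeMaxAllowed(vpaLevel, global)
	fmt.Println(merged.Cpu().String(), merged.Memory().String()) // 2 60Gi
}
```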
