
Conversation

@tommasopozzetti

What this PR does / why we need it:

This PR adds flags to optionally customize CPU shares and reservations for cloned VMs as part of the VSphereMachine spec.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 26, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign enxebre for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot
Contributor

Welcome @tommasopozzetti!

It looks like this is your first PR to kubernetes-sigs/cluster-api-provider-vsphere 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/cluster-api-provider-vsphere has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Aug 26, 2025
@k8s-ci-robot
Contributor

Hi @tommasopozzetti. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Aug 26, 2025
@smcallister-bc

This seems incredibly useful, thank you for doing this! Any chance you could do something similar for memory reservations as well?

@tommasopozzetti
Author

I'd be happy to add similar logic to the PR for memory reservations and shares, as well as potentially memory pinning.

I was hoping to first get a glance from a maintainer to see if this approach is reasonable, given it's my first contribution to this project!

Comment on lines 170 to 179
// CPUReservationMhz is the amount of CPU in MHz that is guaranteed available to the virtual machine.
// Defaults to the eponymous property value in the template from which the
// virtual machine is cloned.
// +optional
CPUReservationMhz int64 `json:"cpuReservationMhz,omitempty"`
// CPUShares are a relative priority to other virtual machines used in case of resource contention.
// Defaults to the eponymous property value in the template from which the
// virtual machine is cloned.
// +optional
CPUShares int32 `json:"cpuShares,omitempty"`
Member

@chrischdi chrischdi Sep 9, 2025


Does it make sense to have something like:

resources:
  reservation:
    cpu: ...
    memory: ...
  shares:
    ...

Or maybe this should be modelled on the Kubernetes wording, which has limits and requests? (That might not match the things this PR currently sets.)

This is e.g. done by the vm-operator APIs. However, vm-operator does not use shares.

Could someone research what the benefits of setting shares are? And should we also consider allowing a CPU limit to be set?

Author

@tommasopozzetti tommasopozzetti Sep 9, 2025


@chrischdi thanks for the review!

In terms of shares vs reservations: shares are a relative measure of prioritization, while reservations are an absolute one. A VM with a 2 GHz reservation is guaranteed that, even under host contention, it will always have 2 GHz of CPU power available to it. The vSphere admission controller will prevent a VM from powering on if the total sum of reservations for VMs on a given host exceeds the total available CPU power of that host (no overprovisioning).
Shares, on the other hand, are only meaningful relative to the shares of other VMs on that host. The host can be overprovisioned, and if it comes under CPU contention, VMs are prioritized for CPU time relative to each other depending on their shares. So if a host has 3 VMs, one with 6000 shares, one with 3000 and one with 1000, and the host comes under CPU contention, the first will get 60% of CPU time, the second 30% and the third 10%. Each VM normally gets shares assigned by default proportional to its number of vCPUs, but it is very useful to be able to tune that at will.

In terms of using CPU/memory limits, I have never had to implement these, but essentially, once the VM reaches the limit, they artificially cause the same effects as if the underlying host were under resource contention, even when it is not. More detailed info here. I'd be happy to add the limit to this PR as an optional configurable as well, if desired.

Finally, in terms of the syntax, I'm open to suggestions! I personally feel that using the same syntax as standard Kubernetes containers might be misleading, since the practical effect of reservations, shares and limits on VMs is very different from memory and CPU requests and limits for k8s pods.
I was going for a flatter mapping, similar to the other properties, that matches the VM options and would look like

cpuReservationMhz: xxx
cpuShares: xxx
cpuLimitMhz: xxx
memoryReservationMB: xxx
memoryShares: xxx
memoryLimitMB: xxx
reserveAllMemory: false

but, if preferred, we could also go for something nested like

resourceManagement:
  cpu:
    reservationMhz:
    shares:
    limitMhz:
  memory:
    reservationMB:
    shares:
    limitMB:
    reserveAll:

or similar

Member

@chrischdi chrischdi Oct 17, 2025


I'd prefer to take over the definition as it is in vm-operator, to have a similar API.

Which comes down to:

resources:
  requests: # --> reservations
    cpu: ... # in mhz, documented on the godoc
    memory: ...
  limits: # --> limits
    cpu: ...
    memory: ...

And also use the types that are common in Kubernetes.

https://github.com/vmware-tanzu/vm-operator/blob/main/api/v1alpha5/virtualmachineclass_types.go#L82-L86

The fields should have proper descriptions of what they map to in the end.

For shares: see my other comment.
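
For illustration, a minimal sketch of what such a type could look like, borrowing the shape of the linked vm-operator definition (field and type names here are assumptions, not the final CAPV API):

// Sketch only: names are illustrative, not the merged CAPV API.
package v1beta1

import "k8s.io/apimachinery/pkg/api/resource"

// VirtualMachineResourceSpec holds one quantity per resource and is
// reused for both requests and limits, as in vm-operator.
type VirtualMachineResourceSpec struct {
	// Cpu is interpreted in Hz when mapped to vSphere (e.g. "2G" = 2 GHz).
	// +optional
	Cpu resource.Quantity `json:"cpu,omitempty"`

	// Memory is interpreted in bytes (e.g. "4Gi").
	// +optional
	Memory resource.Quantity `json:"memory,omitempty"`
}

// VirtualMachineResources groups requests (vSphere reservations)
// and limits (vSphere limits).
type VirtualMachineResources struct {
	// +optional
	Requests VirtualMachineResourceSpec `json:"requests,omitempty"`

	// +optional
	Limits VirtualMachineResourceSpec `json:"limits,omitempty"`
}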

Member


Note: having the unit in the name does not make sense to me. At least not for memory, and I think also not for CPU.
If we have proper godoc that explains that this field is in Hz, 2 GHz should map to e.g. "2Gi".

@tommasopozzetti
Author

Hi @chrischdi, following up on your comment, do you have any thoughts on my reply?
Thanks!

@tommasopozzetti
Author

@chrischdi or @sbueringer do you have any opinions on the above discussion? I'd love to do any edits that make sense to push this forward! This feature is one that we really need
Thank you!

Member

@chrischdi chrischdi left a comment


First of all: sorry for the long wait on this, and thanks for reminding me.

I'm fine with adding an API and implementation that properly set reservations (= requests in Kubernetes) and limits for CPU and memory.

I'd like to think again about the use-case of setting shares for a VM. I want to understand the use-case so we don't add an API that won't really get used at a later stage.

VM-Operator does not have them configurable, and I think there's a reason.

If I get it right, setting shares only takes effect if there is a lack of resources affecting all the VMs of the vSphere host / cluster / resource pool (depending on setup).
So I think there's no way, or it is pretty hard, to set shares to a value that makes sense across the whole infrastructure.
The real solution here should be adding more physical capacity.

Shares can be used to prioritize resource availability for VMs at the time of contention.

[0]

@tommasopozzetti: I'd like to better understand the use-case you have for setting shares.

Maybe @akutz has some thoughts here.

@vr4manta maybe you have some thoughts here too?

@tommasopozzetti
Author

@chrischdi thank you for your review!

I will make the changes to the PR to implement the syntax you suggested to match vm-operator.
I would like to include shares in the design as well, though. I'm personally not familiar with vm-operator, but I can share some of the thinking behind their use.

First and foremost, shares are in use all the time, out of the box, even if you do not set them. vSphere automatically assigns shares to every VM, proportional to the chosen priority of the VM (or the default one for the resource pool, if not defined) and the number of cores of the VM. So shares will be used regardless of our implementation here. My proposed addition just gives more optional control over them, mapping the corresponding available API so that a custom shares value can be configured if desired.

We use this heavily. Shares are the only way to properly set relative priorities among VMs to distinguish more and less critical workloads, and in our case we have more "classes" of workloads than the three default priority levels, so the ability to customize them is imperative.

We have actually been recommended to use this mechanism by VMware engineers, so I assume shares are used and valuable.

While I definitely agree that, as you run out of capacity, adding capacity is always the best solution, it is not always readily available, and when running overprovisioned (which we do significantly, to make efficient use of resources), spikes will at times cause temporary contention. That contention must somehow be resolved, and the way vSphere does that is by giving relative CPU time based on shares. Only using reservations and limits does not address this properly: it just sets a guaranteed minimum and a ceiling maximum for the resource, but it gives the system no direction on how to prioritize VMs between that min and max.
A concrete example: if I have 10 GHz available and 2 VMs, both requesting 4 GHz, with a limit of 8 GHz, and both spiking in a moment of high load to try and use 6 GHz each, how does the system allocate the available 10? What if one of those VMs is a dev workload and one a QA one? How do I tell the system to prioritize QA? I can give twice as many shares to QA, and in that scenario it will get twice as much of the remaining capacity as the dev one.
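
To put rough numbers on that scenario (an illustration only, assuming shares divide just the capacity left over once both reservations are satisfied):

contended capacity = 10 GHz - (4 GHz + 4 GHz reserved) = 2 GHz
qa  (2x shares): 4 GHz + (2/3) * 2 GHz ≈ 5.33 GHz
dev (1x shares): 4 GHz + (1/3) * 2 GHz ≈ 4.67 GHz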

I can start working on the edits to adopt the following syntax and ensure the types and comments match vm-operator (with the shares addition), but please do let me know in the meantime if you have further thoughts on the shares discussion!

resources:
  requests: # --> reservations
    cpu: ... # in mhz, documented on the godoc
    memory: ...
  limits: # --> limits
    cpu: ...
    memory: ...
  shares:
    cpu: ...
    memory: ...
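
For reference, a rough sketch of how a custom shares value from the spec above could map onto govmomi (the helper name and its placement are assumptions; the vim25 types are govmomi's real API):

package services // illustrative placement

import "github.com/vmware/govmomi/vim25/types"

// sharesInfoFor maps a user-provided shares value to vSphere's
// SharesInfo. A custom value requires SharesLevelCustom; when nothing
// is set, returning nil keeps vSphere's default behavior (shares
// proportional to vCPU count).
func sharesInfoFor(shares int32) *types.SharesInfo {
	if shares == 0 {
		return nil
	}
	return &types.SharesInfo{
		Level:  types.SharesLevelCustom,
		Shares: shares,
	}
}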

@scott-grimes

👍 would love this - both the reservations/limits and also shares

@chrischdi
Member

Sounds reasonable, thanks for explaining!

Also, the example looks good to me 👍

Let me know once you've updated the PR.

@tommasopozzetti tommasopozzetti force-pushed the feature/cpu-custom-allocation branch from 3d5a4db to 47c75b7 Compare October 21, 2025 17:50
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 21, 2025
@tommasopozzetti tommasopozzetti changed the title ✨Add flags to allow customization of CPU shares and reservations ✨Add flags to allow customization of CPU and memory shares, reservations and limits Oct 21, 2025
@tommasopozzetti
Author

@chrischdi I have updated the PR to follow the proposed syntax!
Let me know your thoughts!
Thank you

@sbueringer
Member

I'll also try to take a look soon

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Oct 22, 2025
@tommasopozzetti tommasopozzetti force-pushed the feature/cpu-custom-allocation branch 2 times, most recently from ab91902 to 64df85a Compare October 22, 2025 14:58
@tommasopozzetti
Author

tommasopozzetti commented Oct 22, 2025

Thanks @sbueringer!
I fixed all the linting issues. There is still one check failing, but I'm not sure it is related to any change here. Any advice would be great!

// Set CPU reservations, limits and shares if specified
cpuAllocation := types.ResourceAllocationInfo{}
if !vmCtx.VSphereVM.Spec.Resources.Requests.Cpu.IsZero() {
	cpuReservationMhz := int64(math.Ceil(float64(vmCtx.VSphereVM.Spec.Resources.Requests.Cpu.Value()) / float64(1000000)))
Member


Can we add helper funcs for convertQuantityToMhz and convertQuantityToMB?
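
Something like this, perhaps (a sketch: the names follow the suggestion above, while the signatures, placement and round-up behavior are assumptions):

package util // illustrative placement

import (
	"math"

	"k8s.io/apimachinery/pkg/api/resource"
)

// convertQuantityToMhz converts a CPU quantity expressed in Hz
// (e.g. "2G" = 2 GHz) to whole MHz, rounding up.
func convertQuantityToMhz(q resource.Quantity) int64 {
	return int64(math.Ceil(float64(q.Value()) / 1e6))
}

// convertQuantityToMB converts a memory quantity expressed in bytes
// (e.g. "4Gi") to whole MiB, rounding up.
func convertQuantityToMB(q resource.Quantity) int64 {
	return int64(math.Ceil(float64(q.Value()) / (1 << 20)))
}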

	}
	cpuAllocation.Shares = ptr.To(cpuShares)
}
spec.Config.CpuAllocation = ptr.To(cpuAllocation)
Member


Are there any effects if this is set to an empty struct vs nil (which it was before)?

I'd prefer not to change the existing behavior.

So if no requests, limits or shares are set, we should not set CpuAllocation to an empty struct. Same below.
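
One way to keep the old nil behavior (a sketch against the quoted hunk; res is a hypothetical local for the spec's resources field, and convertQuantityToMhz is the helper suggested above):

// Only build CpuAllocation when at least one field is requested, so an
// untouched spec keeps the previous behavior of leaving it nil.
var cpuAllocation *types.ResourceAllocationInfo
if !res.Requests.Cpu.IsZero() {
	cpuAllocation = &types.ResourceAllocationInfo{}
	cpuAllocation.Reservation = ptr.To(convertQuantityToMhz(res.Requests.Cpu))
}
if !res.Limits.Cpu.IsZero() {
	if cpuAllocation == nil {
		cpuAllocation = &types.ResourceAllocationInfo{}
	}
	cpuAllocation.Limit = ptr.To(convertQuantityToMhz(res.Limits.Cpu))
}
spec.Config.CpuAllocation = cpuAllocation // stays nil if nothing was set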

@chrischdi
Member

Conversion also needs to be fixed to be round-trippable.

See e.g. what we do for AdditionalDisksGiB: https://github.com/openshift/cluster-api-provider-vsphere/blob/393a16983d02d5a2b254e4182002f70329f9dd8f/apis/v1alpha4/vspheremachine_conversion.go#L39
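
For reference, the usual restore pattern from that example looks roughly like this (a sketch; Resources stands in for this PR's new hub-only field, and the helpers come from sigs.k8s.io/cluster-api/util/conversion):

func (src *VSphereMachine) ConvertTo(dstRaw conversion.Hub) error {
	dst := dstRaw.(*infrav1.VSphereMachine)
	if err := Convert_v1alpha4_VSphereMachine_To_v1beta1_VSphereMachine(src, dst, nil); err != nil {
		return err
	}

	// Restore hub-only fields from the annotation written by the last
	// ConvertFrom, so the v1beta1 -> v1alpha4 -> v1beta1 round-trip
	// is lossless.
	restored := &infrav1.VSphereMachine{}
	if ok, err := utilconversion.UnmarshalData(src, restored); err != nil || !ok {
		return err
	}
	dst.Spec.Resources = restored.Spec.Resources

	return nil
}

func (dst *VSphereMachine) ConvertFrom(srcRaw conversion.Hub) error {
	src := srcRaw.(*infrav1.VSphereMachine)
	if err := Convert_v1beta1_VSphereMachine_To_v1alpha4_VSphereMachine(src, dst, nil); err != nil {
		return err
	}

	// Preserve hub-only fields in an annotation for the next ConvertTo.
	return utilconversion.MarshalData(src, dst)
}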

@tommasopozzetti tommasopozzetti force-pushed the feature/cpu-custom-allocation branch 3 times, most recently from 4c59fe7 to 6933d96 Compare October 24, 2025 18:38
@tommasopozzetti
Author

@chrischdi thanks for the input!
I think I addressed all the concerns.

The only comment I have is on the quantities and the validation. Rather than hard-failing if the quantity is not a precise multiple of MHz or MiB, which would be atypical behavior for resource specifications on a k8s resource, I propose (and implemented) automatically rounding up to the nearest MHz or MiB. The behavior has also been documented in the field descriptions.

There looks to still be a check failing, and it is unclear to me whether it's related to the conversion webhooks or unrelated. The logs don't make it very easy to understand what those checks are doing and why they are failing. I see several errors related to sessions to vCenter and to parsing of the APIVersion and Kind, neither of which has anything to do with the contents of this PR. Any further guidance here is definitely appreciated!
Thank you!

@chrischdi
Member

The only comment I have is on the quantities and the validation. Rather than hard-failing if the quantity is not a precise multiple of MHz or MiB, which would be atypical behavior for resource specifications on a k8s resource, I propose (and implemented) automatically rounding up to the nearest MHz or MiB. The behavior has also been documented in the field descriptions.

That's fine for me.

The failing tests are fuzz tests. They try to ensure that the conversions are round-trippable. So the failing tests here are:

v1beta1.VSphereVM -> v1alpha(3/4).VSphereVM -> v1beta1.VSphereVM

And that causes a diff.

Same for VSphereMachineTemplate.

We also need to add the conversion change for VSphereVM and VSphereMachineTemplate, because they also get the additional fields.

@tommasopozzetti tommasopozzetti force-pushed the feature/cpu-custom-allocation branch from 6933d96 to 89944f7 Compare October 28, 2025 16:12
@tommasopozzetti
Author

@chrischdi thanks for the pointers!
Looks like those are solved now!
One more test is failing, but it seems like it's failing while setting up the test environment?
Appreciate any pointers there as well, and thanks for your continued guidance!

@tommasopozzetti
Author

@chrischdi circling back here to see if there is anything else I can do!
Thank you!

@chrischdi
Member

/retest

@chrischdi chrischdi changed the title ✨Add flags to allow customization of CPU and memory shares, reservations and limits ✨ Add flags to allow customization of CPU and memory shares, reservations and limits Nov 5, 2025
@tommasopozzetti
Author

@chrischdi yay! Looks like all tests passed!

Would it be possible to get a final review and potentially get this in to be included in the next release? We are very much looking forward to using this feature!
Thank you!

Member

@sbueringer sbueringer left a comment


Thx, just one nit from my side

@tommasopozzetti tommasopozzetti force-pushed the feature/cpu-custom-allocation branch from 89944f7 to 8d3bbd4 Compare November 20, 2025 19:45
@tommasopozzetti
Author

/retest

@tommasopozzetti
Author

@sbueringer Thank you! I corrected the CPU field as requested!

@sbueringer
Member

/test pull-cluster-api-provider-vsphere-e2e-govmomi-blocking-main
/test pull-cluster-api-provider-vsphere-e2e-govmomi-conformance-ci-latest-main
/test pull-cluster-api-provider-vsphere-e2e-govmomi-conformance-main
/test pull-cluster-api-provider-vsphere-e2e-govmomi-main
/test pull-cluster-api-provider-vsphere-e2e-govmomi-upgrade-1-34-1-35-main
/test pull-cluster-api-provider-vsphere-e2e-supervisor-blocking-main
/test pull-cluster-api-provider-vsphere-e2e-supervisor-conformance-ci-latest-main
/test pull-cluster-api-provider-vsphere-e2e-supervisor-conformance-main
/test pull-cluster-api-provider-vsphere-e2e-supervisor-main
/test pull-cluster-api-provider-vsphere-e2e-supervisor-upgrade-1-34-1-35-main
/test pull-cluster-api-provider-vsphere-e2e-vcsim-govmomi-main
/test pull-cluster-api-provider-vsphere-e2e-vcsim-supervisor-main

@kubernetes-sigs kubernetes-sigs deleted a comment from k8s-ci-robot Nov 21, 2025
@sbueringer
Member

sbueringer commented Nov 21, 2025

@sbueringer Thank you! I corrected the CPU field as requested!

Thank you very much!

Would it be possible to get a final review and potentially get this in to be included in the next release? We are very much looking forward to using this feature!
Thank you!

Yup, let's get this merged early next week, so it's part of the next release.
Thx for the patience, and sorry for the long delays in review; it was a way-too-busy release cycle in core CAPI :)

/assign @chrischdi

for a final review
