✨ Add flags to allow customization of CPU and memory shares, reservations and limits #3607
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing …

Welcome @tommasopozzetti!

Hi @tommasopozzetti. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. Once the patch is verified, the new status will be reflected. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
This seems incredibly useful, thank you for doing this! Any chance you could do something similar for memory reservations as well?

I'd be happy to add similar logic to the PR for memory reservation and shares, as well as potentially memory pinning. I was hoping to first get a glance from a maintainer to see if this approach is reasonable, given it's my first contribution to this project!
apis/v1beta1/types.go (outdated)

```go
// CPUReservationMhz is the amount of CPU in MHz that is guaranteed available to the virtual machine.
// Defaults to the eponymous property value in the template from which the
// virtual machine is cloned.
// +optional
CPUReservationMhz int64 `json:"cpuReservationMhz,omitempty"`

// CPUShares are a relative priority to other virtual machines used in case of resource contention.
// Defaults to the eponymous property value in the template from which the
// virtual machine is cloned.
// +optional
CPUShares int32 `json:"cpuShares,omitempty"`
```
Does it make sense to have something like:

```yaml
resources:
  reservation:
    cpu: ...
    memory: ...
  shares:
    ...
```

Or maybe this should be modelled in the k8s wording, which has limits and requests (might not match the things this PR currently sets)?
This is e.g. done by the vm-operator APIs. However, vm-operator does not use shares.
Could someone research what the benefits of setting shares are? And should we also consider allowing a CPU limit to be set?
@chrischdi thanks for the review!
In terms of shares vs reservations: shares are a relative measure of prioritization, while reservations are an absolute one. A VM with a 2 GHz reservation is guaranteed that, even under host contention, it will always have 2 GHz of CPU power available to it. The vSphere admission controller will prevent a VM from powering on if the total sum of reservations for VMs on a given host exceeds the total available CPU power of that host (no overprovisioning).
Shares, on the other hand, are only meaningful relative to the shares of other VMs on that host. The host can be overprovisioned, and if it comes under CPU contention, VMs are prioritized for CPU time relative to each other depending on their shares. So if a host has 3 VMs, one with 6000 shares, one with 3000 and one with 1000, and the host comes under CPU contention, the first will get 60% of CPU time, the second 30% and the third 10%. Each VM normally gets default shares assigned proportional to its number of vCPUs, but it is very useful to be able to tune that at will.
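The 60/30/10 split above is just each VM's shares divided by the total shares on the host. A quick sketch of that arithmetic (a hypothetical helper for illustration, not part of the PR):

```go
package main

import "fmt"

// cpuTimeShare returns each VM's fraction of CPU time under contention,
// computed as its shares divided by the sum of all shares on the host.
func cpuTimeShare(shares []int64) []float64 {
	var total int64
	for _, s := range shares {
		total += s
	}
	out := make([]float64, len(shares))
	for i, s := range shares {
		out[i] = float64(s) / float64(total)
	}
	return out
}

func main() {
	// The example from the thread: 6000, 3000 and 1000 shares.
	fmt.Println(cpuTimeShare([]int64{6000, 3000, 1000})) // [0.6 0.3 0.1]
}
```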
In terms of CPU/memory limits, I have never had to implement these, but they would essentially artificially cause the same effects as if the underlying host were under resource contention, even when it is not, once the VM reaches said limit. More detailed info here. I'd be happy to add the limit to this PR as an optional configurable as well, if desired.
Finally, in terms of the syntax, I'm open to suggestions! I personally feel that using the same syntax as standard Kubernetes containers might be misleading, since the practical implementation of reservations, shares and limits on VMs is very different from memory and CPU requests and limits for k8s pods.
I was going for a flatter mapping, similar to the other properties, that matches the VM options and would look like:

```yaml
cpuReservationMhz: xxx
cpuShares: xxx
cpuLimitMhz: xxx
memoryReservationMB: xxx
memoryShares: xxx
memoryLimitMB: xxx
reserveAllMemory: false
```

but, if preferred, we could also go for something nested like:

```yaml
resourceManagement:
  cpu:
    reservationMhz:
    shares:
    limitMhz:
  memory:
    reservationMB:
    shares:
    limitMB:
    reserveAll:
```

or similar.
I'd prefer to take over the definition as it is in vm-operator, to have a similar API. Which comes down to:

```yaml
resources:
  requests: # --> reservations
    cpu: ...    # in MHz, documented in the godoc
    memory: ...
  limits: # --> limits
    cpu: ...
    memory: ...
```

And also use the types which are common for Kubernetes. The fields should have a proper description of what they map to in the end.

For shares: see my other comment.
Note: having the unit in the name does not make sense to me. At least not for memory, and I think also not for CPU.
If we have a proper godoc that explains that this field is in Hz, 2 GHz should map to e.g. "2Gi".
Hi @chrischdi, following up on your comment, do you have any thoughts on my reply?

@chrischdi or @sbueringer, do you have any opinions on the above discussion? I'd love to make any edits that help push this forward! This feature is one that we really need.
First of all: sorry for the long wait on this, and thanks for reminding me.
I'm fine with adding an API and implementation that properly sets reservations (= requests in Kubernetes) and limits for CPU and memory.
I'd like to think again about the use-case of setting shares for a VM. I want to understand the use-case so we don't add an API that won't really get used at a later stage.
VM-Operator does not have them configurable. And I think there's a reason.
If I get it right, setting shares only takes effect if there is a lack of resources affecting all the VMs of the vSphere host / cluster / resource pool (depending on setup).
So I think there's no way, or it is pretty hard, to set shares to a value that makes sense across the whole infrastructure.
The real solution here should be adding more physical capacity.
Shares can be used to prioritize resource availability for VMs at the time of contention.
@tommasopozzetti: I'd like to better understand the use-case you have for setting shares.
Maybe @akutz has some thoughts here.
@vr4manta maybe you have some thoughts here too?
@chrischdi thank you for your review! I will make the changes to the PR to implement the syntax you suggested to match vm-operator.

First and foremost, shares are in use all the time out of the box, even if you do not set them. vSphere automatically assigns shares to every VM proportional to the chosen priority of the VM (or the default one for the resource pool, if not defined) and the number of cores of the VM. So shares will be used regardless of our implementation here; my proposed addition just gives more optional control over them, mapping the corresponding available API to configure a custom value of shares if desired.

We use this heavily. Shares are the only way to properly set relative priorities among VMs to distinguish more and less critical workloads, and in our case we have more "classes" of workloads than the simple default three levels of priority, so the ability to customize them is imperative. We have actually been recommended to use this mechanism by VMware engineers, so I assume these are used and valuable.

While I definitely agree that as you run out of capacity, adding capacity is always the best solution, it is not one that is always readily available, and when running overprovisioned (which we are, significantly, to make efficient use of resources), spikes will at times cause temporary contention. That contention must somehow be resolved, and the way vSphere does that is by giving relative CPU time based on the shares. Only using reservations and limits does not address this properly: it just sets a guaranteed minimum and a ceiling maximum for the resource, but gives the system no direction on how to prioritize among VMs between that min and max.

I can start working on the edits to have the following syntax and ensure types and comments match the vm-operator, with the shares addition, but please do let me know in the meantime if you have further thoughts about the shares discussion!

```yaml
resources:
  requests: # --> reservations
    cpu: ...    # in MHz, documented in the godoc
    memory: ...
  limits: # --> limits
    cpu: ...
    memory: ...
  shares:
    cpu: ...
    memory: ...
```
👍 would love this - both the reservations/limits and also shares

Sounds reasonable, thanks for explaining! Also, the example looks good to me 👍 Let me know once you've updated the PR.
Force-pushed from 3d5a4db to 47c75b7
@chrischdi I have updated the PR to follow the proposed syntax!

I'll also try to take a look soon.

/ok-to-test
Force-pushed from ab91902 to 64df85a
Thanks @sbueringer!

```go
// Set CPU reservations, limits and shares if specified
cpuAllocation := types.ResourceAllocationInfo{}
if !vmCtx.VSphereVM.Spec.Resources.Requests.Cpu.IsZero() {
	cpuReservationMhz := int64(math.Ceil(float64(vmCtx.VSphereVM.Spec.Resources.Requests.Cpu.Value()) / float64(1000000)))
```
Can we add helper funcs for convertQuantityToMhz and convertQuantityToMB
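Such helpers would presumably wrap `resource.Quantity.Value()`. Stripped of the Kubernetes dependency, the core ceiling-division arithmetic could look like the sketch below, done in integer form to avoid float rounding. The helper names and the choice of MiB for memory are assumptions based on the review comment, not the merged code:

```go
package main

import "fmt"

// hzToMhzCeil rounds a Hz value up to whole MHz, matching the
// math.Ceil(.../1e6) expression quoted from the PR.
func hzToMhzCeil(hz int64) int64 {
	return (hz + 1_000_000 - 1) / 1_000_000
}

// bytesToMiBCeil rounds a byte count up to whole MiB (the memory unit
// here is an assumption; the PR may use MB instead).
func bytesToMiBCeil(b int64) int64 {
	return (b + (1 << 20) - 1) >> 20
}

func main() {
	fmt.Println(hzToMhzCeil(2_000_000_000)) // 2 GHz -> 2000 MHz
	fmt.Println(bytesToMiBCeil(4 << 30))    // 4 GiB -> 4096 MiB
}
```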
```go
	}
	cpuAllocation.Shares = ptr.To(cpuShares)
}
spec.Config.CpuAllocation = ptr.To(cpuAllocation)
```
Are there any effects if this is set to an empty struct vs nil (which it was before)?
I'd prefer to not change the existing behavior.
So if no requests, limits or shares are set, we should not set CPUAllocation to an empty struct. Same below.
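The nil-preserving behavior the reviewer asks for could be sketched as below. The types here are minimal stand-ins for the govmomi ones (the real `ResourceAllocationInfo` uses a `*SharesInfo`, not a plain integer), so this only illustrates the guard, not the actual implementation:

```go
package main

import "fmt"

// Hypothetical minimal stand-ins for the govmomi types used in the PR.
type ResourceAllocationInfo struct {
	Reservation *int64
	Limit       *int64
	Shares      *int32
}

type ConfigSpec struct {
	CpuAllocation *ResourceAllocationInfo
}

// applyCPUAllocation sets CpuAllocation only when at least one field is
// requested, leaving it nil otherwise (the pre-PR behavior).
func applyCPUAllocation(spec *ConfigSpec, reservation, limit *int64, shares *int32) {
	if reservation == nil && limit == nil && shares == nil {
		return // nothing requested: keep CpuAllocation nil, as before
	}
	spec.CpuAllocation = &ResourceAllocationInfo{
		Reservation: reservation,
		Limit:       limit,
		Shares:      shares,
	}
}

func main() {
	var s ConfigSpec
	applyCPUAllocation(&s, nil, nil, nil)
	fmt.Println(s.CpuAllocation == nil) // true: untouched when nothing is set

	r := int64(2000)
	applyCPUAllocation(&s, &r, nil, nil)
	fmt.Println(s.CpuAllocation != nil) // true: set once a value is requested
}
```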
The conversions also need to be fixed to be round-trippable. See e.g. what we do for AdditionalDisksGiB: https://github.com/openshift/cluster-api-provider-vsphere/blob/393a16983d02d5a2b254e4182002f70329f9dd8f/apis/v1alpha4/vspheremachine_conversion.go#L39
Force-pushed from 4c59fe7 to 6933d96
@chrischdi thanks for the input! The only comment I have is on the quantities and the validation. Rather than hard failing if the quantity is not set as a precise multiple of MHz or MiB, which would be an atypical behavior for

There looks to still be a failing check, and it is unclear to me if it's still related to the conversion webhooks or unrelated. The logs don't make it very easy to understand what those checks are doing and why they are failing. I see several errors related to sessions to vCenter and to parsing of the APIVersion and Kind, both things that don't have anything to do with the contents of this PR. Any further guidance here is definitely appreciated!
That's fine for me.

The failing tests are fuzz tests. They try to ensure that the conversions are round-trippable. So the failing tests here are: v1beta1.VSphereVM -> v1alpha(3/4).VSphereVM -> v1beta1.VSphereVM, and that causes a diff. Same for VSphereMachineTemplate. We also need to add the conversion change for VSphereVM and VSphereMachineTemplate, because they also get the additional fields.
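The round-trip pattern those fuzz tests enforce can be illustrated with a toy example: a "hub" type with a new field, a "spoke" type without it, and an annotation used to carry the lost data through the conversion. The types and the annotation key are invented for illustration; the real code uses the generated conversion functions linked above:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hub is the newer API version with an extra field.
type Hub struct {
	CPU    int32
	Shares int32 // new field, absent in the spoke version
}

// Spoke is the older API version; it can only carry extra data in annotations.
type Spoke struct {
	CPU         int32
	Annotations map[string]string
}

func hubToSpoke(h Hub) Spoke {
	data, _ := json.Marshal(h) // stash the full hub object
	return Spoke{
		CPU:         h.CPU,
		Annotations: map[string]string{"conversion-data": string(data)},
	}
}

func spokeToHub(s Spoke) Hub {
	var h Hub
	if raw, ok := s.Annotations["conversion-data"]; ok {
		_ = json.Unmarshal([]byte(raw), &h) // restore fields the spoke cannot carry
	}
	h.CPU = s.CPU // fields the spoke does carry win
	return h
}

func main() {
	orig := Hub{CPU: 4, Shares: 6000}
	round := spokeToHub(hubToSpoke(orig))
	fmt.Println(round == orig) // true: hub -> spoke -> hub loses nothing
}
```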
Force-pushed from 6933d96 to 89944f7
@chrischdi thanks for the pointers!

@chrischdi circling back here to see if there is anything else I can do!

/retest

@chrischdi yay! Looks like all tests passed! Would it be possible to get a final review and potentially get this in, to be included in the next release? We are very much looking forward to using this feature!
sbueringer left a comment:
Thx, just one nit from my side
…ns and limits

Signed-off-by: Tommaso <[email protected]>
Force-pushed from 89944f7 to 8d3bbd4
/retest

@sbueringer Thank you! I corrected the CPU field as requested!

/test pull-cluster-api-provider-vsphere-e2e-govmomi-blocking-main

Thank you very much!

Yup, let's get this merged early next week, so it's part of the next release.

/assign @chrischdi for a final review
What this PR does / why we need it:
This PR adds flags to optionally customize CPU and memory shares, reservations and limits for cloned VMs as part of the VSphereMachine spec.