📖 Update autoscaling from zero enhancement proposal with node labels and taints configuration clarification#13308
Conversation
|
Welcome @LiangquanLi930! |
|
Hi @LiangquanLi930. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
@elmiko PTAL, Thanks! |
elmiko
left a comment
There was a problem hiding this comment.
thanks @LiangquanLi930 !
/lgtm
/approve
|
LGTM label has been added. DetailsGit tree hash: e6671f4163dcb93c1335d0b1042228e6d2aeeaa1 |
|
/ok-to-test |
|
@chrischdi Hi, When you have time, could you help to review this PR? Thanks! |
| capacity.cluster-autoscaler.kubernetes.io/taints: "key1=value1:NoSchedule,key2=value2:NoExecute" | ||
| ``` | ||
|
|
||
| If the `capacity.cluster-autoscaler.kubernetes.io/labels` annotation specifies a label that would otherwise be |
There was a problem hiding this comment.
@elmiko @LiangquanLi930 I think it's not correct to delete this explanation. It's not covered by the new one
There was a problem hiding this comment.
@sbueringer Thanks for the review!
The deleted text was explaining that the capacity.cluster-autoscaler.kubernetes.io/labels annotation
takes precedence over labels generated from the Infrastructure Machine Template status field. This is
already covered by the general rule in the "Implementation Details" section:
"These methods are mutually exclusive, and the annotations will take preference when specified."
The scope of this "Node Labels and Taints" subsection is specifically about the two mechanisms users can
use to configure node labels for scale-from-zero:
- MachineSet/MachineDeployment labels with node.cluster.x-k8s.io/ prefix
- capacity.cluster-autoscaler.kubernetes.io/labels annotation
The status.nodeInfo is not a user-facing label configuration mechanism — it's an infrastructure provider
mechanism for reporting node metadata, which is already documented in the "Infrastructure Machine
Template Status Updates" section above. So the precedence list here intentionally only covers the two
user-facing label sources.
There was a problem hiding this comment.
i think the previous text might have also contained something that isn't quite true:
annotation specifies a label that would otherwise be generated from the fields in the
statusfield of the Machine Template, the autoscaler will prioritize and use the label defined in the annotation.
i don't believe we have labels that could originate from the Machine Template status field. i think this text might have mixed with the capacity information text.
regardless, the new section looks good to me about the priority.
There was a problem hiding this comment.
The deleted text was explaining that the capacity.cluster-autoscaler.kubernetes.io/labels annotation
takes precedence over labels generated from the Infrastructure Machine Template status field. This is
already covered by the general rule in the "Implementation Details" section:
Okay got that point. I think this sentence above is not very clear to be honest:
There are 2 methods described for informing the cluster autoscaler about the resource needs of the nodes
So "resource needs of the nodes" is referring to to Capacity and NodeInfo? I think calling NodeInfo "resource needs" is misleading. I think we should rephrase this above, e.g. to
There are 2 methods described for informing the cluster autoscaler about the capacity, operating system and architecture of the
nodes in each node group
There's one point that I'm now entirely confused about. Is it possible to declare the operating system and architecture via annotations, or not? If yes, can we please expand the example in l.272++
There was a problem hiding this comment.
@sbueringer In my view, the information that determines scale-from-zero behavior includes:
- Capacity — CPU, memory, GPU, etc
- NodeInfo — architecture, operating system
- Labels — user-defined node labels
- Taints — user-defined node taints
The section I modified ("Node Labels and Taints") only covers 3 and 4, which are user-facing configurations. That's why the
precedence list there only includes the two user-facing label sources.
Regarding your question about "resource needs" — I agree the wording is not precise. I can update it to something more general like
"the properties of the nodes" to cover all four categories above.
For declaring OS and architecture via annotations: yes, users can specify them through
capacity.cluster-autoscaler.kubernetes.io/labels, e.g.:
capacity.cluster-autoscaler.kubernetes.io/labels: "kubernetes.io/arch=amd64,kubernetes.io/os=linux"
I can add this to the annotation example at l.272 to make it explicit.
WDYT?
There was a problem hiding this comment.
thanks @sbueringer , i think making this clearer is a good improvement.
…d taints configuration clarification Reorganize the node labels and taints section to document two configuration mechanisms: 1. CAPI's native metadata propagation using node.cluster.x-k8s.io/ prefix in MachineSet/MachineDeployment labels 2. Capacity annotations for explicit control or overrides Add precedence rules and examples for each mechanism to improve clarity for users implementing autoscaling from zero. Signed-off-by: Liangquan Li <liangli@redhat.com>
71b8bda to
d00d6e8
Compare
|
@sbueringer @elmiko PTAL, thanks! |
|
Thank you very much. That's very clear now! :) /approve /assign @elmiko P.S. @elmiko We might want to follow-up eventually to also add Machine taints to the |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: elmiko, sbueringer The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
this is a great callout @sbueringer , i will either just open a PR or create an issue to track.
this needs to be added to the autoscaler. i'll make an issue to track it, but it should be fairly straightforward to add it and it won't be harmful if autoscaler has the ability to read the field before it's gone stable in clusterapi. /lgtm |
|
LGTM label has been added. DetailsGit tree hash: 2ee367f215b40e7c08c8ecefb966e6eddaadc7d5 |
|
created kubernetes/autoscaler#9239 |
|
Thank you! (:follow:) |
What this PR does / why we need it:
Reorganize the node labels and taints section to document two configuration mechanisms:
Add precedence rules and examples for each mechanism to improve clarity for users implementing autoscaling from zero.
Which issue(s) this PR fixes :
Related to kubernetes/autoscaler#9189
/area provider/core