📖 Update autoscaling from zero enhancement proposal with node labels and taints configuration clarification #13308

Merged
k8s-ci-robot merged 1 commit into kubernetes-sigs:main from LiangquanLi930:from-zero
Feb 13, 2026

Conversation

@LiangquanLi930
Contributor

What this PR does / why we need it:

Reorganize the node labels and taints section to document two configuration mechanisms:

  1. CAPI's native metadata propagation using node.cluster.x-k8s.io/ prefix in MachineSet/MachineDeployment labels
  2. Capacity annotations for explicit control or overrides

Add precedence rules and examples for each mechanism to improve clarity for users implementing autoscaling from zero.
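
To make the two mechanisms concrete, here is a minimal Python sketch of how the expected node labels for an empty node group could be derived. It is illustrative only: `parse_label_list` and `expected_node_labels` are hypothetical helpers, not Cluster API or cluster autoscaler functions. Only the `node.cluster.x-k8s.io/` prefix named above is modelled, and the sketch assumes "overrides" means the annotation wins when both sources set the same key.

```python
# Hypothetical sketch; names are illustrative and not part of any real API.
NODE_LABEL_PREFIX = "node.cluster.x-k8s.io/"
LABELS_ANNOTATION = "capacity.cluster-autoscaler.kubernetes.io/labels"


def parse_label_list(raw):
    """Parse "k1=v1,k2=v2" into a dict, skipping empty entries."""
    labels = {}
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition("=")
        labels[key] = value
    return labels


def expected_node_labels(resource_labels, resource_annotations):
    """Combine both label sources for a MachineSet/MachineDeployment."""
    # Mechanism 1: labels carrying the node.cluster.x-k8s.io/ prefix.
    labels = {
        key: value
        for key, value in resource_labels.items()
        if key.startswith(NODE_LABEL_PREFIX)
    }
    # Mechanism 2: the capacity annotation, applied last so that it
    # overrides mechanism 1 on conflicting keys (assumed precedence).
    raw = resource_annotations.get(LABELS_ANNOTATION, "")
    labels.update(parse_label_list(raw))
    return labels
```

With both sources present, a key set in the annotation replaces the same key coming from the prefixed labels, and labels without the prefix are ignored.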

Which issue(s) this PR fixes:
Related to kubernetes/autoscaler#9189

/area provider/core

k8s-ci-robot added the area/provider/core label (Issues or PRs related to the core provider) on Feb 5, 2026
@k8s-ci-robot
Contributor

Welcome @LiangquanLi930!

It looks like this is your first PR to kubernetes-sigs/cluster-api 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/cluster-api has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

k8s-ci-robot added the needs-ok-to-test (Indicates a PR that requires an org member to verify it is safe to test.) and size/S (Denotes a PR that changes 10-29 lines, ignoring generated files.) labels on Feb 5, 2026
@k8s-ci-robot
Contributor

Hi @LiangquanLi930. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot added the cncf-cla: yes label (Indicates the PR's author has signed the CNCF CLA.) on Feb 5, 2026
@LiangquanLi930
Contributor Author

@elmiko PTAL, Thanks!

Contributor

@elmiko elmiko left a comment

thanks @LiangquanLi930 !

/lgtm
/approve

k8s-ci-robot added the lgtm label ("Looks good to me", indicates that a PR is ready to be merged.) on Feb 6, 2026
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: e6671f4163dcb93c1335d0b1042228e6d2aeeaa1

@elmiko
Contributor

elmiko commented Feb 6, 2026

/ok-to-test

k8s-ci-robot added the ok-to-test label (Indicates a non-member PR verified by an org member that is safe to test.) and removed the needs-ok-to-test label (Indicates a PR that requires an org member to verify it is safe to test.) on Feb 6, 2026
@LiangquanLi930
Contributor Author

Hi @chrischdi, when you have time, could you help review this PR? Thanks!

```
capacity.cluster-autoscaler.kubernetes.io/taints: "key1=value1:NoSchedule,key2=value2:NoExecute"
```

If the `capacity.cluster-autoscaler.kubernetes.io/labels` annotation specifies a label that would otherwise be
Member

@elmiko @LiangquanLi930 I think it's not correct to delete this explanation. It's not covered by the new one

Contributor Author

@sbueringer Thanks for the review!

The deleted text was explaining that the capacity.cluster-autoscaler.kubernetes.io/labels annotation
takes precedence over labels generated from the Infrastructure Machine Template status field. This is
already covered by the general rule in the "Implementation Details" section:

"These methods are mutually exclusive, and the annotations will take preference when specified."

The scope of this "Node Labels and Taints" subsection is specifically about the two mechanisms users can
use to configure node labels for scale-from-zero:

  1. MachineSet/MachineDeployment labels with node.cluster.x-k8s.io/ prefix
  2. capacity.cluster-autoscaler.kubernetes.io/labels annotation

The status.nodeInfo is not a user-facing label configuration mechanism — it's an infrastructure provider
mechanism for reporting node metadata, which is already documented in the "Infrastructure Machine
Template Status Updates" section above. So the precedence list here intentionally only covers the two
user-facing label sources.

Contributor

@elmiko elmiko Feb 11, 2026

i think the previous text might have also contained something that isn't quite true:

annotation specifies a label that would otherwise be generated from the fields in the status field of the Machine Template, the autoscaler will prioritize and use the label defined in the annotation.

i don't believe we have labels that could originate from the Machine Template status field. i think this text might have mixed with the capacity information text.

regardless, the new section looks good to me about the priority.

Member

The deleted text was explaining that the capacity.cluster-autoscaler.kubernetes.io/labels annotation
takes precedence over labels generated from the Infrastructure Machine Template status field. This is
already covered by the general rule in the "Implementation Details" section:

Okay got that point. I think this sentence above is not very clear to be honest:

There are 2 methods described for informing the cluster autoscaler about the resource needs of the nodes

So "resource needs of the nodes" is referring to Capacity and NodeInfo? I think calling NodeInfo "resource needs" is misleading. I think we should rephrase the sentence above, e.g. to

There are 2 methods described for informing the cluster autoscaler about the capacity, operating system and architecture of the
nodes in each node group

There's one point that I'm now entirely confused about. Is it possible to declare the operating system and architecture via annotations, or not? If yes, can we please expand the example in l.272++

Contributor Author

@sbueringer In my view, the information that determines scale-from-zero behavior includes:

  1. Capacity — CPU, memory, GPU, etc
  2. NodeInfo — architecture, operating system
  3. Labels — user-defined node labels
  4. Taints — user-defined node taints

The section I modified ("Node Labels and Taints") only covers 3 and 4, which are user-facing configurations. That's why the
precedence list there only includes the two user-facing label sources.
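
For item 4, the taints annotation quoted earlier in this thread uses a comma-separated `key=value:Effect` list. As a rough illustration of that format, here is a minimal Python sketch that splits such a string into structured taints. `Taint` and `parse_taints` are hypothetical names, and this is not the autoscaler's actual parser; the set of effects is the standard Kubernetes one.

```python
from typing import NamedTuple

# Standard Kubernetes taint effects.
VALID_EFFECTS = {"NoSchedule", "PreferNoSchedule", "NoExecute"}


class Taint(NamedTuple):
    key: str
    value: str
    effect: str


def parse_taints(raw):
    """Parse "k1=v1:Effect,k2=v2:Effect" into a list of Taint tuples."""
    taints = []
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue
        # The effect is whatever follows the last ":" in the entry.
        key_value, sep, effect = item.rpartition(":")
        if not sep or effect not in VALID_EFFECTS:
            raise ValueError(f"invalid taint entry: {item!r}")
        key, _, value = key_value.partition("=")
        taints.append(Taint(key, value, effect))
    return taints
```

Entries without a recognized effect are rejected rather than silently dropped, which is a design choice of this sketch and not a statement about how the autoscaler behaves.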

Regarding your question about "resource needs" — I agree the wording is not precise. I can update it to something more general like
"the properties of the nodes" to cover all four categories above.

For declaring OS and architecture via annotations: yes, users can specify them through
capacity.cluster-autoscaler.kubernetes.io/labels, e.g.:
capacity.cluster-autoscaler.kubernetes.io/labels: "kubernetes.io/arch=amd64,kubernetes.io/os=linux"
I can add this to the annotation example at l.272 to make it explicit.
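
As a small illustration of the value format, here is a hedged Python sketch that builds the annotation string from a dict of desired labels. `format_labels_annotation` is a hypothetical helper for this example only; the annotation key and the `key=value` comma-separated format are the ones quoted above.

```python
LABELS_ANNOTATION = "capacity.cluster-autoscaler.kubernetes.io/labels"


def format_labels_annotation(labels):
    """Serialize a dict of node labels as "k1=v1,k2=v2"."""
    parts = []
    for key, value in labels.items():
        # "," and "=" are the separators of the format, so this sketch
        # refuses keys or values that contain them.
        if any(sep in key or sep in value for sep in ",="):
            raise ValueError(f"label {key!r}={value!r} contains a separator")
        parts.append(f"{key}={value}")
    return ",".join(parts)


# Declaring architecture and operating system for a scale-from-zero node group.
annotations = {
    LABELS_ANNOTATION: format_labels_annotation(
        {"kubernetes.io/arch": "amd64", "kubernetes.io/os": "linux"}
    )
}
```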

WDYT?

Member

Sounds perfect, thank you!

Contributor

thanks @sbueringer , i think making this clearer is a good improvement.

…d taints configuration clarification

Reorganize the node labels and taints section to document two
configuration mechanisms:

1. CAPI's native metadata propagation using node.cluster.x-k8s.io/
   prefix in MachineSet/MachineDeployment labels
2. Capacity annotations for explicit control or overrides

Add precedence rules and examples for each mechanism to improve
clarity for users implementing autoscaling from zero.

Signed-off-by: Liangquan Li <liangli@redhat.com>
k8s-ci-robot removed the lgtm label ("Looks good to me", indicates that a PR is ready to be merged.) on Feb 13, 2026
k8s-ci-robot requested a review from elmiko on February 13, 2026 07:30
@LiangquanLi930
Contributor Author

@sbueringer @elmiko PTAL, thanks!

@sbueringer
Member

Thank you very much. That's very clear now! :)

/approve

/assign @elmiko

P.S. @elmiko We might want to follow-up eventually to also add Machine taints to the **Node Labels and Taints** section. The feature is still behind a feature gate, but probably worth mentioning already (xref: #12908). Does autoscaler already read the field to infer taints?

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: elmiko, sbueringer

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) on Feb 13, 2026
@elmiko
Contributor

elmiko commented Feb 13, 2026

P.S. @elmiko We might want to follow-up eventually to also add Machine taints to the Node Labels and Taints section. The feature is still behind a feature gate, but probably worth mentioning already (xref: #12908).

this is a great callout @sbueringer , i will either just open a PR or create an issue to track.

Does autoscaler already read the field to infer taints?

this needs to be added to the autoscaler. i'll make an issue to track it, but it should be fairly straightforward to add it and it won't be harmful if autoscaler has the ability to read the field before it's gone stable in clusterapi.

/lgtm

k8s-ci-robot added the lgtm label ("Looks good to me", indicates that a PR is ready to be merged.) on Feb 13, 2026
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 2ee367f215b40e7c08c8ecefb966e6eddaadc7d5

k8s-ci-robot merged commit 575e617 into kubernetes-sigs:main on Feb 13, 2026
16 checks passed
@elmiko
Contributor

elmiko commented Feb 17, 2026

created kubernetes/autoscaler#9239

@sbueringer
Member

sbueringer commented Feb 17, 2026

Thank you! (:follow:)
