Ability to Scale Karpenter Provisioned Nodes To 0 On Demand Or By Schedule During Off Hours #2850

ronberna · 2024-04-09T03:37:05Z

ronberna
Apr 9, 2024

Description

What problem are you trying to solve?
We've recently begun the migration from using ASG's (AutoScaling Groups) and CAS (Cluster AutoScaler) to Karpenter. With ASG's, as part of cost saving measures, our EKS clusters are scaled down during off hours and weekends in lower environments, and then scaled back up during office hours. This was performed by running a lambda at a scheduled time to set the min/max/desired settings of the ASG to 0. The current values of the min/max/desired settings before the update to 0 are captured and stored in ssm. For the scale up, the lambda reads this ssm parameter to set the ASG min/max/desired values. With Karpenter, this is not possible.

As a workaround, we have a lambda that will patch the cpu limit of the nodepool and set it to 0 so that no new Karpenter nodes will be provisioned. The lambda will then take care of deleting the previously provisioned Karpenter nodes. We have a mix of workloads running in the cluster with some using HPA and some not, so trying to scale down all of the deployments to remove the Karpenter provisioned nodes will not work. It has also been suggested to delete the nodepool and reapply it via a cronjob. This option will also not work since some of our clusters are in a controlled environment.

The ask here is to introduce a feature in Karpenter that will handle scaling down/up all Karpenter provisioned nodes on-demand via a flag or possibly with the update of the cpu limit, Karpenter will not provision any new nodes and will also clean up previously provisioned nodes without having to introduce additional cronjobs, lambdas, or deleting nodepools.

How important is this feature to you?
This feature is important as it will help with AWS cost savings by not having EC2 instances running during off hours and not having to add additional components (lambdas, cronjobs, etc...) to aid with scaling Karpenter provisioned instances.

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
If you are interested in working on this issue or have submitted a pull request, please leave a comment

jonathan-innis · 2024-04-22T06:16:03Z

jonathan-innis
Apr 22, 2024
Maintainer

Karpenter will not provision any new nodes and will also clean up previously provisioned nodes without having to introduce additional cronjobs, lambdas, or deleting nodepools

We've had some conversation about this among the maintainers. IMO, this features basically comes down to -- should we consolidate based on limits? If you apply a more restrictive limit to your NodePool, does that mean that you are implying that the NodePool should deprovision nodes until it gets back to complying with its limits.

IMO: This strikes me as an intuitive desired state mechanism -- you have set a new desired state on your NodePool -- implying that you no longer support a given capacity. Now comes the more difficult question: Should Karpenter force application pods off of your nodes unsafely if you have enforced stricter limits on your NodePool and those pods have nowhere else to schedule? This breaks current assumptions that we have around the safety of disruption -- that is, if we disrupt a node (unless it is due to spot interruption), we assume that we are doing so assuming that we can reschedule the existing pods on the node onto some other capacity (either existing or new). This feature would have us force delete pods regardless of whether they can schedule or not -- which starts to look a bit scary.

This option will also not work since some of our clusters are in a controlled environment

I know you mentioned that you can't delete the NodePool to spin down nodes but I'm curious what you mean by "controlled environment". Wouldn't updating the limits also cause similar changes to your cluster that I assume would also be subject to this "controlled environment?"

0 replies

ronberna · 2024-04-22T19:03:53Z

ronberna
Apr 22, 2024
Author

If you apply a more restrictive limit to your NodePool, does that mean that you are implying that the NodePool should deprovision nodes until it gets back to complying with its limits.

Yes, I believe this is what is being implied. If the cpu limit is set to 0, that would mean that we want to deprovision existing nodes, similar to setting the min/max/desired values to 0 for an ASG. Even if something similar to an ASG Scheduled Action was introduced to where I can create a configuration inside the NodePool to deprovision existing nodes and not spin up any additional nodes.

A flaw that we've uncovered with our current approach of using a lambda to patch the cpu limit to 0 and then delete existing Karpenter provisioned nodes is that if a node was provisioned right before the cpu limit was set and is now in the "NotReady" state, this node will not get cleaned up as it is not yet recognized as an active node and will remain running. We're having to come up with a solution to rerun the lambda multiple times to make sure nodes get cleaned up if this happens. We will not only have to delete the finalizer from the node before deleting from the cluster, but we will also have to terminate the node in AWS as a kubectl delete node will delete it from the cluster, but will not delete it from AWS. As long as the node is still in AWS, Karpenter will not provision a new node.

Should Karpenter force application pods off of your nodes unsafely if you have enforced stricter limits on your NodePool and those pods have nowhere else to schedule?

Yes. This is the behavior that currently happens for ASG's. Our pods will stay in a Pending state until the next workday when the ASG min/max/desired settings are updated to their previous work hour values. With no nodes running during non-work hours our savings are pretty significant.

I know you mentioned that you can't delete the NodePool to spin down nodes but I'm curious what you mean by "controlled environment".

By controlled environment we mean that certain changes to the environment will require going through change control (testing the change, creating change request, verifying test results, getting approvals to implement said request, implementing the change, verifying the change). Doing this daily is not feasible IMO. Yes, technically patching the limit is subject to the "controlled environment", but it's easier based on our current process to patch the cpu limit with a scheduled lambda function as opposed to deleting an entire k8s resource and having to go through the steps mentioned above in order to kick off a pipeline to get the resource re-applied. That's why the ask here is to have this feature built into Karpenter. If designed properly, IMO, this would be a huge win.

0 replies

cp1408 · 2024-05-24T18:52:15Z

cp1408
May 24, 2024

You can use below yaml to delete and create Karpenter nodes, Logic is to delete the nodepool on friday and re-create on sunday. i have tested this in non-prod and it is running without any issues from a while.

---
apiVersion: v1
kind: Namespace
metadata:
  labels:
    kubernetes.io/metadata.name: karpenter-cron
  name: karpenter-cron
---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    app: karpenter-cron
  name: karpenter-cron
  namespace: karpenter-cron
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: karpenter-cron
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch", "describe"]
    #
  - apiGroups: ["karpenter.sh"]                 
    resources: ["nodepools"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete", "describe"]
    #
  - apiGroups: ["batch"]                 
    resources: ["jobs", "cronjobs"]
    verbs: ["get", "list", "watch", "create", "describe"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: karpenter-cron
subjects:
- kind: ServiceAccount
  name: karpenter-cron
  namespace: karpenter-cron
roleRef:
  kind: ClusterRole
  name: karpenter-cron
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: karpenter-cron-cm
  namespace: karpenter-cron
data:
  karpenter-nodepool.yaml: |
    apiVersion: karpenter.sh/v1beta1
    kind: NodePool
    metadata:
      annotations:
      name: default
    spec:
      disruption:
        budgets:
        - nodes: 10%
        consolidationPolicy: WhenUnderutilized
        expireAfter: 720h
      limits:
        cpu: 1000
      template:
        spec:
          nodeClassRef:
            name: default
          requirements:
          - key: kubernetes.io/arch
            operator: In
            values:
            - amd64
          - key: kubernetes.io/os
            operator: In
            values:
            - linux
          - key: karpenter.k8s.aws/instance-category
            operator: In
            values:
            - t
            - r
            - m
            - c
          - key: karpenter.k8s.aws/instance-generation
            operator: Gt
            values:
            - "2"
          - key: karpenter.sh/capacity-type
            operator: In
            values:
            - on-demand
          - key: karpenter.k8s.aws/instance-cpu
            operator: In
            values:
            - "4"
            - "8"
            - "16"
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: karpenter-nodepool-delete-cron
  namespace: karpenter-cron
spec:
  schedule: "55 17 * * FRI"
  startingDeadlineSeconds: 20
  successfulJobsHistoryLimit: 1
  suspend: false
  jobTemplate:
    spec:
      completions: 1
      ttlSecondsAfterFinished: 10
      parallelism: 1
      completions: 1
      template:
        spec:
          containers:
          - name: karpenter-scale
            image: bitnami/kubectl:latest
            command:
            - /bin/bash
            - -c
            - |
              echo "List all the karpneter nodes"
              kubectl get nodes -l karpenter.sh/nodepool
              echo "List nodepool"
              kubectl get nodepool
              echo "Deleting NodePool"
              kubectl delete nodepool default
              sleep 5s
              echo "List all the karpneter nodes"
              kubectl get nodepool -A
              kubectl get nodes -l karpenter.sh/nodepool
              echo "script executed"
              echo "completed"
          restartPolicy: OnFailure
          serviceAccountName: karpenter-cron
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: karpenter-nodepool-create-cron
  namespace: karpenter-cron
spec:
  schedule: "55 17 * * SUN"
  startingDeadlineSeconds: 20
  successfulJobsHistoryLimit: 1
  suspend: false
  jobTemplate:
    spec:
      completions: 1
      ttlSecondsAfterFinished: 10
      parallelism: 1
      completions: 1
      template:
        spec:
          containers:
          - name: karpenter-scale
            image: bitnami/kubectl:latest
            command:
            - /bin/bash
            - -c
            - |
              echo "creating nodepool"
              kubectl apply -f /home/karpenter-nodepool.yaml
              echo "nodepool created"
              kubectl get nodepool -o yaml
              sleep 5s
            volumeMounts:
            - name: karpenter-nodepool
              mountPath: /home
          restartPolicy: OnFailure
          serviceAccountName: karpenter-cron
          volumes:
            - name: karpenter-nodepool
              configMap:
                name: karpenter-cron-cm

0 replies

ronberna · 2024-05-28T19:08:00Z

ronberna
May 28, 2024
Author

Unfortunately deleting and reapplying nodepool resources is not an option for us. What would be ideal, IMO, would to have something like the disruption budget schedule that we can set that would basically scale down all instances provisioned by a given nodepool

0 replies

felipewnp · 2024-06-11T21:05:00Z

felipewnp
Jun 11, 2024

I've stumbled upon this issue after doing the same thing @cp1408 suggested.

My cronjob does:

patch the nodepool limits.cpu to 0
wait 20 seconds
taint nodepool nodes with noExecute
wait 20 seconds
force delete all pods, not belonging to a daemonset, inside the nodepool nodes
force delete all nodepool nodes

I think the ideal scenario is something like this:

Set karpenter nodepool limits.cpu to 0
There would be a flag like driftConsolidation: soft / hard (to address what @jonathan-innis said:)

Should Karpenter force application pods off of your nodes unsafely if you have enforced stricter limits on your NodePool and those pods have nowhere else to schedule?

Karpenter starts a soft / hard drift consolidation

0 replies

alanliangdev · 2024-08-10T00:10:41Z

alanliangdev
Aug 10, 2024

Has anyone stumbled across a decent solution for partially shutting down Karpenter provisioned nodes when the definitions of the Karpenter and it's NodePools are defined in a GitOps tool like Argo CD which has self-healing and automated sync enabled. If I delete a NodePool, Argo CD will re-sync/re-create the NodePool object as it is defined in a GitHub repository.

One scenario we have been able to consider is terminating Argo CD prior to deleting the NodePool or even patching the NodePools limits.cpu to 0. This however doesn't come without flaws, as some of our consumers require granular/partial shutdown of their NodePools. For example, I want to shutdown NodePool A which is used by team A, but keep NodePool B up which is used by team B who need their nodes up. By terminating Argo CD, we effectively halt all ability for team B to perform GitOps related changes onto the cluster. But if we keep Argo CD up, then it will reconcile NodePool A which is not desired.

Another option would be to automate commits to our upstream GitHub repositories to comment out the NodePool specification, but was hoping to avoid this, as it will flood our GitHub repository commits with daily shutdown and startup commits.

Finally, we considered scaling down Deployments/StatefulSets/Jobs to allow for Karpenter to automatically shutdown NodePools, but again, given majority of the workloads are deployed via Argo CD which will reconcile the replica state (as most of our consumers define a hard-coded replica count in their GitHub repository). We would be left with the same problem above where we would either have to terminate Argo CD so it doesn't re-sync the workloads, or force all of our users to no longer define replica count in their workloads and rely on things like HPAs.

The most intuitive option seems to be directly committing changes to the GitHub repository that Argo CD watches but was wondering if anyone has faced similar issues and have any suggestions for alternative approaches to enable granular shutdown of Karpenter provisioned nodes.

0 replies

Roberdvs · 2024-08-12T09:57:14Z

Roberdvs
Aug 12, 2024

Has anyone stumbled across a decent solution for partially shutting down Karpenter provisioned nodes when the definitions of the Karpenter and it's NodePools are defined in a GitOps tool like Argo CD which has self-healing and automated sync enabled. If I delete a NodePool, Argo CD will re-sync/re-create the NodePool object as it is defined in a GitHub repository.

One scenario we have been able to consider is terminating Argo CD prior to deleting the NodePool or even patching the NodePools limits.cpu to 0. This however doesn't come without flaws, as some of our consumers require granular/partial shutdown of their NodePools. For example, I want to shutdown NodePool A which is used by team A, but keep NodePool B up which is used by team B who need their nodes up. By terminating Argo CD, we effectively halt all ability for team B to perform GitOps related changes onto the cluster. But if we keep Argo CD up, then it will reconcile NodePool A which is not desired.

Another option would be to automate commits to our upstream GitHub repositories to comment out the NodePool specification, but was hoping to avoid this, as it will flood our GitHub repository commits with daily shutdown and startup commits.

Finally, we considered scaling down Deployments/StatefulSets/Jobs to allow for Karpenter to automatically shutdown NodePools, but again, given majority of the workloads are deployed via Argo CD which will reconcile the replica state (as most of our consumers define a hard-coded replica count in their GitHub repository). We would be left with the same problem above where we would either have to terminate Argo CD so it doesn't re-sync the workloads, or force all of our users to no longer define replica count in their workloads and rely on things like HPAs.

The most intuitive option seems to be directly committing changes to the GitHub repository that Argo CD watches but was wondering if anyone has faced similar issues and have any suggestions for alternative approaches to enable granular shutdown of Karpenter provisioned nodes.

Check ArgoCD's sync windows.

We're currently using them to avoid the GitOps reconciliation when scaling down the Deployments off-hours like you mention, but you could also use them to prevent NodePool recreation if you handle those via GitOps.

0 replies

Pilotindream · 2024-08-21T16:20:44Z

Pilotindream
Aug 21, 2024

Hello @ronberna,
May your share some example of the Lambda that you mention here:
"As a workaround, we have a lambda that will patch the cpu limit of the nodepool and set it to 0 so that no new Karpenter nodes will be provisioned. The lambda will then take care of deleting the previously provisioned Karpenter nodes."

Am I right that it first set cpu on provisioner to 0 then delete nodes and then delete ec2 instances on AWS?

0 replies

felipewnp · 2024-08-21T23:20:18Z

felipewnp
Aug 21, 2024

@Pilotindream I can't say for @ronberna , but I do this as well and yes.

This is the right order.

I can bring you the shell script tomorrow.

In my case, I run it inside my kubernetes cluster as a cronjob, since I have a pair of nodes not managed by karpenter.

0 replies

Pilotindream · 2024-08-22T06:44:19Z

Pilotindream
Aug 22, 2024

@felipewnp, thanks for your reply. It would be nice if you can you me example of script. Will wait for your reply.
Also, how you dead with finalizer on nodes that you are deleting, since simple kubectl delete node is not working until I manually delete finalizer.
Thanks a lot again!

0 replies

barryib · 2024-08-22T07:52:32Z

barryib
Aug 22, 2024

@olsib wrote a great blog post on how to scale down to zero (for now) on staging environments https://aircall.io/blog/tech-team-stories/scale-karpenter-zero-optimize-costs/.

0 replies

felipewnp · 2024-08-22T17:16:03Z

felipewnp
Aug 22, 2024

@Pilotindream the link provided by @barryib is right, you can go from there!

0 replies

wa20221001 · 2024-09-05T06:01:06Z

wa20221001
Sep 5, 2024

@olsib wrote a great blog post on how to scale down to zero (for now) on staging environments https://aircall.io/blog/tech-team-stories/scale-karpenter-zero-optimize-costs/.

Thanks so much for this!, great read! I wonder how would you deal with making sure the CPU limit is in sync with git (especially if it is updated say every few days) ? We did a quick test and saw the cpu limit is never synced to git again (as expected). Is using namespace resource quotas enough in your use case?

0 replies

felipewnp · 2024-09-05T12:17:40Z

felipewnp
Sep 5, 2024

@olsib wrote a great blog post on how to scale down to zero (for now) on staging environments https://aircall.io/blog/tech-team-stories/scale-karpenter-zero-optimize-costs/.

Thanks so much for this!, great read! I wonder how would you deal with making sure the CPU limit is in sync with git (especially if it is updated say every few days) ? We did a quick test and saw the cpu limit is never synced to git again (as expected). Is using namespace resource quotas enough in your use case?

If you use gitops, in the script where you change the karpenter nodepool cpu limit, you could commit the changes to your git repo as well.

0 replies

olsib · 2024-09-05T12:33:10Z

olsib
Sep 5, 2024

@wa20221001 if you use ArgoCD you can use ignoreDifferencies as described in the blog post. Here is a snippet for you to review.

0 replies

k8s-triage-robot · 2025-03-17T17:31:46Z

k8s-triage-robot
Mar 17, 2025

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle stale
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

0 replies

ronberna · 2025-03-18T02:09:35Z

ronberna
Mar 18, 2025
Author

/remove-lifecycle stale

0 replies

ian-axelrod-sl · 2025-05-28T18:38:35Z

ian-axelrod-sl
May 28, 2025

Could this be achieved by temporarily updating all NodePools to have a NoExecute taint so that every deployment gets kicked off and Karpenter will eventually delete the nodes? You'd need drift enabled, but other than that (and no non-daemonset resources that tolerate all taints) I think there are no requirements. That's how I am thinking of doing this until there is a native solution from Karpenter

0 replies

ronberna · 2025-05-28T19:51:49Z

ronberna
May 28, 2025
Author

We have resources with NoExecute tolerations, so this approach wouldn't work for our use case.

0 replies

channyein87 · 2025-05-28T22:21:53Z

channyein87
May 28, 2025

I humbly request that future comments consider how the Karpenter would support scaling to the zero natively rather than proposing a workaround solution. This approach would facilitate the maintenance of Karpenter and enhance its support for this natively.

0 replies

jurgenweber · 2025-06-11T04:11:51Z

jurgenweber
Jun 11, 2025

How do you solve the chicken and egg problem of karpenter now been unschedulable because it has no nodes to run on? How do you bring the cluster back?

0 replies

ArieLevs · 2025-06-11T06:02:16Z

ArieLevs
Jun 11, 2025

How do you solve the chicken and egg problem of karpenter now been unschedulable because it has no nodes to run on? How do you bring the cluster back?

Karpetner never manages 100% of nodes in the cluster, this "problem" is also relevant to initial cluster setup, how do you spin up nodes, if karpenter controller cannot reach running state.

Personally we use EKS and alway have 1 (auto scaled), very small, managed node, that is taint and can host around 8 system critical pods when karpenter controller is one of them

0 replies

edify42 · 2025-07-08T11:51:55Z

edify42
Jul 8, 2025

Found this issue after a bit of a rabbit hole looking to replace our AWS ASG scaling (cluster autoscaler) with karpenter. We want turn off our worker nodes to save on costs when people are sleeping in our non-prod environments.

As others note, this has to be external to k8s to avoid the chicken and egg issue (which AWS ASGs deal with via scheduled actions).

I'm probably going with an AWS Event bridge (or another scheduler) + AWS Lambda approach that is deployed in the same VPC and has a custom role mapped to the aws-auth configmap to update the nodepools of interest and set the CPU to 0. Then find all EC2 instances in the VPC that match the label from the nodepool and issue a termination. I think we'll still have a minimal EC2 node(s) deployed via an ASG which we turn off with our existing pattern.

0 replies

pachecoc · 2025-07-08T12:04:01Z

pachecoc
Jul 8, 2025

If you want to try a different approach: Instead of reducing the number of nodes you can downscale all of your workloads and let the karpenter acts on reducing the number of nodes.

Right now I am testing the Kube Downscaler here. It pause jobs, reduce the number of replicas of your workloads to 0 during your defined period or hours. It is a active project and it may be worth checking it out.

Regards

0 replies

felipewnp · 2025-07-08T14:12:50Z

felipewnp
Jul 8, 2025

Ive done this. I wrote a script that: - Set all karpenter nodepools to 0. - Force remove all pods on all karpenter nodes - Remove all karpenter nodes. Em ter., 8 de jul. de 2025, 08:52, Ted Kim ***@***.***> escreveu:

…

*edify42* left a comment (kubernetes-sigs/karpenter#1177) <#1177 (comment)> Found this issue after a bit of a rabbit hole looking to replace our AWS ASG scaling (cluster autoscaler) with karpenter. We want turn off our worker nodes to save on costs when people are sleeping in our non-prod environments. As others note, this has to be external to k8s to avoid the chicken and egg issue (which AWS ASGs deal with via scheduled actions). I'm probably going with an AWS Event bridge (or another scheduler) + AWS Lambda approach that is deployed in the same VPC and has a custom role mapped to the aws-auth configmap to update the nodepools of interest and set the CPU to 0. Then find all EC2 instances in the VPC that match the label from the nodepool and issue a termination. I think we'll still have a minimal EC2 node(s) deployed via an ASG which we turn off with our existing pattern. — Reply to this email directly, view it on GitHub <#1177 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AIEQPKSLZB5RYHS2HOFNIXT3HOWHJAVCNFSM6AAAAABF5ZKOICVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZTANBYGU4TQMZXGA> . You are receiving this because you were mentioned.Message ID: ***@***.***>

0 replies

k8s-triage-robot · 2025-10-06T14:45:24Z

k8s-triage-robot
Oct 6, 2025

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle stale
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

0 replies

felipewnp · 2025-10-06T21:00:10Z

felipewnp
Oct 6, 2025

/remove-lifecycle stale

0 replies

k8s-triage-robot · 2026-01-04T21:12:12Z

k8s-triage-robot
Jan 4, 2026

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle stale
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

0 replies

Constantin07 · 2026-01-04T21:35:47Z

Constantin07
Jan 4, 2026

/remove-lifecycle stale

0 replies

jmdeal · 2026-02-12T00:30:46Z

jmdeal
Feb 12, 2026
Maintainer

Declarative, schedule-based scaling targets isn't something we currently consider in scope for the project. There have been a number of discussions about this in other issues and in working group meetings, and the consensus has been that the correct approach is to drive scale-down via workloads rather than via the node orchestrator. This previous comment summarizes that approach well:

If you want to try a different approach: Instead of reducing the number of nodes you can downscale all of your workloads and let the karpenter acts on reducing the number of nodes.

Since we don't currently consider this in scope we're going to convert this to a discussion. We believe this is a better format for gathering data to justify a feature as opposed to a linear issue.

0 replies

Ability to Scale Karpenter Provisioned Nodes To 0 On Demand Or By Schedule During Off Hours #2850

Uh oh!

Description

Replies: 34 comments

Uh oh!

jonathan-innis Apr 22, 2024 Maintainer

Uh oh!

ronberna Apr 22, 2024 Author

Uh oh!

Uh oh!

Uh oh!

ronberna May 28, 2024 Author

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ronberna Mar 18, 2025 Author

Uh oh!

Uh oh!

Uh oh!

ronberna May 28, 2025 Author

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jmdeal Feb 12, 2026 Maintainer

jonathan-innis
Apr 22, 2024
Maintainer

ronberna
Apr 22, 2024
Author

ronberna
May 28, 2024
Author

ronberna
Mar 18, 2025
Author

ronberna
May 28, 2025
Author

jmdeal
Feb 12, 2026
Maintainer