feat: add HPA-based pod autoscaling to okd deployments#8798

Open
skettkepalli wants to merge 1 commit into EclipseFdn:main from skettkepalli:feat/implement_auto_scaling

Conversation

@skettkepalli

Summary of Changes:

  1. Adds a HorizontalPodAutoscaler Helm template using autoscaling/v2, enabling CPU and memory-based pod autoscaling for the openvsx deployment
  2. Introduces a platform value to the values files so HPA resources are named unambiguously (e.g. open-vsx-org-staging-okd-hpa) and future EKS autoscaling can coexist without naming conflicts
  3. Conditionally omits replicas from the Deployment spec when autoscaling is enabled, preventing Helm upgrades from overriding the replica count managed by the HPA
  4. Autoscaling is enabled on staging (1–3 replicas) as a low-risk pilot, and disabled on production and test
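
A minimal sketch of what points 1 and 3 could look like in the Helm templates. Value names such as `.Values.autoscaling.enabled`, `.Values.platform`, and `.Values.replicas` are assumptions based on the description above, not necessarily the names used in this PR:

```yaml
# templates/deployment.yaml (sketch) — omit replicas when the HPA owns the count,
# so a `helm upgrade` does not reset the replica count the HPA has chosen
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-{{ .Values.platform }}
spec:
  {{- if not .Values.autoscaling.enabled }}
  replicas: {{ .Values.replicas }}
  {{- end }}
  # selector / template omitted for brevity
```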

Autoscaling behaviour

Scale-up: reacts immediately, doubling the fleet size every 15 s to absorb traffic spikes quickly.
Scale-down: Conservative — uses a scaleDownWindow stabilisation window and removes at most 10% of pods per minute, preventing flapping on brief traffic dips.
Triggers: CPU ≥ 70% or memory ≥ 80% average utilization across pods.
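
The behaviour described above maps onto the `behavior` and `metrics` stanzas of `autoscaling/v2`. A sketch, assuming value names like `.Values.autoscaling.minReplicas` and a `scaleDownWindow` value (field names follow the `autoscaling/v2` API; the Helm value names are assumptions):

```yaml
# templates/hpa.yaml (sketch) — numbers match the behaviour described above
{{- if .Values.autoscaling.enabled }}
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: {{ .Release.Name }}-{{ .Values.platform }}-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ .Release.Name }}-{{ .Values.platform }}
  minReplicas: {{ .Values.autoscaling.minReplicas }}   # 1 on staging
  maxReplicas: {{ .Values.autoscaling.maxReplicas }}   # 3 on staging
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70          # CPU trigger
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80          # memory trigger
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 0       # react immediately
      policies:
        - type: Percent
          value: 100                      # double the fleet size…
          periodSeconds: 15               # …every 15 s
    scaleDown:
      stabilizationWindowSeconds: {{ .Values.autoscaling.scaleDownWindow }}
      policies:
        - type: Percent
          value: 10                       # remove at most 10% of pods…
          periodSeconds: 60               # …per minute
{{- end }}
```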

@netomi
Contributor

netomi commented Mar 12, 2026

can we enable scaling on number of requests or open connections?

what we have seen is that CPU is underutilized while connections are rejected, because open connections cannot be completed as they are blocked on IO.

@netomi
Contributor

netomi commented Mar 12, 2026

memory is not a good indication to scale on; the pods are usually at max memory.

@manpreetkaur-arch
Contributor

My 2 cents: an underutilized CPU with threads starving for connections is a clear indication that we don't need to scale up; we should fine-tune the thread pool and connection pool instead. Spinning up more replicas will do more harm than good in this case.

@skettkepalli
Author

> can we enable scaling on number of requests or open connections?
>
> what we have seen is that cpu is under utilized while connections are rejected, as open connections could not be completed as they are being blocked on IO.

Scaling based on thread count or connection metrics is technically possible, but it requires KEDA (Kubernetes Event-Driven Autoscaling) or a custom metrics adapter, since the standard Horizontal Pod Autoscaler (HPA) only supports CPU and memory metrics out of the box.

At the moment there are a couple of open questions:

  - Whether the Custom Metrics Autoscaler / KEDA operator is installed in our OKD cluster.
  - Whether KEDA can query our existing remote Prometheus endpoint to retrieve the required metrics.

Additionally, KEDA uses its own resource type (ScaledObject) to define autoscaling behavior. This means it cannot reuse a standalone HPA directly; instead the scaling configuration would need to be defined through a KEDA ScaledObject, which internally manages the HPA.
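
If those questions are resolved, a connection-based trigger could look roughly like this. This is only a sketch: the metric name, PromQL query, Prometheus address, and threshold are all hypothetical and would need to match whatever openvsx actually exposes; the KEDA `prometheus` trigger fields (`serverAddress`, `query`, `threshold`) are real:

```yaml
# KEDA ScaledObject (sketch) — KEDA creates and manages the underlying HPA itself,
# so this replaces a standalone HPA rather than wrapping one
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: open-vsx-org-staging-okd-scaler
spec:
  scaleTargetRef:
    name: openvsx
  minReplicaCount: 1
  maxReplicaCount: 3
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.example:9090           # assumed remote endpoint
        query: sum(tomcat_connections_active_total{app="openvsx"})  # hypothetical metric
        threshold: "200"                                         # assumed per-replica target
```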

@netomi
Contributor

netomi commented Mar 12, 2026

Indeed, as I mentioned, we first need a better understanding of what connection values are good for our setup. However, we will also need some way of scaling on top of that, perhaps a more conservative one.

