
set k8s registry QPS to match MAX_PODS #1801

Open
msporleder-work wants to merge 1 commit into awslabs:main from msporleder-work:main

Conversation

@msporleder-work

Issue #, if available:

Description of changes:

When nodes fail over in EKS, regardless of their size, the default RegistryPullQPS of 5 severely limits their ability to start up cleanly when running a cluster with more than a few pods.
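For context, the settings in question could be inspected like this. A minimal sketch: the sample file and values are illustrative (5 and 10 are the kubelet defaults); on EKS nodes the actual file path is what the bootstrap script's `$KUBELET_CONFIG` points at, not a temp file.

```shell
# Illustrative kubelet config fragment with the default pull limits.
KUBELET_CONFIG=$(mktemp)
cat > "$KUBELET_CONFIG" <<'EOF'
{
  "kind": "KubeletConfiguration",
  "registryPullQPS": 5,
  "registryBurst": 10
}
EOF
# registryPullQPS caps image pulls per second; registryBurst allows
# short spikes above that rate.
jq '{registryPullQPS, registryBurst}' "$KUBELET_CONFIG"
```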

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Testing Done

Node failovers/drains on my clusters constantly fail. Bumping the QPS with user-data solves it, but I think this is a better default.

See this guide for recommended testing for PRs. Some tests may not apply. Completing tests and providing additional validation steps are not required, but it is recommended and may reduce review time and time to merge.

if [[ "$USE_MAX_PODS" = "true" ]]; then
    echo "$(jq ".maxPods=$MAX_PODS" "$KUBELET_CONFIG")" > "$KUBELET_CONFIG"
    # set registryPullQPS to match MAX_PODS to prevent startup problems when nodes fail over
    # (--argjson, not --arg, so the value is written as a JSON number rather than a string)
    echo "$(jq --argjson MAX_PODS "$MAX_PODS" '. += {"registryPullQPS": $MAX_PODS}' "$KUBELET_CONFIG")" > "$KUBELET_CONFIG"
fi
Contributor


I think it makes sense to increase this beyond 5, but I don't think we want to go all the way to MAX_PODS; that's a very large value on many instance types. Have you tested a more moderate increase, something like 10 QPS + 15 burst (up from 5 + 10)?
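The more moderate bump suggested here could be sketched with the same jq pattern the PR uses. The specific values (10 and 15) come from this comment; the temp-file setup is illustrative only.

```shell
# Start from the current defaults (illustrative file).
KUBELET_CONFIG=$(mktemp)
echo '{"registryPullQPS": 5, "registryBurst": 10}' > "$KUBELET_CONFIG"
# Moderate increase instead of MAX_PODS: 10 QPS with a burst of 15.
echo "$(jq '. += {"registryPullQPS": 10, "registryBurst": 15}' "$KUBELET_CONFIG")" > "$KUBELET_CONFIG"
jq -r '.registryPullQPS, .registryBurst' "$KUBELET_CONFIG"
```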

Author


I actually just set it to 0 (disabled). containerd commonly times out for me when starting up ~50 pods at once, so I've also started to bump runtimeRequestTimeout.

I definitely think this number should be dynamic, and I was suggesting MAX_PODS as a proxy since, if a node is "full" and it fails, the next one to boot up will potentially get all of those pods assigned.
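A sketch of the workaround described above: setting registryPullQPS to 0 disables registry pull rate limiting entirely. The runtimeRequestTimeout value shown is an assumed example (the thread doesn't give one; the kubelet default is 2m), and the temp file stands in for the real config.

```shell
# Illustrative config with the kubelet defaults.
KUBELET_CONFIG=$(mktemp)
echo '{"registryPullQPS": 5, "runtimeRequestTimeout": "2m"}' > "$KUBELET_CONFIG"
# 0 disables the registry pull rate limit; the 10m timeout is an
# illustrative bump, not a value taken from this thread.
echo "$(jq '. += {"registryPullQPS": 0, "runtimeRequestTimeout": "10m"}' "$KUBELET_CONFIG")" > "$KUBELET_CONFIG"
jq -r '.registryPullQPS' "$KUBELET_CONFIG"
```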

@github-actions
Contributor

This pull request is stale because it has been open for 60 days with no activity. Remove the stale label or comment to avoid closure in 14 days.

@github-actions github-actions bot added the Stale label Nov 21, 2025