Update maxUnavailable calculation for leader StatefulSet #781

Merged
k8s-ci-robot merged 2 commits into kubernetes-sigs:main from adinilfeld:maxunavailable-0 on Mar 24, 2026

@adinilfeld
Contributor

@adinilfeld adinilfeld commented Mar 16, 2026

What type of PR is this?

/kind feature

What this PR does / why we need it

This PR sets the maxUnavailable value of the leader StatefulSet to lws.maxSurge + lws.maxUnavailable, rather than just lws.maxUnavailable. This has two main benefits:

  1. Users can now achieve zero-downtime updates by specifying maxUnavailable=0, as long as maxSurge>0. Previously, the leader StatefulSet would have rejected this value. (See Support maxUnavailable=0 #776.)

  2. Rolling updates now update pods in larger batches, so rollouts complete faster while still respecting the spec.

    For instance, the example given in the Rollout Strategy doc previously didn't work as documented. In Stage3, R-2 and R-3 would only begin updating after both surge pods were Ready, because the StatefulSet's maxUnavailable=2 was "used up" by the surge pods. The StatefulSet now has maxUnavailable=4 (maxSurge=2 + maxUnavailable=2), so it can update as intended. In my testing, this change reduced the rollout time by about 30% (from 1m48s to 1m15s).
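
To make the arithmetic above concrete, here is a minimal sketch of the old and new calculation. It is written in Python purely for illustration and is not the controller code from this PR; the function names are hypothetical, and both values are assumed to be already resolved to integers.

```python
def leader_max_unavailable_before(lws_max_unavailable: int) -> int:
    """Hypothetical helper: the old behaviour described above, where the
    leader StatefulSet simply received lws.maxUnavailable."""
    return lws_max_unavailable


def leader_max_unavailable(lws_max_surge: int, lws_max_unavailable: int) -> int:
    """Hypothetical helper: the new behaviour described above, where the
    leader StatefulSet receives lws.maxSurge + lws.maxUnavailable."""
    return lws_max_surge + lws_max_unavailable


if __name__ == "__main__":
    # (maxSurge, maxUnavailable) pairs taken from the scenarios in this PR.
    for surge, unavailable in [(2, 0), (2, 2)]:
        old = leader_max_unavailable_before(unavailable)
        new = leader_max_unavailable(surge, unavailable)
        print(f"maxSurge={surge} maxUnavailable={unavailable}: "
              f"StatefulSet maxUnavailable {old} -> {new}")
```

For maxSurge=2 and maxUnavailable=0, the StatefulSet value goes from 0 (which the PR says was rejected) to 2; for the Rollout Strategy doc example it goes from 2 to 4.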

In addition, I made some small tweaks to the logs for clarity.

Which issue(s) this PR fixes

Fixes #776.

Special notes for your reviewer

In addition to unit/e2e/integration tests, I verified the following scenarios manually:

maxUnavailable=0, maxSurge=2: once the two surge pods are Ready, the existing pods are updated two at a time.

Full events
3s          Normal   CreatingRevision                 leaderworkerset/lws-maxunavailable-0   Creating revision with key 6fdd78dd8d for updated LWS
3s          Normal   GroupsUpdating                   leaderworkerset/lws-maxunavailable-0   Rolling Upgrade is in progress, with 4 groups ready of total 4 groups
2s          Normal   GroupsUpdating                   leaderworkerset/lws-maxunavailable-0   Rolling Upgrade is in progress, with 4 groups ready of total 4 groups
3s          Normal   SuccessfulCreate                 statefulset/lws-maxunavailable-0       Create Pod lws-maxunavailable-0-5 in StatefulSet lws-maxunavailable-0 successful
3s          Normal   SuccessfulCreate                 statefulset/lws-maxunavailable-0       Create Pod lws-maxunavailable-0-4 in StatefulSet lws-maxunavailable-0 successful
2s          Normal   GroupsProgressing                leaderworkerset/lws-maxunavailable-0   Created worker statefulset for leader pod lws-maxunavailable-0-4
2s          Normal   GroupsProgressing                leaderworkerset/lws-maxunavailable-0   Created worker statefulset for leader pod lws-maxunavailable-0-5
0s          Normal   GroupsUpdating                   leaderworkerset/lws-maxunavailable-0   Updating replicas 2 to 3 (inclusive)
0s          Normal   SuccessfulDelete                 statefulset/lws-maxunavailable-0       Delete Pod lws-maxunavailable-0-3 in StatefulSet lws-maxunavailable-0 successful
0s          Normal   SuccessfulDelete                 statefulset/lws-maxunavailable-0       Delete Pod lws-maxunavailable-0-2 in StatefulSet lws-maxunavailable-0 successful
0s          Normal   SuccessfulCreate                 statefulset/lws-maxunavailable-0       Create Pod lws-maxunavailable-0-3 in StatefulSet lws-maxunavailable-0 successful
0s          Normal   GroupsProgressing                leaderworkerset/lws-maxunavailable-0   Created worker statefulset for leader pod lws-maxunavailable-0-3
0s          Normal   SuccessfulCreate                 statefulset/lws-maxunavailable-0       Create Pod lws-maxunavailable-0-2 in StatefulSet lws-maxunavailable-0 successful
0s          Normal   GroupsProgressing                leaderworkerset/lws-maxunavailable-0   Created worker statefulset for leader pod lws-maxunavailable-0-2
0s          Normal   GroupsUpdating                   leaderworkerset/lws-maxunavailable-0   Updating replica 1
0s          Normal   SuccessfulDelete                 statefulset/lws-maxunavailable-0       Delete Pod lws-maxunavailable-0-1 in StatefulSet lws-maxunavailable-0 successful
0s          Normal   GroupsUpdating                   leaderworkerset/lws-maxunavailable-0   Updating replica 0
0s          Normal   SuccessfulDelete                 statefulset/lws-maxunavailable-0       Delete Pod lws-maxunavailable-0-0 in StatefulSet lws-maxunavailable-0 successful
0s          Normal   SuccessfulCreate                 statefulset/lws-maxunavailable-0       Create Pod lws-maxunavailable-0-1 in StatefulSet lws-maxunavailable-0 successful
0s          Normal   GroupsProgressing                leaderworkerset/lws-maxunavailable-0   Created worker statefulset for leader pod lws-maxunavailable-0-1
0s          Normal   SuccessfulCreate                 statefulset/lws-maxunavailable-0       Create Pod lws-maxunavailable-0-0 in StatefulSet lws-maxunavailable-0 successful
0s          Normal   GroupsProgressing                leaderworkerset/lws-maxunavailable-0   Created worker statefulset for leader pod lws-maxunavailable-0-0
0s          Normal   GroupsProgressing                leaderworkerset/lws-maxunavailable-0   deleting surge replica lws-maxunavailable-0-5
0s          Normal   SuccessfulDelete                 statefulset/lws-maxunavailable-0       Delete Pod lws-maxunavailable-0-5 in StatefulSet lws-maxunavailable-0 successful
0s          Normal   GroupsProgressing                leaderworkerset/lws-maxunavailable-0   deleting surge replica lws-maxunavailable-0-4
0s          Normal   AllGroupsReady                   leaderworkerset/lws-maxunavailable-0   All replicas are ready, with 6 groups ready of total 4 groups
0s          Normal   SuccessfulDelete                 statefulset/lws-maxunavailable-0       Delete Pod lws-maxunavailable-0-4 in StatefulSet lws-maxunavailable-0 successful

maxUnavailable=2, maxSurge=2: the rollout order now matches the example given in the docs.

Full events
2m57s       Normal   CreatingRevision                 leaderworkerset/lws-rollout-doc-example   Creating revision with key 6fdd78dd8d for updated LWS
2m57s       Normal   GroupsUpdating                   leaderworkerset/lws-rollout-doc-example   Rolling Upgrade is in progress, with 4 groups ready of total 4 groups
2m57s       Normal   GroupsUpdating                   leaderworkerset/lws-rollout-doc-example   Updating replicas 2 to 3 (inclusive)
2m58s       Normal   SuccessfulCreate                 statefulset/lws-rollout-doc-example       Create Pod lws-rollout-doc-example-4 in StatefulSet lws-rollout-doc-example successful
2m58s       Normal   SuccessfulCreate                 statefulset/lws-rollout-doc-example       Create Pod lws-rollout-doc-example-5 in StatefulSet lws-rollout-doc-example successful
2m58s       Normal   SuccessfulDelete                 statefulset/lws-rollout-doc-example       Delete Pod lws-rollout-doc-example-3 in StatefulSet lws-rollout-doc-example successful
2m57s       Normal   GroupsProgressing                leaderworkerset/lws-rollout-doc-example   Created worker statefulset for leader pod lws-rollout-doc-example-4
2m58s       Normal   SuccessfulDelete                 statefulset/lws-rollout-doc-example       Delete Pod lws-rollout-doc-example-2 in StatefulSet lws-rollout-doc-example successful
2m57s       Normal   GroupsUpdating                   leaderworkerset/lws-rollout-doc-example   Rolling Upgrade is in progress, with 4 groups ready of total 4 groups
2m57s       Normal   GroupsProgressing                leaderworkerset/lws-rollout-doc-example   Created worker statefulset for leader pod lws-rollout-doc-example-5
2m54s       Normal   GroupsUpdating                   leaderworkerset/lws-rollout-doc-example   Updating replica 1
2m55s       Normal   SuccessfulDelete                 statefulset/lws-rollout-doc-example       Delete Pod lws-rollout-doc-example-1 in StatefulSet lws-rollout-doc-example successful
2m55s       Normal   SuccessfulDelete                 statefulset/lws-rollout-doc-example       Delete Pod lws-rollout-doc-example-0 in StatefulSet lws-rollout-doc-example successful
2m26s       Normal   GroupsProgressing                leaderworkerset/lws-rollout-doc-example   Created worker statefulset for leader pod lws-rollout-doc-example-2
2m25s       Normal   GroupsProgressing                leaderworkerset/lws-rollout-doc-example   Created worker statefulset for leader pod lws-rollout-doc-example-3
2m23s       Normal   GroupsProgressing                leaderworkerset/lws-rollout-doc-example   Created worker statefulset for leader pod lws-rollout-doc-example-0
2m22s       Normal   GroupsProgressing                leaderworkerset/lws-rollout-doc-example   Created worker statefulset for leader pod lws-rollout-doc-example-1
104s        Normal   GroupsProgressing                leaderworkerset/lws-rollout-doc-example   deleting surge replicas from lws-rollout-doc-example-4 to lws-rollout-doc-example-5
105s        Normal   SuccessfulDelete                 statefulset/lws-rollout-doc-example       Delete Pod lws-rollout-doc-example-5 in StatefulSet lws-rollout-doc-example successful
105s        Normal   SuccessfulDelete                 statefulset/lws-rollout-doc-example       Delete Pod lws-rollout-doc-example-4 in StatefulSet lws-rollout-doc-example successful
101s        Normal   AllGroupsReady                   leaderworkerset/lws-rollout-doc-example   All replicas are ready, with 6 groups ready of total 4 groups

Does this PR introduce a user-facing change?

NONE

@k8s-ci-robot added the kind/feature label Mar 16, 2026
@k8s-ci-robot requested a review from kerthcet March 16, 2026 21:02
netlify bot commented Mar 16, 2026

Deploy Preview for kubernetes-sigs-lws canceled.

Latest commit: 0aa7d94
Latest deploy log: https://app.netlify.com/projects/kubernetes-sigs-lws/deploys/69c315aa19fd730008fc2d93

@k8s-ci-robot requested a review from yankay March 16, 2026 21:02
@k8s-ci-robot added the cncf-cla: yes and size/XL labels Mar 16, 2026
@adinilfeld marked this pull request as draft March 16, 2026 21:03
@k8s-ci-robot added the do-not-merge/work-in-progress label Mar 16, 2026
@adinilfeld marked this pull request as ready for review March 16, 2026 21:29
@k8s-ci-robot removed the do-not-merge/work-in-progress label Mar 16, 2026
@k8s-ci-robot requested a review from Edwinhr716 March 16, 2026 21:29
@adinilfeld marked this pull request as draft March 16, 2026 21:39
@k8s-ci-robot added the do-not-merge/work-in-progress label Mar 16, 2026
@adinilfeld
Contributor Author

/test pull-lws-test-integration-main

@adinilfeld marked this pull request as ready for review March 16, 2026 22:00
@k8s-ci-robot removed the do-not-merge/work-in-progress label Mar 16, 2026
@k8s-ci-robot requested a review from ahg-g March 16, 2026 22:00
@adinilfeld
Contributor Author

/test pull-lws-test-integration-main

@Edwinhr716
Contributor

/lgtm

Will approve once #778 is merged

@k8s-ci-robot added the lgtm and needs-rebase labels Mar 21, 2026
@k8s-ci-robot added the size/L label and removed the lgtm, needs-rebase, and size/XL labels Mar 21, 2026
@adinilfeld force-pushed the maxunavailable-0 branch 2 times, most recently from e48ee1f to 593e598 March 21, 2026 00:23
@adinilfeld
Contributor Author

/test pull-lws-test-integration-main

@k8s-ci-robot added the needs-rebase and size/XL labels and removed the size/L label Mar 23, 2026
@k8s-ci-robot added the size/L label and removed the needs-rebase and size/XL labels Mar 23, 2026
Fix maxSurge calculation to be capped at lws.replicas, and evaluate percentages against sts.replicas rather than lws.replicas. Also add unit tests to prevent regressions.
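
One possible reading of this commit message, sketched in Python for illustration only. The commit does not spell out the rounding rule or exactly which fields are scaled, so the round-up behaviour, the function names, and the example replica counts below are all assumptions rather than the PR's actual code.

```python
def resolve_int_or_percent(value, base: int, round_up: bool) -> int:
    """Hypothetical helper: turn an int or an 'NN%' string into an absolute
    count, scaling percentages against `base`."""
    if isinstance(value, str) and value.endswith("%"):
        scaled = int(value[:-1]) * base
        # Integer ceiling or floor division by 100, avoiding float rounding.
        return -(-scaled // 100) if round_up else scaled // 100
    return int(value)


def capped_max_surge(max_surge, lws_replicas: int, sts_replicas: int) -> int:
    """Hypothetical helper: evaluate a percentage against sts.replicas, then
    cap the result at lws.replicas. Rounding up is an assumption."""
    resolved = resolve_int_or_percent(max_surge, sts_replicas, round_up=True)
    return min(resolved, lws_replicas)


if __name__ == "__main__":
    # Assumed example: 4 LWS replicas, StatefulSet currently scaled to 6.
    for surge in [2, 10, "50%", "100%"]:
        print(surge, "->", capped_max_surge(surge, lws_replicas=4, sts_replicas=6))
```

Under these assumptions, an absolute maxSurge of 10 and a maxSurge of "100%" both resolve to 4, the LWS replica count, rather than exceeding it.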
@Edwinhr716
Contributor

/lgtm
/approve

@k8s-ci-robot added the lgtm label Mar 24, 2026
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: adinilfeld, Edwinhr716

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved label Mar 24, 2026
@k8s-ci-robot merged commit d4d6b17 into kubernetes-sigs:main Mar 24, 2026
17 checks passed
Labels

approved, cncf-cla: yes, kind/feature, lgtm, size/L

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support maxUnavailable=0

3 participants