keps/sig-apps/961-maxunavailable-for-statefulset/README.md (27 additions, 25 deletions)
@@ -445,7 +445,9 @@ New proposed implementation: https://github.com/kubernetes/kubernetes/pull/13090
#### Metrics

-We'll add a new metric named `statefulset_unavailability_violation`, it tracks how many violations are detected while processing StatefulSets with maxUnavailable > 1, (counter goes up if processed StatefulSet has spec.replicas - status.readyReplicas > maxUnavailable)
+We'll add two new metrics:
+- **statefulset_max_unavailable**: tracks the current `.spec.updateStrategy.rollingUpdate.maxUnavailable` value. This gauge reflects the configured maximum number of pods that can be unavailable during rolling updates, providing visibility into the availability constraints.
+- **statefulset_unavailable_replicas**: tracks the current number of unavailable pods in a StatefulSet. This gauge reflects the real-time count of pods that are either missing or unavailable (i.e., not ready for `.spec.minReadySeconds`).

### Test Plan
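As a rough illustration of what these gauges could look like in the controller, here is a minimal sketch using the `k8s.io/component-base/metrics` helpers. The metric names come from the KEP text above; the subsystem, label set, and registration site are assumptions rather than the actual implementation.

```go
package metrics

import (
	"k8s.io/component-base/metrics"
	"k8s.io/component-base/metrics/legacyregistry"
)

var (
	// maxUnavailable reports the configured
	// .spec.updateStrategy.rollingUpdate.maxUnavailable per StatefulSet.
	maxUnavailable = metrics.NewGaugeVec(
		&metrics.GaugeOpts{
			Subsystem:      "statefulset", // assumed subsystem
			Name:           "max_unavailable",
			Help:           "Configured maxUnavailable of the StatefulSet rolling update strategy.",
			StabilityLevel: metrics.ALPHA,
		},
		[]string{"namespace", "name"}, // assumed per-object labels
	)

	// unavailableReplicas reports how many pods are currently missing or
	// not available for .spec.minReadySeconds.
	unavailableReplicas = metrics.NewGaugeVec(
		&metrics.GaugeOpts{
			Subsystem:      "statefulset",
			Name:           "unavailable_replicas",
			Help:           "Current number of unavailable pods in the StatefulSet.",
			StabilityLevel: metrics.ALPHA,
		},
		[]string{"namespace", "name"},
	)
)

// register exposes both gauges on the kube-controller-manager /metrics endpoint.
func register() {
	legacyregistry.MustRegister(maxUnavailable)
	legacyregistry.MustRegister(unavailableReplicas)
}
```

An operator would then compare `statefulset_unavailable_replicas` against `statefulset_max_unavailable` for the same StatefulSet to spot rollouts that exceed the configured budget.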
@@ -545,6 +547,7 @@ We expect no non-infra related flakes in the last month as a GA graduation crite
- test that rolling updates are working correctly for both PodManagementPolicy types when the MaxUnavailable is used.
- include a test that fails currently but passes when https://github.com/kubernetes/kubernetes/issues/112307 is fixed, with a
StatefulSet setting `minReadySeconds` and `updateStrategy.rollingUpdate.maxUnavailable` and checking for a correct rollout, especially when scaling down during a rollout.

-- It is necessary to update the firstUnhealthyPod calculation to correctly call processCondemned. New tests should cover this and take into consideration that the controller should first wait for the predecessor condemned pods to become available before deleting them and delete the pod with the highest ordinal number
+- Added `statefulset_max_unavailable` and `statefulset_unavailable_replicas` metrics in-tree.
+- It is necessary to update the `firstUnhealthyPod` calculation to correctly call processCondemned. New tests should cover this and take into consideration that the controller should first wait for the predecessor condemned pods to become available before deleting them, and should delete the pod with the highest ordinal number.
- minReadySeconds and maxUnavailable bugs https://github.com/kubernetes/kubernetes/issues/123911, https://github.com/kubernetes/kubernetes/issues/112307, https://github.com/kubernetes/kubernetes/issues/119234 and https://github.com/kubernetes/kubernetes/issues/123918 should be fixed before promotion of maxUnavailable.
- Additional unit/e2e/integration tests listed in the test plan should be added covering the newly found bugs.
-- Users should be warned that maxUnavailable works differently for each podManagementPolicy (E.g for OrderedReady it is not applied until the StatefulSet had a chance to fully scale up). This can result in slower rollouts. For parallel this can skip ordering. This should be both mentioned in the API doc and website as a requirements for beta graduation.
+- Users should be warned that maxUnavailable works differently for each podManagementPolicy (e.g. for `OrderedReady` it is not applied until the StatefulSet has had a chance to fully scale up). This can result in slower rollouts. For `Parallel` it can skip ordering. This should be mentioned in both the API doc and the website as a requirement for beta graduation.

#### GA
@@ -743,7 +746,7 @@ mid-rollout?
Be sure to consider highly-available clusters, where, for example,
feature flags will be enabled on some API servers and not others during the
rollout. Similarly, consider large clusters and how enablement/disablement
-will rollout across nodes.
+will roll out across nodes.
-->

The rollout or rollback of the `maxUnavailable` feature for StatefulSets primarily affects how updates are managed, aiming to minimize disruptions. However, several scenarios could lead to potential issues:
@@ -789,28 +792,28 @@ Multiple violations of maxUnavailable might indicate issues with feature behavio
A manual test was performed, as follows:

1. Create a cluster in 1.33.
-2. Upgrade to 1.34.
+2. Upgrade to 1.35.
3. Create StatefulSet A with spec.updateStrategy.rollingUpdate.maxUnavailable set to 3, with 6 replicas (a sketch of this object appears after these test steps)
4. Verify a rollout and check if only 3 pods are unavailable at a time ([currently with a bug if podManagementPolicy is set to Parallel](https://github.com/kubernetes/kubernetes/issues/112307))
5. Downgrade to 1.33.
6. Verify that the rollout only has 1 pod unavailable at a time, similar to setting maxUnavailable to 1
7. Create another StatefulSet B not setting maxUnavailable (leaving it nil)
-8. Upgrade to 1.34.
+8. Upgrade to 1.35.
9. Verify that the rollout has default behavior of only having one pod unavailable at a time
Verify that the `maxUnavailable` can be set again to StatefulSet A and test the rollout behavior

TODO:
A manual test will be performed, as follows:

1. Create a cluster in 1.33.
-2. Upgrade to 1.34.
+2. Upgrade to 1.35.
3. Create StatefulSet A with spec.updateStrategy.rollingUpdate.maxUnavailable set to 3, with 6 replicas
4. Verify a rollout and check if only 3 pods are unavailable at a time
5. Check if rollout is also fine with podManagementPolicy set to Parallel
6. Downgrade to 1.33.
7. Verify that the rollout only has 1 pod unavailable at a time, similar to setting maxUnavailable to 1 (MaxUnavailableStatefulSet feature gate disabled by default).
8. Create another StatefulSet B not setting maxUnavailable (leaving it nil)
-9. Upgrade to 1.34.
+9. Upgrade to 1.35.
10. Verify that the rollout has default behavior of only having one pod unavailable at a time
Verify that the `maxUnavailable` can be set again to StatefulSet A and test the rollout behavior
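For reference, StatefulSet A used in the test steps above could be constructed roughly as follows. This is a sketch using client-go types under stated assumptions: the object name, namespace, labels, service name, and container image are placeholders, and the `MaxUnavailable` field only takes effect when the `MaxUnavailableStatefulSet` feature gate is enabled.

```go
package manualtest

import (
	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
	"k8s.io/utils/ptr"
)

// statefulSetA mirrors the manual test setup: 6 replicas, at most 3 of
// which may be unavailable during a rolling update.
func statefulSetA() *appsv1.StatefulSet {
	maxUnavailable := intstr.FromInt32(3)
	labels := map[string]string{"app": "statefulset-a"} // placeholder selector labels

	return &appsv1.StatefulSet{
		ObjectMeta: metav1.ObjectMeta{Name: "statefulset-a", Namespace: "default"},
		Spec: appsv1.StatefulSetSpec{
			Replicas:    ptr.To(int32(6)),
			ServiceName: "statefulset-a", // placeholder headless service
			Selector:    &metav1.LabelSelector{MatchLabels: labels},
			// Parallel exercises the podManagementPolicy variant called out in
			// the test steps; drop this line for the default OrderedReady.
			PodManagementPolicy: appsv1.ParallelPodManagement,
			UpdateStrategy: appsv1.StatefulSetUpdateStrategy{
				Type: appsv1.RollingUpdateStatefulSetStrategyType,
				RollingUpdate: &appsv1.RollingUpdateStatefulSetStrategy{
					MaxUnavailable: &maxUnavailable,
				},
			},
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{Labels: labels},
				Spec: corev1.PodSpec{
					Containers: []corev1.Container{
						{Name: "app", Image: "registry.k8s.io/pause:3.9"}, // placeholder image
					},
				},
			},
		},
	}
}
```

StatefulSet B from the same steps is identical except that `RollingUpdate.MaxUnavailable` is left nil, which falls back to the default behavior of one unavailable pod at a time.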
@@ -822,8 +825,8 @@ No
###### How can an operator determine if the feature is in use by workloads?

-If their StatefulSet rollingUpdate section has the field maxUnavailable specified with
-a value different than 1. While in alpha and beta, the feature-gate needs to be enabled.
+If their StatefulSet rollingUpdate section has the field `maxUnavailable` specified with
+a value different from 1. While in alpha and beta, the feature-gate needs to be enabled.

The command below should show the maxUnavailable value:

  - Other field: .spec.updateStrategy.rollingUpdate.maxUnavailable
- [X] Other (treat as last resort)
-  - Details: Users can view the `statefulset_unavailability_violation` metric to see if there have been instances
+  - Details: Users can view the `statefulset_unavailable_replicas` or `statefulset_max_unavailable` metrics to see if there have been instances
where the feature is not working as intended.

###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?
@@ -861,7 +864,7 @@ question.
Startup latency of schedulable stateful pods should follow the [existing latency SLOs](https://github.com/kubernetes/community/blob/master/sig-scalability/slos/slos.md#steady-state-slisslos).

-The total number of `statefulset_unavailability_violation` increments across all StatefulSets must not exceed 5 over a 28-day rolling window.
+For each StatefulSet, `statefulset_unavailable_replicas` must not exceed `statefulset_max_unavailable`.

###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
@@ -883,13 +886,12 @@ Pick one more of these and delete the rest.
- Metric name: `workqueue_work_duration_seconds`
- Scope: Observes the time taken to process StatefulSet operations from the work queue.
- Components exposing the metric: `kube-controller-manager`
-- Metric name: `workqueue_retries_total`
-
-- Scope: Counts the total number of retries for StatefulSet update operations within the work queue. This metric provides insight into the stability and reliability of the StatefulSet update process, indicating potential issues when high.
-- Components Exposing the Metric: `kube-controller-manager`
+- Metric name: `workqueue_retries_total`
+- Scope: Counts the total number of retries for StatefulSet update operations within the work queue. This metric provides insight into the stability and reliability of the StatefulSet update process, indicating potential issues when high.
+- Components Exposing the Metric: `kube-controller-manager`
-- Scope: Counts the number of times maxUnavailable has been violated (i.espec.replicas - availableReplicas > maxUnavailable).
+- Scope: Counts the number of times maxUnavailable has been violated (i.e. `.spec.replicas` - availableReplicas > maxUnavailable; see the sketch below this list).
- Components Exposing the Metric: `kube-controller-manager`
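Spelled out in code, the violation condition above is just a comparison between the number of unavailable replicas and the resolved maxUnavailable value. The following is an illustrative sketch rather than the controller's actual logic; in particular, the defaulting to 1 and the percentage resolution via `GetScaledValueFromIntOrPercent` are assumptions.

```go
package statefulset

import (
	appsv1 "k8s.io/api/apps/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

// exceedsMaxUnavailable reports whether a StatefulSet currently violates its
// rolling update budget, i.e. whether
// .spec.replicas - .status.availableReplicas > maxUnavailable.
func exceedsMaxUnavailable(sts *appsv1.StatefulSet) (bool, error) {
	replicas := int32(1)
	if sts.Spec.Replicas != nil {
		replicas = *sts.Spec.Replicas
	}

	// Assume a default of 1 when maxUnavailable is unset, matching the
	// behavior described for StatefulSets that leave the field nil.
	maxUnavailable := 1
	if ru := sts.Spec.UpdateStrategy.RollingUpdate; ru != nil && ru.MaxUnavailable != nil {
		resolved, err := intstr.GetScaledValueFromIntOrPercent(ru.MaxUnavailable, int(replicas), false)
		if err != nil {
			return false, err
		}
		maxUnavailable = resolved
	}

	unavailable := int(replicas - sts.Status.AvailableReplicas)
	return unavailable > maxUnavailable, nil
}
```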
###### Are there any missing metrics that would be useful to have to improve observability of this feature?
@@ -938,7 +940,7 @@ No.
###### How does this feature react if the API server and/or etcd is unavailable?

The RollingUpdate will fail or will not be able to proceed if etcd or API server is unavailable and
-hence this feature will also be not be able to be used.
+hence this feature will also not be able to be used.

###### What are other known failure modes?
@@ -957,7 +959,7 @@ For each of them, fill in the following information by copying the below templat
- Incorrect Handling of minReadySeconds During StatefulSet Updates with Parallel Pod Management
  - Detection:
-    - Monitor the `statefulset_unavailability_violation` metric of the StatefulSet during rolling updates. A large value of this metric could indicate the issue.
+    - Monitor the `statefulset_unavailable_replicas` and `statefulset_max_unavailable` metrics of the StatefulSet during rolling updates. `statefulset_unavailable_replicas` persistently exceeding `statefulset_max_unavailable` could indicate the issue.
    - Review StatefulSet events or controller logs for rapid succession of pod updates without adherence to minReadySeconds, which could confirm that the delay is not being respected.
  - Mitigations:
    - Temporarily adjust the podManagementPolicy to OrderedReady as a workaround to ensure minReadySeconds is respected during updates, though this may slow down the rollout process.
@@ -975,10 +977,10 @@ For each of them, fill in the following information by copying the below templat
- 2019-01-01: KEP created.
- 2019-08-30: PR Implemented with tests covered.
-<<[UNRESOLVED bugs found in alpha and blockers to promotion @knelasevero @atiratree @bersalazar @leomichalski]>>
-Open PRs: https://github.com/kubernetes/kubernetes/pull/130909, https://github.com/kubernetes/kubernetes/pull/130951
-<<[/UNRESOLVED]>>
-- 2025-XX-XX: Bump to Beta.
+- bugs found in alpha and blockers to promotion @knelasevero @atiratree @bersalazar @leomichalski