
Commit 0d29502

Merge pull request #4029 from xing-yang/non-graceful-node-shutdown-ga
KEP-2268: move non-graceful node shutdown to GA
2 parents 87e3a35 + 94314b9 commit 0d29502

3 files changed: +74 -24 lines changed


keps/prod-readiness/sig-storage/2268.yaml

Lines changed: 2 additions & 0 deletions
@@ -2,4 +2,6 @@ kep-number: 2268
 alpha:
   approver: "@deads2k"
 beta:
+  approver: "@deads2k"
+stable:
   approver: "@deads2k"
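
For reference, the whole prod-readiness file after this change reads as follows; this is reconstructed from the two added lines and the unchanged lines shown in the hunk above, including the `kep-number` line quoted in the hunk header:

```yaml
# keps/prod-readiness/sig-storage/2268.yaml after this commit (reconstructed from the diff)
kep-number: 2268
alpha:
  approver: "@deads2k"
beta:
  approver: "@deads2k"
stable:
  approver: "@deads2k"
```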

keps/sig-storage/2268-non-graceful-shutdown/README.md

Lines changed: 60 additions & 21 deletions
@@ -1,6 +1,4 @@
-# Non graceful node shutdown
-
-This includes the Summary and Motivation sections.
+# KEP-2268: Non graceful node shutdown
 
 ## Table of Contents
 
@@ -41,20 +39,20 @@ This includes the Summary and Motivation sections.
 ## Release Signoff Checklist
 
 Items marked with (R) are required *prior to targeting to a milestone / release*.
-- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
-- [ ] (R) KEP approvers have approved the KEP status as `implementable`
-- [ ] (R) Design details are appropriately documented
-- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
-- [ ] e2e Tests for all Beta API Operations (endpoints)
+- [X] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
+- [X] (R) KEP approvers have approved the KEP status as `implementable`
+- [X] (R) Design details are appropriately documented
+- [X] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
+- [X] e2e Tests for all Beta API Operations (endpoints)
 - [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
 - [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
-- [ ] (R) Graduation criteria is in place
+- [X] (R) Graduation criteria is in place
 - [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
-- [ ] (R) Production readiness review completed
-- [ ] (R) Production readiness review approved
-- [ ] "Implementation History" section is up-to-date for milestone
-- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
-- [ ] Supporting documentation - e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
+- [X] (R) Production readiness review completed
+- [X] (R) Production readiness review approved
+- [X] "Implementation History" section is up-to-date for milestone
+- [X] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
+- [X] Supporting documentation - e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
 
 **Note:** Any PRs to move a KEP to `implementable` or significant changes once it is marked `implementable` should be approved by each of the KEP approvers. If any of those
 approvers is no longer appropriate then changes to that list should be approved by the remaining approvers and/or the owning SIG (or SIG-arch for cross-cutting KEPs).
@@ -146,7 +144,7 @@ To mitigate this we plan to have a high test coverage and to introduce this enha
 
 ### Test Plan
 
-[x] I/we understand the owners of the involved components may require updates to
+[X] I/we understand the owners of the involved components may require updates to
 existing tests to make this code solid enough prior to committing the changes necessary
 to implement this enhancement.
 
@@ -386,20 +384,38 @@ logs or events for this purpose.
 The usage of this feature requires the manual step of applying a taint
 so the operator should be the one applying it.
 
+###### How can someone using this feature know that it is working for their instance?
+
+<!--
+For instance, if this is a pod-related feature, it should be possible to determine if the feature is functioning properly
+for each individual pod.
+Pick one or more of these and delete the rest.
+Please describe all items visible to end users below with sufficient detail so that they can verify correct enablement
+and operation of this feature.
+Recall that end users cannot usually observe component logs or access metrics.
+-->
+
+- [X] API .status
+  If it works, pods in the stateful workload should be re-scheduled to another
+  running node. `Phase` in Pod `Status` should be `Running` for a new Pod
+  on the other running node.
+  If not, check the pod status to see why it does not come up.
+
 ###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
 
 <!--
 Pick one or more of these and delete the rest.
 -->
 - [X] Metrics
   - Metric name:
-    - We can add new metrics `deleting_pods_total`, `deleting_pods_error_total`
-      in Pod GC Controller.
-      For Attach Detach Controller, there's already a metric:
-      attachdetach_controller_forced_detaches
-      It is also useful to know how many nodes have taints. We can explore with [kube-state-metrics](https://github.com/kubernetes/kube-state-metrics) which generates metrics about the state of the objects.
+    - New metrics are added in Pod GC Controller:
+      - `force_delete_pods_total{reason="out-of-service|terminated|orphaned|unscheduled"}`, the number of pods that have been forcefully deleted since the Pod GC Controller started.
+      - `force_delete_pod_errors_total{reason="out-of-service|terminated|orphaned|unscheduled"}`, the number of errors encountered when forcefully deleting pods since the Pod GC Controller started.
+    - For Attach Detach Controller, the following metric is recorded if a force detach is performed because the node has the `out-of-service` taint or a timeout happens:
+      - `attachdetach_controller_forced_detaches{reason="out-of-service|timeout"}`, the number of times the Attach Detach Controller performed a forced detach.
+    - There is also a `kube_node_spec_taint` metric in [kube-state-metrics](https://github.com/kubernetes/kube-state-metrics/blob/main/docs/node-metrics.md) that reports the taints on each Kubernetes cluster node.
   - [Optional] Aggregation method:
-  - Components exposing the metric:
+  - Components exposing the metric: kube-controller-manager
 - [X] Other (treat as last resort)
   - Details:
     - Check whether the workload moved to a different running node
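
The "manual step of applying a taint" in the hunk above refers to the `out-of-service` taint on the shut-down Node. As a minimal sketch, assuming the taint key and value documented for this feature (`node.kubernetes.io/out-of-service` with value `nodeshutdown` and effect `NoExecute`) and a hypothetical node name, the taint appears on the Node object like this:

```yaml
# Sketch only: a shut-down node carrying the out-of-service taint.
# The node name is hypothetical; the taint key, value, and effect follow the feature's documented usage.
apiVersion: v1
kind: Node
metadata:
  name: worker-1
spec:
  taints:
    - key: node.kubernetes.io/out-of-service
      value: nodeshutdown
      effect: NoExecute
```

Once the taint is present, the `force_delete_pods_total` and `attachdetach_controller_forced_detaches{reason="out-of-service"}` counters listed above should increase, and the workload's replacement pods should reach `Running` on a surviving node.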
@@ -490,6 +506,13 @@ For GA, this section is required: approvers should be able to confirm the
 previous answers based on experience in the field.
 -->
 
+Without this feature, a user can forcefully delete the pods after they are
+in a terminating state, and new pods will be re-scheduled to another running
+node after 6 minutes. With this feature, new pods will be re-scheduled to
+another running node without the 6-minute wait once the user has applied
+the `out-of-service` taint. It speeds up failover but should not
+affect scalability.
+
 ###### Will enabling / using this feature result in any new API calls?
 
 <!--
@@ -560,6 +583,19 @@ This through this both in small and large cases, again with respect to the
 -->
 No.
 
+###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?
+
+<!--
+Focus not just on happy cases, but primarily on more pathological cases
+(e.g. probes taking a minute instead of milliseconds, failed pods consuming resources, etc.).
+If any of the resources can be exhausted, how is this mitigated with the existing limits
+(e.g. pods per node) or new limits added by this KEP?
+
+Are there any tests that were run/should be run to understand performance characteristics better
+and validate the declared limits?
+-->
+No.
+
 ### Troubleshooting
 
 <!--
@@ -648,6 +684,9 @@ For each of them, fill in the following information by copying the below templat
 - 2020-11-10: KEP updated to handle part of the node partitioning
 - 2021-08-26: The scope of the KEP is narrowed down to handle a real node shutdown. Test plan is updated. Node partitioning will be handled in the future and it can be built on top of this design.
 - 2021-12-03: Removed `SafeDetach` flag. Requires a user to add the `out-of-service` taint when he/she knows the node is shut down.
+- Kubernetes v1.24: moved to alpha.
+- Kubernetes v1.26: moved to beta.
+- Kubernetes v1.28: moved to stable.
 
 ## Alternatives
 
keps/sig-storage/2268-non-graceful-shutdown/kep.yaml

Lines changed: 12 additions & 3 deletions
@@ -20,19 +20,28 @@ see-also:
 replaces:
 
 # The target maturity stage in the current dev cycle for this KEP.
-stage: beta
+stage: stable
 
 # The most recent milestone for which work toward delivery of this KEP has been
 # done. This can be the current (upcoming) milestone, if it is being actively
 # worked on.
-latest-milestone: "v1.26"
+latest-milestone: "v1.28"
 
 # The milestone at which this feature was, or is targeted to be, at each stage.
 milestone:
   alpha: "v1.24"
   beta: "v1.26"
-  stable: "v1.27"
+  stable: "v1.28"
 
 # The following PRR answers are required at alpha release
 # List the feature gate name and the components for which it must be enabled
+feature-gates:
+  - name: NodeOutOfServiceVolumeDetach
+    components:
+      - kube-controller-manager
 disable-supported: true
+
+# The following PRR answers are required at beta release
+metrics:
+  - force_delete_pods_total{reason="out-of-service|terminated|orphaned|unscheduled"}
+  - force_delete_pod_errors_total{reason="out-of-service|terminated|orphaned|unscheduled"}
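
The `feature-gates` entry above records the gate name and the component it applies to. The gate only needs to be set explicitly on releases where it is not already on by default (it was alpha in v1.24, and beta gates such as this one are normally enabled by default from v1.26 onward). One way to set it on kube-controller-manager, assuming a kubeadm-managed control plane, is sketched below; clusters managed differently would pass the same `--feature-gates` flag to kube-controller-manager by other means:

```yaml
# Sketch, assuming a kubeadm-managed control plane (kubeadm v1beta3 API):
# pass the feature gate to kube-controller-manager where it is not already on by default.
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
controllerManager:
  extraArgs:
    feature-gates: "NodeOutOfServiceVolumeDetach=true"
```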
