| 1 | +--- |
| 2 | +layout: blog |
| 3 | +title: "Kubernetes 1.29: Decoupling taint-manager from node-lifecycle-controller" |
| 4 | +date: 2023-12-19 |
| 5 | +slug: kubernetes-1-29-taint-eviction-controller |
| 6 | +--- |
| 7 | + |
| 8 | +**Authors:** Yuan Chen (Apple), Andrea Tosatto (Apple) |
| 9 | + |
| 10 | +This blog post describes a new feature in Kubernetes 1.29 that improves the handling of taint-based pod eviction. |
| 11 | + |
| 12 | +## Background |
| 13 | + |
| 14 | +Kubernetes 1.29 improves how taint-based pod eviction is handled on nodes. |
| 15 | +This post describes the changes made to node-lifecycle-controller |
| 16 | +to separate its responsibilities and improve overall code maintainability. |
| 17 | + |
| 18 | +## Summary of changes |
| 19 | + |
| 20 | +node-lifecycle-controller previously combined two independent functions: |
| 21 | + |
| 22 | +- Adding a pre-defined set of `NoExecute` taints to a Node based on the Node's conditions. |
| 23 | +- Evicting Pods in response to a `NoExecute` taint. |
| 24 | + |
| 25 | +With the Kubernetes 1.29 release, the taint-based eviction implementation has been |
| 26 | +moved out of node-lifecycle-controller into a separate and independent component called taint-eviction-controller. |
| 27 | +This separation aims to disentangle code, enhance code maintainability, |
| 28 | +and facilitate future extensions to either component. |
| 29 | + |
| 30 | +As part of the change, additional metrics were introduced to help you monitor taint-based pod evictions: |
| 31 | + |
| 32 | +- `pod_deletion_duration_seconds` measures the latency between the time when a taint effect |
| 33 | +has been activated for the Pod and its deletion via taint-eviction-controller. |
| 34 | +- `pod_deletions_total` reports the total number of Pods deleted by taint-eviction-controller since its start. |
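
As a rough illustration of reading the counter above, here is a minimal Python sketch that sums samples from Prometheus-style text output. It assumes you have already obtained the kube-controller-manager metrics text by some means. The helper name `sum_counter` and the sample text are hypothetical. Because the exported metric name may carry a prefix that this post does not spell out, the sketch matches on the name suffix only, and it assumes sample lines carry no trailing timestamp.

```python
def sum_counter(metrics_text: str, name_suffix: str) -> float:
    """Sum all samples whose metric name ends with name_suffix."""
    total = 0.0
    for line in metrics_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line looks like "<name>{labels} <value>" or "<name> <value>".
        name_and_labels, _, value = line.rpartition(" ")
        metric_name = name_and_labels.split("{", 1)[0]
        if metric_name.endswith(name_suffix):
            total += float(value)
    return total


# Made-up sample text for illustration only; not real controller output.
SAMPLE = """\
# HELP example_pod_deletions_total Made-up sample for illustration.
# TYPE example_pod_deletions_total counter
example_pod_deletions_total 7
example_other_metric 3
"""

print(sum_counter(SAMPLE, "pod_deletions_total"))
```

In practice you would more likely query these metrics through your existing monitoring stack; the sketch only shows the shape of the data.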
| 35 | + |
| 36 | +## How to use the new feature |
| 37 | + |
| 38 | +A new feature gate, `SeparateTaintEvictionController`, has been added. The feature is enabled by default as Beta in Kubernetes 1.29. |
| 39 | +For details, refer to the [feature gates documentation](/docs/reference/command-line-tools-reference/feature-gates/). |
| 40 | + |
| 41 | + |
| 42 | +When this feature is enabled, you can optionally disable taint-based eviction by setting `--controllers=-taint-eviction-controller` |
| 43 | +on kube-controller-manager. |
| 44 | + |
| 45 | +To disable the new feature and use the old taint-manager within node-lifecycle-controller, you can set the feature gate `SeparateTaintEvictionController=false`. |
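
The options in this section can be summarized with a small Python sketch that maps each choice to the kube-controller-manager setting quoted in this post. The helper name `controller_manager_flags` and its parameters are hypothetical. The flag strings are copied verbatim from the text; how they combine with any `--controllers` or `--feature-gates` values you already pass is not covered here and should be checked against your own configuration.

```python
from typing import List


def controller_manager_flags(separate_controller: bool = True,
                             taint_eviction: bool = True) -> List[str]:
    """Return the extra kube-controller-manager flags for a given choice.

    separate_controller: use the new taint-eviction-controller (the 1.29 default).
    taint_eviction: keep taint-based eviction on; only meaningful when
        separate_controller is True.
    """
    flags: List[str] = []
    if not separate_controller:
        # Fall back to the legacy taint-manager inside node-lifecycle-controller.
        flags.append("--feature-gates=SeparateTaintEvictionController=false")
    elif not taint_eviction:
        # Flag value quoted verbatim from this post.
        flags.append("--controllers=-taint-eviction-controller")
    return flags


print(controller_manager_flags())                           # default: no extra flags
print(controller_manager_flags(taint_eviction=False))       # disable taint-based eviction
print(controller_manager_flags(separate_controller=False))  # legacy behavior
```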
| 46 | + |
| 47 | +## Use cases |
| 48 | + |
| 49 | +This feature allows cluster administrators to extend or enhance the default |
| 50 | +taint-eviction-controller, or even replace it with a |
| 51 | +custom implementation to meet different needs. One example is better support for |
| 52 | +stateful workloads that use PersistentVolumes on local disks. |
| 53 | + |
| 54 | +## FAQ |
| 55 | + |
| 56 | +**Does this feature change the existing behavior of taint-based pod evictions?** |
| 57 | + |
| 58 | +No, the taint-based pod eviction behavior remains unchanged. If the feature gate |
| 59 | +`SeparateTaintEvictionController` is turned off, the legacy node-lifecycle-controller with taint-manager will continue to be used. |
| 60 | + |
| 61 | +**Will enabling/using this feature result in an increase in the time taken by any operations covered by existing SLIs/SLOs?** |
| 62 | + |
| 63 | +No. |
| 64 | + |
| 65 | +**Will enabling/using this feature result in an increase in resource usage (CPU, RAM, disk, IO, ...)?** |
| 66 | + |
| 67 | +The increase in resource usage by running a separate `taint-eviction-controller` will be negligible. |
| 68 | + |
| 69 | +## Learn more |
| 70 | + |
| 71 | +For more details, refer to the [KEP](http://kep.k8s.io/3902). |
| 72 | + |
| 73 | +## Acknowledgments |
| 74 | + |
| 75 | +As with any Kubernetes feature, multiple community members have contributed, from |
| 76 | +writing the KEP to implementing the new controller and reviewing the KEP and code. Special thanks to: |
| 77 | + |
| 78 | +- Aldo Culquicondor (@alculquicondor) |
| 79 | +- Maciej Szulik (@soltysh) |
| 80 | +- Filip Křepinský (@atiratree) |
| 81 | +- Han Kang (@logicalhan) |
| 82 | +- Wei Huang (@Huang-Wei) |
| 83 | +- Sergey Kanzhelev (@SergeyKanzhelev) |
| 84 | +- Ravi Gudimetla (@ravisantoshgudimetla) |
| 85 | +- Deep Debroy (@ddebroy) |