
Commit 751ec13

Merge pull request #43676 from yuanchen8911/taintmanager
Add a blog post about decoupled taint eviction controller
2 parents 8d6a481 + fe882f1 commit 751ec13

1 file changed

+85
-0
lines changed
@@ -0,0 +1,85 @@
---
layout: blog
title: "Kubernetes 1.29: Decoupling taint-manager from node-lifecycle-controller"
date: 2023-12-19
slug: kubernetes-1-29-taint-eviction-controller
---

**Authors:** Yuan Chen (Apple), Andrea Tosatto (Apple)

This blog post discusses a new feature in Kubernetes 1.29 that improves the handling of taint-based pod eviction.

## Background

Kubernetes 1.29 improves how taint-based pod eviction is handled on nodes.
This post describes the changes made to node-lifecycle-controller
to separate its responsibilities and improve overall code maintainability.

## Summary of changes

node-lifecycle-controller previously combined two independent functions:

- Adding a pre-defined set of `NoExecute` taints to a Node based on the Node's condition.
- Evicting Pods in response to `NoExecute` taints.

With the Kubernetes 1.29 release, the taint-based eviction implementation has been
moved out of node-lifecycle-controller into a separate and independent component called taint-eviction-controller.
This separation aims to disentangle the code, improve maintainability,
and facilitate future extensions to either component.

As part of the change, additional metrics were introduced to help you monitor taint-based pod evictions:

- `pod_deletion_duration_seconds` measures the latency between the time when a taint effect
  has been activated for the Pod and its deletion via taint-eviction-controller.
- `pod_deletions_total` reports the total number of Pods deleted by taint-eviction-controller since its start.
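
If you scrape kube-controller-manager metrics yourself, a small helper like the following Go sketch could total the samples of one of these metrics from Prometheus text exposition output. It is only an illustration: `sumSamples` is a hypothetical helper, the scrape text in `main` is made up, and the `taint_eviction_controller_` prefix on the metric name is an assumption that you should check against your cluster's `/metrics` output.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// sumSamples adds up all samples of one metric in Prometheus text exposition
// format. It is deliberately minimal: it assumes label values contain no
// spaces and that samples carry no trailing timestamp.
func sumSamples(exposition, metric string) (float64, error) {
	total := 0.0
	sc := bufio.NewScanner(strings.NewReader(exposition))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 2 || strings.HasPrefix(fields[0], "#") {
			continue // skip blanks, comments, and lines this sketch cannot parse
		}
		name, _, _ := strings.Cut(fields[0], "{") // drop any label set
		if name != metric {
			continue
		}
		v, err := strconv.ParseFloat(fields[1], 64)
		if err != nil {
			return 0, fmt.Errorf("parsing %q: %w", sc.Text(), err)
		}
		total += v
	}
	return total, sc.Err()
}

func main() {
	// Made-up sample scrape; the fully qualified metric name is an assumption.
	sample := `# TYPE taint_eviction_controller_pod_deletions_total counter
taint_eviction_controller_pod_deletions_total 3
`
	total, err := sumSamples(sample, "taint_eviction_controller_pod_deletions_total")
	if err != nil {
		panic(err)
	}
	fmt.Println(total)
}
```

In practice a monitoring system such as Prometheus would do this parsing for you; the sketch only shows what the counter looks like on the wire.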

## How to use the new feature?

A new feature gate, `SeparateTaintEvictionController`, has been added. The feature is enabled by default as Beta in Kubernetes 1.29.
For details, refer to the [feature gate document](/docs/reference/command-line-tools-reference/feature-gates/).

When this feature is enabled, you can optionally disable taint-based eviction by setting `--controllers=-taint-eviction-controller`
in kube-controller-manager.
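
As a rough illustration of how an entry such as `-taint-eviction-controller` is interpreted, the Go sketch below models the documented `--controllers` semantics: `*` enables all on-by-default controllers, `foo` enables the controller named `foo`, and `-foo` disables it. The function `isControllerEnabled` and its parameters are hypothetical and written for this post; they are not the kube-controller-manager source.

```go
package main

import "fmt"

// isControllerEnabled is a simplified, hypothetical model of how a
// --controllers list is resolved for a single controller name.
// disabledByDefault lists controllers that "*" does not switch on.
func isControllerEnabled(name string, controllers []string, disabledByDefault map[string]bool) bool {
	hasStar := false
	for _, c := range controllers {
		switch c {
		case name:
			return true // explicitly enabled
		case "-" + name:
			return false // explicitly disabled
		case "*":
			hasStar = true
		}
	}
	if !hasStar {
		return false // without "*", only explicitly named controllers run
	}
	return !disabledByDefault[name]
}

func main() {
	const tec = "taint-eviction-controller"
	// Default list: everything that is on by default, including tec.
	fmt.Println(isControllerEnabled(tec, []string{"*"}, nil))
	// "*" plus an explicit opt-out: tec is off, other controllers stay on.
	fmt.Println(isControllerEnabled(tec, []string{"*", "-" + tec}, nil))
	fmt.Println(isControllerEnabled("node-lifecycle-controller", []string{"*", "-" + tec}, nil))
}
```

Under this model, a list that does not contain `*` enables only the controllers it names explicitly, so a disable entry is usually combined with `*`, as in `--controllers=*,-taint-eviction-controller`. Treat that as an assumption to verify against the kube-controller-manager reference for your version.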

To disable the new feature and use the old taint-manager within node-lifecycle-controller, set the feature gate `SeparateTaintEvictionController=false`.

## Use cases

This new feature allows cluster administrators to extend and enhance the default
taint-eviction-controller, or even replace it with a
custom implementation to meet different needs. One example is better support for
stateful workloads that use PersistentVolumes on local disks.

## FAQ

**Does this feature change the existing behavior of taint-based pod evictions?**

No, the taint-based pod eviction behavior remains unchanged. If the feature gate
`SeparateTaintEvictionController` is turned off, the legacy node-lifecycle-controller with taint-manager continues to be used.

**Will enabling/using this feature result in an increase in the time taken by any operations covered by existing SLIs/SLOs?**

No.

**Will enabling/using this feature result in an increase in resource usage (CPU, RAM, disk, IO, ...)?**

The increase in resource usage from running a separate `taint-eviction-controller` will be negligible.

## Learn more

For more details, refer to the [KEP](http://kep.k8s.io/3902).

## Acknowledgments

As with any Kubernetes feature, multiple community members have contributed, from
writing the KEP to implementing the new controller and reviewing the KEP and code. Special thanks to:

- Aldo Culquicondor (@alculquicondor)
- Maciej Szulik (@soltysh)
- Filip Křepinský (@atiratree)
- Han Kang (@logicalhan)
- Wei Huang (@Huang-Wei)
- Sergey Kanzhelev (@SergeyKanzhelev)
- Ravi Gudimetla (@ravisantoshgudimetla)
- Deep Debroy (@ddebroy)
