Commit 07fd705

[zh] Add 2025-05-02-mutable-csi-node-allocatable.md
1 parent 08b0cd7 commit 07fd705

1 file changed

+156
-0
lines changed

@@ -0,0 +1,156 @@
---
layout: blog
title: "Kubernetes v1.33: Mutable CSI Node Allocatable Count"
date: 2025-05-02T10:30:00-08:00
slug: kubernetes-1-33-mutable-csi-node-allocatable-count
author: Eddie Torres (Amazon Web Services)
translator: Michael Yao (DaoCloud)
---
<!--
layout: blog
title: "Kubernetes v1.33: Mutable CSI Node Allocatable Count"
date: 2025-05-02T10:30:00-08:00
slug: kubernetes-1-33-mutable-csi-node-allocatable-count
author: Eddie Torres (Amazon Web Services)
-->

<!--
Scheduling stateful applications reliably depends heavily on accurate information about resource availability on nodes.
Kubernetes v1.33 introduces an alpha feature called *mutable CSI node allocatable count*, allowing Container Storage Interface (CSI) drivers to dynamically update the reported maximum number of volumes that a node can handle.
This capability significantly enhances the accuracy of pod scheduling decisions and reduces scheduling failures caused by outdated volume capacity information.
-->
Reliable scheduling of stateful applications depends heavily on accurate information about resource availability on nodes.
Kubernetes v1.33 introduces an alpha feature called **mutable CSI node allocatable count**, which lets Container Storage Interface (CSI) drivers dynamically update the maximum number of volumes that a node is reported to handle.
This improves the accuracy of Pod scheduling decisions and reduces scheduling failures caused by stale volume capacity information.

<!--
## Background

Traditionally, Kubernetes CSI drivers report a static maximum volume attachment limit when initializing. However, actual attachment capacities can change during a node's lifecycle for various reasons, such as:

- Manual or external operations attaching/detaching volumes outside of Kubernetes control.
- Dynamically attached network interfaces or specialized hardware (GPUs, NICs, etc.) consuming available slots.
- Multi-driver scenarios, where one CSI driver’s operations affect available capacity reported by another.
-->

## Background {#background}

Traditionally, CSI drivers in Kubernetes report a static maximum volume attachment limit when they initialize.
However, the actual attachment capacity can change during a node's lifecycle for several reasons, such as:

- Manual or external operations that attach or detach volumes outside of Kubernetes' control.
- Dynamically attached network interfaces or specialized hardware (GPUs, NICs, and so on) that consume available slots.
- Multi-driver scenarios, where one CSI driver's operations affect the available capacity reported by another.

<!--
Static reporting can cause Kubernetes to schedule pods onto nodes that appear to have capacity but don't, leading to pods stuck in a `ContainerCreating` state.

## Dynamically adapting CSI volume limits

With the new feature gate `MutableCSINodeAllocatableCount`, Kubernetes enables CSI drivers to dynamically adjust and report node attachment capacities at runtime. This ensures that the scheduler has the most accurate, up-to-date view of node capacity.
-->
Static reporting can cause Kubernetes to schedule Pods onto nodes that appear to have capacity but actually do not,
which leaves those Pods stuck in the `ContainerCreating` state for a long time.

## Dynamically adapting CSI volume limits {#dynamically-adapting-csi-volume-limits}

With the new feature gate `MutableCSINodeAllocatableCount`, Kubernetes lets CSI
drivers adjust and report a node's attachment capacity at runtime. This gives the scheduler the most accurate, up-to-date view of node capacity.
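
For orientation, the per-node volume limit that a driver reports is published on the node's `CSINode` object, under `spec.drivers[].allocatable.count`. The sketch below shows roughly what such an object could look like. The node name, node ID, topology key, and count are made-up values for illustration, and the driver name is a placeholder.

```yaml
# Illustrative only: node name, nodeID, topology key, and count are hypothetical.
apiVersion: storage.k8s.io/v1
kind: CSINode
metadata:
  name: node-1
spec:
  drivers:
  - name: example.csi.k8s.io
    nodeID: node-1-id
    topologyKeys:
    - topology.example.csi.k8s.io/zone
    allocatable:
      # Maximum number of volumes this driver reports for the node.
      # With the feature gate enabled, this value can be updated at runtime.
      count: 25
```

Running `kubectl get csinode node-1 -o yaml` (with your own node name) is one way to check the value currently reported.
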

<!--
### How it works

When this feature is enabled, Kubernetes supports two mechanisms for updating the reported node volume limits:

- **Periodic Updates:** CSI drivers specify an interval to periodically refresh the node's allocatable capacity.
- **Reactive Updates:** An immediate update triggered when a volume attachment fails due to exhausted resources (`ResourceExhausted` error).
-->
### How it works {#how-it-works}

When this feature is enabled, Kubernetes supports two mechanisms for updating the reported node volume limits:

- **Periodic updates:** The CSI driver specifies an interval at which the node's allocatable capacity is refreshed.
- **Reactive updates:** An update is triggered immediately when a volume attachment fails because resources are exhausted (a `ResourceExhausted` error).

<!--
### Enabling the feature

To use this alpha feature, you must enable the `MutableCSINodeAllocatableCount` feature gate in these components:
-->
### Enabling the feature {#enabling-the-feature}

To use this alpha feature, you must enable the `MutableCSINodeAllocatableCount` feature gate in these components:

- `kube-apiserver`
- `kubelet`
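
How the gate is switched on depends on how the cluster is deployed. As a rough sketch, assuming a kubeadm-style cluster where the kubelet reads a `KubeletConfiguration` file and `kube-apiserver` runs as a static Pod, the relevant fragments might look like the following. The file layout and the surrounding fields are assumptions; only the gate name comes from this post.

```yaml
# Fragment of the kubelet configuration file (assumed to be a KubeletConfiguration).
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  MutableCSINodeAllocatableCount: true
---
# Fragment of a kube-apiserver static Pod manifest (assumed kubeadm-style layout).
# The image and all other flags and fields are omitted for brevity.
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - name: kube-apiserver
    command:
    - kube-apiserver
    - --feature-gates=MutableCSINodeAllocatableCount=true
```

Both components need the gate. If your distribution manages these settings for you, use its own mechanism instead of editing the files by hand.
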

<!--
### Example CSI driver configuration

Below is an example of configuring a CSI driver to enable periodic updates every 60 seconds:
-->
### Example CSI driver configuration {#example-csi-driver-configuration}

Below is an example of configuring a CSI driver for periodic updates every 60 seconds:

```yaml
apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: example.csi.k8s.io
spec:
  nodeAllocatableUpdatePeriodSeconds: 60
```

<!--
This configuration directs Kubelet to periodically call the CSI driver's `NodeGetInfo` method every 60 seconds, updating the node’s allocatable volume count. Kubernetes enforces a minimum update interval of 10 seconds to balance accuracy and resource usage.
-->
This configuration directs the kubelet to call the CSI driver's `NodeGetInfo` method every 60 seconds and update the node's allocatable volume count with the result.
Kubernetes enforces a minimum update interval of 10 seconds to balance accuracy against resource usage.
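
To illustrate the lower bound, the variant below uses the smallest interval mentioned above. The driver name is the same placeholder as in the earlier example. Going by the stated minimum, a smaller value would not be expected to pass validation.

```yaml
apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: example.csi.k8s.io
spec:
  # 10 is the minimum interval stated above; a smaller value such as 5
  # would be expected to be rejected.
  nodeAllocatableUpdatePeriodSeconds: 10
```

A short interval keeps the count fresher at the cost of more frequent `NodeGetInfo` calls, so a value near the minimum is probably only worthwhile where attachment capacity changes often.
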

<!--
### Immediate updates on attachment failures

In addition to periodic updates, Kubernetes now reacts to attachment failures. Specifically, if a volume attachment fails with a `ResourceExhausted` error (gRPC code `8`), an immediate update is triggered to correct the allocatable count promptly.

This proactive correction prevents repeated scheduling errors and helps maintain cluster health.
-->
### Immediate updates on attachment failures {#immediate-updates-on-attachment-failures}

In addition to periodic updates, Kubernetes now reacts to attachment failures.
Specifically, if a volume attachment fails with a `ResourceExhausted` error (gRPC code `8`), an update is triggered immediately so that the allocatable count is corrected without waiting for the next periodic refresh.

Correcting the count right away prevents repeated scheduling errors and helps keep the cluster healthy.

<!--
## Getting started

To experiment with mutable CSI node allocatable count in your Kubernetes v1.33 cluster:

1. Enable the feature gate `MutableCSINodeAllocatableCount` on the `kube-apiserver` and `kubelet` components.
2. Update your CSI driver configuration by setting `nodeAllocatableUpdatePeriodSeconds`.
3. Monitor and observe improvements in scheduling accuracy and pod placement reliability.
-->
## Getting started {#getting-started}

To try out mutable CSI node allocatable count in your Kubernetes v1.33 cluster:

1. Enable the feature gate `MutableCSINodeAllocatableCount` on the `kube-apiserver` and `kubelet` components.
2. Set `nodeAllocatableUpdatePeriodSeconds` in your CSI driver configuration.
3. Monitor the cluster and observe how scheduling accuracy and Pod placement reliability improve.

<!--
## Next steps

This feature is currently in alpha and the Kubernetes community welcomes your feedback. Test it, share your experiences, and help guide its evolution toward beta and GA stability.

Join discussions in the [Kubernetes Storage Special Interest Group (SIG-Storage)](https://github.com/kubernetes/community/tree/master/sig-storage) to shape the future of Kubernetes storage capabilities.
-->
## Next steps {#next-steps}

This feature is currently in alpha, and the Kubernetes community welcomes your feedback.
Testing it and sharing your experience both help move the feature toward beta and GA (general availability).

Join the discussions in [Kubernetes SIG-Storage](https://github.com/kubernetes/community/tree/master/sig-storage)
to help shape the future of Kubernetes storage capabilities.
