Commit b50c772

Merge pull request #34316 from ydFu/troubleshooting-cni-plugin-related-errors
[zh] Add troubleshooting-cni-plugin-related-errors
2 parents 7c6247d + f182a6e commit b50c772

1 file changed: 263 additions & 0 deletions
@@ -0,0 +1,263 @@
---
title: Troubleshooting CNI plugin-related errors
content_type: task
weight: 10
---
<!--
title: Troubleshooting CNI plugin-related errors
content_type: task
reviewers:
- mikebrow
- divya-mohan0209
weight: 10
-->

<!-- overview -->

<!--
To avoid CNI plugin-related errors, verify that you are using or upgrading to a
container runtime that has been tested to work correctly with your version of
Kubernetes.
-->

To avoid CNI plugin-related errors, verify that you are using, or upgrading to, a
container runtime that has been tested to work correctly with your version of Kubernetes.

<!--
For example, the following container runtimes are being prepared, or have already been prepared, for Kubernetes v1.24:

* containerd v1.6.4 and later, v1.5.11 and later
* CRI-O v1.24.0 and later
-->

For example, the following container runtimes are being prepared, or have already been prepared, for Kubernetes v1.24:

* containerd v1.6.4 and later, v1.5.11 and later
* CRI-O v1.24.0 and later

<!--
## About the "Incompatible CNI versions" and "Failed to destroy network for sandbox" errors
-->

## About the "Incompatible CNI versions" and "Failed to destroy network for sandbox" errors {#about-the-incompatible-cni-versions-and-failed-to-destroy-network-for-sandbox-errors}

<!--
Service issues exist for pod CNI network setup and tear down in containerd
v1.6.0-v1.6.3 when the CNI plugins have not been upgraded and/or the CNI config
version is not declared in the CNI config files. The containerd team reports, "these issues are resolved in containerd v1.6.4."
With containerd v1.6.0-v1.6.3, if you do not upgrade the CNI plugins and/or
declare the CNI config version, you might encounter the following "Incompatible
CNI versions" or "Failed to destroy network for sandbox" error conditions.
-->

In containerd v1.6.0-v1.6.3, setting up or tearing down the pod CNI network runs into
service issues when the CNI plugins have not been upgraded and/or the CNI config
version is not declared in the CNI config files. The containerd team reports,
"these issues are resolved in containerd v1.6.4."

With containerd v1.6.0-v1.6.3, if you do not upgrade the CNI plugins and/or declare
the CNI config version, you might encounter the following "Incompatible CNI versions"
or "Failed to destroy network for sandbox" error conditions.

<!--
### Incompatible CNI versions error
-->

### Incompatible CNI versions error {#incompatible-cni-versions-error}

<!--
If the version of your CNI plugin does not correctly match the plugin version in
the config because the config version is later than the plugin version, the
containerd log will likely show an error message on startup of a pod similar
to:
-->

If the config version is later than the plugin version, so that the version of your
CNI plugin does not match the plugin version in the config, the containerd log will
likely show an error message similar to the following when a pod starts:

```
incompatible CNI versions; config is \"1.0.0\", plugin supports [\"0.1.0\" \"0.2.0\" \"0.3.0\" \"0.3.1\" \"0.4.0\"]"
```

<!--
To fix this issue, [update your CNI plugins and CNI config files](#updating-your-cni-plugins-and-cni-config-files).
-->

To fix this issue, [update your CNI plugins and CNI config files](#updating-your-cni-plugins-and-cni-config-files).

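One way to confirm such a mismatch is to compare the `cniVersion` declared in a config file with the versions that a plugin binary reports. The sketch below assumes that the plugin answers the CNI `VERSION` command with a JSON document listing its supported versions, as the CNI specification describes. The function name `cni_version_supported` and the `/opt/cni/bin` path are illustrative assumptions and are not part of the original page.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed if CONFIG_VERSION appears, as a quoted string,
# in the JSON that a CNI plugin prints for the VERSION command.
#   $1: cniVersion declared in the CNI config, for example 1.0.0
#   $2: JSON reported by the plugin
cni_version_supported() {
  local config_version="$1" version_json="$2"
  case "$version_json" in
    # The surrounding quotes keep "1.0.0" from matching inside "0.1.0".
    *"\"${config_version}\""*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example use on a node (the plugin path is an assumption; adjust it):
#   json="$(CNI_COMMAND=VERSION /opt/cni/bin/bridge </dev/null)"
#   cni_version_supported "1.0.0" "$json" || echo "plugin does not support this config version"
```

This is a plain text match rather than a JSON parser, which keeps the sketch free of extra tools; a stricter check could parse the `supportedVersions` array instead.
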
<!--
### Failed to destroy network for sandbox error
-->

### Failed to destroy network for sandbox error {#failed-to-destroy-network-for-sandbox-error}

<!--
If the version of the plugin is missing in the CNI plugin config, the pod may
run. However, stopping the pod generates an error similar to:
-->

If the version of the plugin is missing in the CNI plugin config, the pod may
run. However, stopping the pod generates an error similar to:

```
ERRO[2022-04-26T00:43:24.518165483Z] StopPodSandbox for "b" failed
error="failed to destroy network for sandbox \"bbc85f891eaf060c5a879e27bba9b6b06450210161dfdecfbb2732959fb6500a\": invalid version \"\": the version is empty"
```

<!--
This error leaves the pod in the not-ready state with a network namespace still
attached. To recover from this problem, [edit the CNI config file](#updating-your-cni-plugins-and-cni-config-files) to add
the missing version information. The next attempt to stop the pod should
be successful.
-->

This error leaves the pod in the not-ready state with a network namespace still
attached. To recover from this problem, [edit the CNI config file](#updating-your-cni-plugins-and-cni-config-files)
to add the missing version information. The next attempt to stop the pod should
be successful.

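To find config files that could trigger this error, you can look for files that never declare `cniVersion`. The following is a minimal sketch, assuming that the configs live in a directory such as `/etc/cni/net.d` (the path used in the example later on this page) and use the `.conf`, `.conflist`, or `.json` extensions. The function name is hypothetical, and the check is a text match, not a JSON parser.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print every CNI config file in a directory that does
# not contain a "cniVersion" key.
#   $1: directory to scan (defaults to /etc/cni/net.d, which is an assumption)
list_configs_missing_cniversion() {
  local dir="${1:-/etc/cni/net.d}" f
  for f in "$dir"/*.conf "$dir"/*.conflist "$dir"/*.json; do
    [ -f "$f" ] || continue   # skip glob patterns that matched nothing
    grep -q '"cniVersion"' "$f" || printf '%s\n' "$f"
  done
  return 0
}
```

Running `list_configs_missing_cniversion /etc/cni/net.d` on a node prints one path per file that needs a version added, and prints nothing when every file declares one.
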
<!--
### Updating your CNI plugins and CNI config files
-->

### Updating your CNI plugins and CNI config files {#updating-your-cni-plugins-and-cni-config-files}

<!--
If you're using containerd v1.6.0-v1.6.3 and encountered "Incompatible CNI
versions" or "Failed to destroy network for sandbox" errors, consider updating
your CNI plugins and editing the CNI config files.

Here's an overview of the typical steps for each node:
-->

If you're using containerd v1.6.0-v1.6.3 and encountered "Incompatible CNI
versions" or "Failed to destroy network for sandbox" errors, consider updating
your CNI plugins and editing the CNI config files.

Here's an overview of the typical steps for each node:

<!--
1. [Safely drain and cordon the
   node](/docs/tasks/administer-cluster/safely-drain-node/).
-->

1. [Safely drain and cordon the node](/zh-cn/docs/tasks/administer-cluster/safely-drain-node/).

<!--
2. After stopping your container runtime and kubelet services, perform the
   following upgrade operations:
   - If you're running CNI plugins, upgrade them to the latest version.
   - If you're using non-CNI plugins, replace them with CNI plugins. Use the
     latest version of the plugins.
   - Update the plugin configuration file to specify or match a version of the
     CNI specification that the plugin supports, as shown in the following ["An
     example containerd configuration
     file"](#an-example-containerd-configuration-file) section.
   - For `containerd`, ensure that you have installed the latest version (v1.0.0
     or later) of the CNI loopback plugin.
   - Upgrade node components (for example, the kubelet) to Kubernetes v1.24
   - Upgrade to or install the most current version of the container runtime.
-->

2. After stopping your container runtime and kubelet services, perform the
   following upgrade operations:
   - If you're running CNI plugins, upgrade them to the latest version.
   - If you're using non-CNI plugins, replace them with CNI plugins. Use the
     latest version of the plugins.
   - Update the plugin configuration file to specify or match a version of the
     CNI specification that the plugin supports, as shown in the
     ["An example containerd configuration file"](#an-example-containerd-configuration-file)
     section later on this page.
   - For `containerd`, ensure that you have installed the latest version (v1.0.0
     or later) of the CNI loopback plugin.
   - Upgrade node components (for example, the kubelet) to Kubernetes v1.24.
   - Upgrade to or install the most current version of the container runtime.

<!--
3. Bring the node back into your cluster by restarting your container runtime
   and kubelet. Uncordon the node (`kubectl uncordon <nodename>`).
-->

3. Bring the node back into your cluster by restarting your container runtime
   and kubelet. Uncordon the node (`kubectl uncordon <nodename>`).

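The steps above can be sketched as a per-node outline. It rests on several assumptions that the original page does not state: the services are systemd units named `kubelet` and `containerd`, the shell has both `kubectl` access to the cluster and root access on the node, and the stop and start order shown is one reasonable choice. The `run` and `upgrade_node_cni` helpers are hypothetical, and by default they only print the commands.

```bash
#!/usr/bin/env bash
# Print each command instead of running it, unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical outline of the per-node procedure.
#   $1: node name as shown by `kubectl get nodes`
upgrade_node_cni() {
  local node="$1"
  # Step 1: drain the node; draining also cordons it.
  run kubectl drain "$node" --ignore-daemonsets
  # Step 2: stop the kubelet and the container runtime before upgrading.
  run systemctl stop kubelet
  run systemctl stop containerd
  echo "# upgrade the CNI plugins, CNI config files, kubelet, and container runtime here"
  # Step 3: restart the services and make the node schedulable again.
  run systemctl start containerd
  run systemctl start kubelet
  run kubectl uncordon "$node"
}
```

Calling `upgrade_node_cni <nodename>` prints the command sequence for review. The upgrade operations themselves are left as a placeholder because they depend on how the plugins and runtime were installed.
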
<!--
## An example containerd configuration file
-->

## An example containerd configuration file {#an-example-containerd-configuration-file}

<!--
The following example shows a configuration for `containerd` runtime v1.6.x,
which supports a recent version of the CNI specification (v1.0.0).
Please see the documentation from your plugin and networking provider for
further instructions on configuring your system.
-->

The following example shows a configuration for `containerd` runtime v1.6.x,
which supports a recent version of the CNI specification (v1.0.0).
Please see the documentation from your plugin and networking provider for
further instructions on configuring your system.

<!--
On Kubernetes, containerd runtime adds a loopback interface, `lo`, to pods as a
default behavior. The containerd runtime configures the loopback interface via a
CNI plugin, `loopback`. The `loopback` plugin is distributed as part of the
`containerd` release packages that have the `cni` designation. `containerd`
v1.6.0 and later includes a CNI v1.0.0-compatible loopback plugin as well as
other default CNI plugins. The configuration for the loopback plugin is done
internally by containerd, and is set to use CNI v1.0.0. This also means that the
version of the `loopback` plugin must be v1.0.0 or later when this newer version
`containerd` is started.
-->

On Kubernetes, the containerd runtime adds a loopback interface, `lo`, to pods as a
default behavior. The containerd runtime configures the loopback interface via a
CNI plugin, `loopback`. The `loopback` plugin is distributed as part of the
`containerd` release packages that have the `cni` designation. `containerd`
v1.6.0 and later includes a CNI v1.0.0-compatible loopback plugin as well as
other default CNI plugins. The configuration for the loopback plugin is done
internally by containerd, and is set to use CNI v1.0.0. This also means that the
version of the `loopback` plugin must be v1.0.0 or later when this newer version
of `containerd` is started.

<!--
The following bash command generates an example CNI config. Here, the 1.0.0
value for the config version is assigned to the `cniVersion` field for use when
`containerd` invokes the CNI bridge plugin.
-->

The following bash command generates an example CNI config. Here, the 1.0.0
value for the config version is assigned to the `cniVersion` field for use when
`containerd` invokes the CNI bridge plugin.

```bash
cat << EOF | tee /etc/cni/net.d/10-containerd-net.conflist
{
  "cniVersion": "1.0.0",
  "name": "containerd-net",
  "plugins": [
    {
      "type": "bridge",
      "bridge": "cni0",
      "isGateway": true,
      "ipMasq": true,
      "promiscMode": true,
      "ipam": {
        "type": "host-local",
        "ranges": [
          [{
            "subnet": "10.88.0.0/16"
          }],
          [{
            "subnet": "2001:db8:4860::/64"
          }]
        ],
        "routes": [
          { "dst": "0.0.0.0/0" },
          { "dst": "::/0" }
        ]
      }
    },
    {
      "type": "portmap",
      "capabilities": {"portMappings": true}
    }
  ]
}
EOF
```

<!--
Update the IP address ranges in the preceding example with ones that are based
on your use case and network addressing plan.
-->

Update the IP address ranges in the preceding example with ones that are based
on your use case and network addressing plan.

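After editing the file, it can help to print the values you changed and review them. The following is a minimal sketch, assuming that the config keeps one key per line as in the preceding example. The function name `show_conflist_summary` is hypothetical, and the `sed` patterns are a text match rather than a JSON parser.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the cniVersion and every subnet found in a CNI
# config file, one value per line.
#   $1: path to the config, for example /etc/cni/net.d/10-containerd-net.conflist
show_conflist_summary() {
  local file="$1"
  # Each sed call prints only the lines where the key was found.
  sed -n 's/.*"cniVersion"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/cniVersion=\1/p' "$file"
  sed -n 's/.*"subnet"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/subnet=\1/p' "$file"
}
```

An empty result for `cniVersion` means the file does not declare a config version, which is the condition behind the "Failed to destroy network for sandbox" error described earlier.
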