| 1 | +--- |
| 2 | +title: Troubleshooting CNI plugin-related errors |
| 3 | +content_type: task |
| 4 | +weight: 10 |
| 5 | +--- |
| 6 | +<!-- |
| 7 | +title: Troubleshooting CNI plugin-related errors |
| 8 | +content_type: task |
| 9 | +reviewers: |
| 10 | +- mikebrow |
| 11 | +- divya-mohan0209 |
| 12 | +weight: 10 |
| 13 | +--> |
| 14 | + |
| 15 | +<!-- overview --> |
| 16 | + |
| 17 | +<!-- |
| 18 | +To avoid CNI plugin-related errors, verify that you are using or upgrading to a |
| 19 | +container runtime that has been tested to work correctly with your version of |
| 20 | +Kubernetes. |
| 21 | +--> |
| 22 | + |
| 23 | +To avoid CNI plugin-related errors, verify that the container runtime you are using, or upgrading to, |
| 24 | +has been tested to work correctly with your version of Kubernetes. |
| 25 | + |
| 26 | +<!-- |
| 27 | +For example, the following container runtimes are being prepared, or have already been prepared, for Kubernetes v1.24: |
| 28 | + |
| 29 | +* containerd v1.6.4 and later, v1.5.11 and later |
| 30 | +* The CRI-O v1.24.0 and later |
| 31 | +--> |
| 32 | + |
| 33 | +For example, the following container runtimes are being prepared, or have already been prepared, for Kubernetes v1.24: |
| 34 | + |
| 35 | +* containerd v1.6.4 and later, v1.5.11 and later |
| 36 | +* CRI-O v1.24.0 and later |
| 37 | + |
| 38 | +<!-- |
| 39 | +## About the "Incompatible CNI versions" and "Failed to destroy network for sandbox" errors |
| 40 | +--> |
| 41 | + |
| 42 | +## About the "Incompatible CNI versions" and "Failed to destroy network for sandbox" errors {#about-the-incompatible-cni-versions-and-failed-to-destroy-network-for-sandbox-errors} |
| 43 | + |
| 44 | +<!-- |
| 45 | +Service issues exist for pod CNI network setup and tear down in containerd |
| 46 | +v1.6.0-v1.6.3 when the CNI plugins have not been upgraded and/or the CNI config |
| 47 | +version is not declared in the CNI config files. The containerd team reports, "these issues are resolved in containerd v1.6.4." |
| 48 | +With containerd v1.6.0-v1.6.3, if you do not upgrade the CNI plugins and/or |
| 49 | +declare the CNI config version, you might encounter the following "Incompatible |
| 50 | +CNI versions" or "Failed to destroy network for sandbox" error conditions. |
| 51 | +--> |
| 52 | + |
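| 53 | +In containerd v1.6.0-v1.6.3, service issues occur during Pod CNI network setup and teardown when the CNI plugins have not been upgraded and/or |
| 54 | +the CNI config version is not declared in the CNI config files. The containerd team reports: |
| 55 | +"these issues are resolved in containerd v1.6.4." |
| 56 | + |
| 57 | +With containerd v1.6.0-v1.6.3, if you do not upgrade the CNI plugins and/or declare the CNI config version, |
| 58 | +you might encounter the following "Incompatible CNI versions" or "Failed to destroy network for sandbox" |
| 59 | +error conditions. |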
| 60 | + |
| 61 | +<!-- |
| 62 | +### Incompatible CNI versions error |
| 63 | +--> |
| 64 | + |
| 65 | +### Incompatible CNI versions error {#incompatible-cni-versions-error} |
| 66 | + |
| 67 | +<!-- |
| 68 | +If the version of your CNI plugin does not correctly match the plugin version in |
| 69 | +the config because the config version is later than the plugin version, the |
| 70 | +containerd log will likely show an error message on startup of a pod similar |
| 71 | +to: |
| 72 | +--> |
| 73 | + |
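| 74 | +If the CNI version declared in the config is later than any version your CNI plugin supports, the two do not match, |
| 75 | +and the containerd log will likely show an error message similar to the following when a Pod starts: |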
| 76 | + |
| 77 | +``` |
| 78 | +incompatible CNI versions; config is \"1.0.0\", plugin supports [\"0.1.0\" \"0.2.0\" \"0.3.0\" \"0.3.1\" \"0.4.0\"]" |
| 79 | +``` |
| 80 | + |
| 81 | +<!-- |
| 82 | +To fix this issue, [update your CNI plugins and CNI config files](#updating-your-cni-plugins-and-cni-config-files). |
| 83 | +--> |
| 84 | + |
| 85 | +To fix this issue, [update your CNI plugins and CNI config files](#updating-your-cni-plugins-and-cni-config-files). |
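
The comparison behind this error can be sketched in a few lines of shell. This is only an illustration, not how containerd implements the check: the function name `cni_version_supported` is hypothetical, and the version lists are copied by hand from the log message above.

```shell
# Hypothetical helper: succeed if the config's cniVersion is one of the
# versions the plugin says it supports.
# $1: cniVersion declared in the config
# remaining arguments: versions the plugin supports
cni_version_supported() {
  wanted="$1"
  shift
  for candidate in "$@"; do
    if [ "$candidate" = "$wanted" ]; then
      return 0
    fi
  done
  return 1
}
```

With the values from the sample log line, `cni_version_supported 1.0.0 0.1.0 0.2.0 0.3.0 0.3.1 0.4.0` fails, which corresponds to a plugin that is too old for a config declaring `1.0.0`.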
| 86 | + |
| 87 | +<!-- |
| 88 | +### Failed to destroy network for sandbox error |
| 89 | +--> |
| 90 | + |
| 91 | +### Failed to destroy network for sandbox error {#failed-to-destroy-network-for-sandbox-error} |
| 92 | + |
| 93 | +<!-- |
| 94 | +If the version of the plugin is missing in the CNI plugin config, the pod may |
| 95 | +run. However, stopping the pod generates an error similar to: |
| 96 | +--> |
| 97 | + |
| 98 | +If the version of the plugin is missing in the CNI plugin config, |
| 99 | +the Pod may run. However, stopping the Pod generates an error similar to: |
| 100 | + |
| 101 | +``` |
| 102 | +ERRO[2022-04-26T00:43:24.518165483Z] StopPodSandbox for "b" failed |
| 103 | +error="failed to destroy network for sandbox \"bbc85f891eaf060c5a879e27bba9b6b06450210161dfdecfbb2732959fb6500a\": invalid version \"\": the version is empty" |
| 104 | +``` |
| 105 | + |
| 106 | +<!-- |
| 107 | +This error leaves the pod in the not-ready state with a network namespace still |
| 108 | +attached. To recover from this problem, [edit the CNI config file](#updating-your-cni-plugins-and-cni-config-files) to add |
| 109 | +the missing version information. The next attempt to stop the pod should |
| 110 | +be successful. |
| 111 | +--> |
| 112 | + |
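| 113 | +This error leaves the Pod in the not-ready state with a network namespace still attached. |
| 114 | +To recover from this problem, [edit the CNI config file](#updating-your-cni-plugins-and-cni-config-files) to add the missing version information. |
| 115 | +The next attempt to stop the Pod should be successful. |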
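
One way to spot a config file with no declared version is a small text check like the sketch below. The function name `cni_declared_version` is hypothetical, and the `sed` pattern is a rough match that assumes the `"cniVersion"` key and its value sit on one line; it is not a JSON parser.

```shell
# Hypothetical helper: print the cniVersion declared in a CNI config file,
# or "missing" when the field is absent or empty.
# $1: path to a CNI config file (.conf or .conflist)
cni_declared_version() {
  declared=$(sed -n 's/.*"cniVersion"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1)
  if [ -n "$declared" ]; then
    printf '%s\n' "$declared"
  else
    printf 'missing\n'
  fi
}
```

For example, looping over the files in `/etc/cni/net.d/` (the directory used by the example config on this page) and calling `cni_declared_version "$f"` on each one shows which files still need a `cniVersion` entry.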
| 116 | + |
| 117 | +<!-- |
| 118 | +### Updating your CNI plugins and CNI config files |
| 119 | +--> |
| 120 | + |
| 121 | +### Updating your CNI plugins and CNI config files {#updating-your-cni-plugins-and-cni-config-files} |
| 122 | + |
| 123 | +<!-- |
| 124 | +If you're using containerd v1.6.0-v1.6.3 and encountered "Incompatible CNI |
| 125 | +versions" or "Failed to destroy network for sandbox" errors, consider updating |
| 126 | +your CNI plugins and editing the CNI config files. |
| 127 | + |
| 128 | +Here's an overview of the typical steps for each node: |
| 129 | +--> |
| 130 | + |
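| 131 | +If you're using containerd v1.6.0-v1.6.3 and encountered "Incompatible CNI versions" or |
| 132 | +"Failed to destroy network for sandbox" errors, consider updating your CNI plugins and editing the CNI config files. |
| 133 | + |
| 134 | +Here's an overview of the typical steps for each node: |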
| 135 | + |
| 136 | +<!-- |
| 137 | +1. [Safely drain and cordon the |
| 138 | +node](/docs/tasks/administer-cluster/safely-drain-node/). |
| 139 | +--> |
| 140 | + |
| 141 | +1. [Safely drain and cordon the node](/zh-cn/docs/tasks/administer-cluster/safely-drain-node/). |
| 142 | + |
| 143 | +<!-- |
| 144 | +2. After stopping your container runtime and kubelet services, perform the |
| 145 | +following upgrade operations: |
| 146 | + - If you're running CNI plugins, upgrade them to the latest version. |
| 147 | + - If you're using non-CNI plugins, replace them with CNI plugins. Use the |
| 148 | + latest version of the plugins. |
| 149 | + - Update the plugin configuration file to specify or match a version of the |
| 150 | + CNI specification that the plugin supports, as shown in the following ["An |
| 151 | + example containerd configuration |
| 152 | + file"](#an-example-containerd-configuration-file) section. |
| 153 | + - For `containerd`, ensure that you have installed the latest version (v1.0.0 |
| 154 | + or later) of the CNI loopback plugin. |
| 155 | + - Upgrade node components (for example, the kubelet) to Kubernetes v1.24 |
| 156 | + - Upgrade to or install the most current version of the container runtime. |
| 157 | +--> |
| 158 | + |
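| 159 | +2. After stopping your container runtime and kubelet services, perform the following upgrade operations: |
| 160 | + - If you're running CNI plugins, upgrade them to the latest version. |
| 161 | + - If you're using non-CNI plugins, replace them with CNI plugins. Use the latest version of the plugins. |
| 162 | + - Update the plugin configuration file to specify or match a version of the CNI specification that the plugin supports, |
| 163 | + as shown in the following ["An example containerd configuration file"](#an-example-containerd-configuration-file) section. |
| 164 | + - For `containerd`, ensure that you have installed the latest version (v1.0.0 or later) of the CNI loopback plugin. |
| 165 | + - Upgrade node components (for example, the kubelet) to Kubernetes v1.24. |
| 166 | + - Upgrade to or install the most current version of the container runtime. |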
| 167 | + |
| 168 | +<!-- |
| 169 | +3. Bring the node back into your cluster by restarting your container runtime |
| 170 | +and kubelet. Uncordon the node (`kubectl uncordon <nodename>`). |
| 171 | +--> |
| 172 | + |
| 173 | +3. Bring the node back into your cluster by restarting your container runtime and kubelet. Uncordon the node (`kubectl uncordon <nodename>`). |
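
The three steps above can be laid out as a command sequence. The sketch below only prints the sequence for review and runs nothing. It rests on several assumptions: the function name `print_node_upgrade_plan` is hypothetical, the node is assumed to run systemd with units named `kubelet` and `containerd`, and the `--ignore-daemonsets` flag on `kubectl drain` may or may not suit your cluster. The upgrade step is left as a placeholder because it depends on your plugins and package manager.

```shell
# Hypothetical helper: print (do not execute) the per-node sequence.
# $1: node name as known to the cluster
print_node_upgrade_plan() {
  node="$1"
  cat <<EOF
kubectl drain ${node} --ignore-daemonsets
systemctl stop kubelet
systemctl stop containerd
# upgrade CNI plugins, CNI config files, node components, and the container runtime here
systemctl start containerd
systemctl start kubelet
kubectl uncordon ${node}
EOF
}
```

The `kubectl` lines are meant to be run from a machine with cluster access, and the `systemctl` lines on the node itself.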
| 174 | + |
| 175 | +<!-- |
| 176 | +## An example containerd configuration file |
| 177 | +--> |
| 178 | + |
| 179 | +## An example containerd configuration file {#an-example-containerd-configuration-file} |
| 180 | + |
| 181 | +<!-- |
| 182 | +The following example shows a configuration for `containerd` runtime v1.6.x, |
| 183 | +which supports a recent version of the CNI specification (v1.0.0). |
| 184 | +Please see the documentation from your plugin and networking provider for |
| 185 | +further instructions on configuring your system. |
| 186 | +--> |
| 187 | + |
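| 188 | +The following example shows a configuration for the `containerd` runtime v1.6.x, |
| 189 | +which supports a recent version of the CNI specification (v1.0.0). |
| 190 | +See the documentation from your plugin and networking provider for further instructions on configuring your system. |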
| 191 | + |
| 192 | +<!-- |
| 193 | +On Kubernetes, containerd runtime adds a loopback interface, `lo`, to pods as a |
| 194 | +default behavior. The containerd runtime configures the loopback interface via a |
| 195 | +CNI plugin, `loopback`. The `loopback` plugin is distributed as part of the |
| 196 | +`containerd` release packages that have the `cni` designation. `containerd` |
| 197 | +v1.6.0 and later includes a CNI v1.0.0-compatible loopback plugin as well as |
| 198 | +other default CNI plugins. The configuration for the loopback plugin is done |
| 199 | +internally by containerd, and is set to use CNI v1.0.0. This also means that the |
| 200 | +version of the `loopback` plugin must be v1.0.0 or later when this newer version |
| 201 | +`containerd` is started. |
| 202 | +--> |
| 203 | + |
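| 204 | +On Kubernetes, the containerd runtime adds a loopback interface, `lo`, to Pods as a default behavior. |
| 205 | +The containerd runtime configures the loopback interface via a CNI plugin, `loopback`. |
| 206 | +The `loopback` plugin is distributed as part of the `containerd` release packages that have the `cni` designation. |
| 207 | +`containerd` v1.6.0 and later includes a CNI v1.0.0-compatible loopback plugin as well as other default CNI plugins. |
| 208 | +The configuration for the loopback plugin is done internally by containerd, and is set to use CNI v1.0.0. |
| 209 | +This also means that the version of the `loopback` plugin must be v1.0.0 or later when this newer version of `containerd` is started. |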
| 210 | + |
| 211 | +<!-- |
| 212 | +The following bash command generates an example CNI config. Here, the 1.0.0 |
| 213 | +value for the config version is assigned to the `cniVersion` field for use when |
| 214 | +`containerd` invokes the CNI bridge plugin. |
| 215 | +--> |
| 216 | + |
| 217 | +The following Bash command generates an example CNI config. Here, the `cniVersion` field is set to the config version value 1.0.0, |
| 218 | +for use when `containerd` invokes the CNI bridge plugin. |
| 219 | + |
| 220 | +```bash |
| 221 | +cat << EOF | tee /etc/cni/net.d/10-containerd-net.conflist |
| 222 | +{ |
| 223 | + "cniVersion": "1.0.0", |
| 224 | + "name": "containerd-net", |
| 225 | + "plugins": [ |
| 226 | + { |
| 227 | + "type": "bridge", |
| 228 | + "bridge": "cni0", |
| 229 | + "isGateway": true, |
| 230 | + "ipMasq": true, |
| 231 | + "promiscMode": true, |
| 232 | + "ipam": { |
| 233 | + "type": "host-local", |
| 234 | + "ranges": [ |
| 235 | + [{ |
| 236 | + "subnet": "10.88.0.0/16" |
| 237 | + }], |
| 238 | + [{ |
| 239 | + "subnet": "2001:db8:4860::/64" |
| 240 | + }] |
| 241 | + ], |
| 242 | + "routes": [ |
| 243 | + { "dst": "0.0.0.0/0" }, |
| 244 | + { "dst": "::/0" } |
| 245 | + ] |
| 246 | + } |
| 247 | + }, |
| 248 | + { |
| 249 | + "type": "portmap", |
| 250 | + "capabilities": {"portMappings": true} |
| 251 | + } |
| 252 | + ] |
| 253 | +} |
| 254 | +EOF |
| 255 | +``` |
| 256 | + |
| 257 | +<!-- |
| 258 | +Update the IP address ranges in the preceding example with ones that are based |
| 259 | +on your use case and network addressing plan. |
| 260 | +--> |
| 261 | + |
| 262 | +Update the IP address ranges in the preceding example with values that fit your use case and network addressing plan. |
| 263 | + |