Commit 431e91b

ihuibin, Sea-n, and tengqm authored
Create windows.md (#34006)
* Create windows.md

  Signed-off-by: bin.hui <[email protected]>

* Update content/zh/docs/tasks/debug/debug-cluster/windows.md

  Co-authored-by: Sean <[email protected]>

* Update content/zh/docs/tasks/debug/debug-cluster/windows.md

  Co-authored-by: Qiming Teng <[email protected]>

* Update windows.md

  Signed-off-by: bin.hui <[email protected]>

Co-authored-by: Sean <[email protected]>
Co-authored-by: Qiming Teng <[email protected]>
1 parent 94426e8 commit 431e91b

1 file changed: content/zh/docs/tasks/debug/debug-cluster/windows.md (314 additions & 0 deletions)
@@ -0,0 +1,314 @@
---
title: Windows debugging tips
content_type: concept
---
<!--
title: Windows debugging tips
content_type: concept
-->
<!-- overview -->

<!-- body -->
<!--
## Node-level troubleshooting {#troubleshooting-node}

1. My Pods are stuck at "Container Creating" or restarting over and over

   Ensure that your pause image is compatible with your Windows OS version.
   See [Pause container](/docs/setup/production-environment/windows/intro-windows-in-kubernetes#pause-container)
   to see the latest / recommended pause image and/or get more information.

   {{< note >}}
   If using containerd as your container runtime the pause image is specified in the
   `plugins.plugins.cri.sandbox_image` field of the config.toml configuration file.
   {{< /note >}}
-->
## Node-level troubleshooting {#troubleshooting-node}

1. My Pods are stuck at "Container Creating" or keep restarting

   Make sure that your pause image is compatible with your Windows OS version.
   See [Pause container](/zh/docs/setup/production-environment/windows/intro-windows-in-kubernetes#pause-container)
   for the latest or recommended pause image and for more information.

   {{< note >}}
   If you use containerd as your container runtime, the pause image is specified in the
   `plugins.plugins.cri.sandbox_image` field of the config.toml configuration file.
   {{< /note >}}
<!--
2. My pods show status as `ErrImagePull` or `ImagePullBackOff`

   Ensure that your Pod is getting scheduled to a [compatible](https://docs.microsoft.com/virtualization/windowscontainers/deploy-containers/version-compatibility) Windows Node.

   More information on how to specify a compatible node for your Pod can be found in [this guide](/docs/setup/production-environment/windows/user-guide-windows-containers/#ensuring-os-specific-workloads-land-on-the-appropriate-container-host).
-->
2. My Pods show a status of `ErrImagePull` or `ImagePullBackOff`

   Make sure that your Pod is scheduled to a [compatible](https://docs.microsoft.com/virtualization/windowscontainers/deploy-containers/version-compatibility) Windows node.

   For more information on how to specify a compatible node for your Pod, see [this guide](/zh/docs/setup/production-environment/windows/user-guide-windows-containers/#ensuring-os-specific-workloads-land-on-the-appropriate-container-host).
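The scheduling advice above can be sketched as a small Python helper that builds a Pod manifest restricted to Windows nodes. This is a minimal sketch, not the guide's own method: the function name, Pod name, image, and build number are placeholders chosen for illustration. The selector keys `kubernetes.io/os` and `node.kubernetes.io/windows-build` are well-known node labels; check the labels on your own nodes before relying on them.

```python
import json


def windows_pod_manifest(name, image, windows_build=None):
    """Build a Pod manifest that can only be scheduled onto Windows nodes.

    `name`, `image`, and `windows_build` are illustrative inputs, not values
    taken from the guide.
    """
    node_selector = {"kubernetes.io/os": "windows"}
    if windows_build is not None:
        # Optionally pin the Pod to nodes running one specific Windows build,
        # so the container image and the host OS versions stay compatible.
        node_selector["node.kubernetes.io/windows-build"] = windows_build
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "nodeSelector": node_selector,
            "containers": [{"name": name, "image": image}],
        },
    }


if __name__ == "__main__":
    # Placeholder values; substitute an image that matches your node's OS version.
    manifest = windows_pod_manifest("demo", "example.com/demo:windows", "10.0.17763")
    print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON manifests, the printed output could be saved to a file and applied with `kubectl apply -f`.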
<!--
## Network troubleshooting {#troubleshooting-network}

1. My Windows Pods do not have network connectivity

   If you are using virtual machines, ensure that MAC spoofing is **enabled** on all
   the VM network adapter(s).
-->
## Network troubleshooting {#troubleshooting-network}

1. My Windows Pods do not have network connectivity

   If you are using virtual machines, make sure that MAC spoofing is **enabled** on
   all of the VM network adapters.
<!--
2. My Windows Pods cannot ping external resources

   Windows Pods do not have outbound rules programmed for the ICMP protocol. However,
   TCP/UDP is supported. When trying to demonstrate connectivity to resources
   outside of the cluster, substitute `ping <IP>` with corresponding
   `curl <IP>` commands.

   If you are still facing problems, most likely your network configuration in
   [cni.conf](https://github.com/Microsoft/SDN/blob/master/Kubernetes/flannel/l2bridge/cni/config/cni.conf)
   deserves some extra attention. You can always edit this static file. The
   configuration update will apply to any new Kubernetes resources.

   One of the Kubernetes networking requirements
   (see [Kubernetes model](/docs/concepts/cluster-administration/networking/)) is
   for cluster communication to occur without
   NAT internally. To honor this requirement, there is an
   [ExceptionList](https://github.com/Microsoft/SDN/blob/master/Kubernetes/flannel/l2bridge/cni/config/cni.conf#L20)
   for all the communication where you do not want outbound NAT to occur. However,
   this also means that you need to exclude the external IP you are trying to query
   from the `ExceptionList`. Only then will the traffic originating from your Windows
   pods be SNAT'ed correctly to receive a response from the outside world. In this
   regard, your `ExceptionList` in `cni.conf` should look as follows:

   ```conf
   "ExceptionList": [
       "10.244.0.0/16",  # Cluster subnet
       "10.96.0.0/12",   # Service subnet
       "10.127.130.0/24" # Management (host) subnet
   ]
   ```
-->
2. My Windows Pods cannot ping external resources

   Windows Pods do not have outbound rules programmed for the ICMP protocol, but
   TCP/UDP is supported. When you want to demonstrate connectivity to resources
   outside of the cluster, replace `ping <IP>` with the corresponding `curl <IP>` command.

   If you still have problems, your network configuration in
   [cni.conf](https://github.com/Microsoft/SDN/blob/master/Kubernetes/flannel/l2bridge/cni/config/cni.conf)
   most likely needs extra attention. You can edit this static file at any time.
   The updated configuration applies to any new Kubernetes resources.

   One of the Kubernetes networking requirements
   (see [Kubernetes model](/zh/docs/concepts/cluster-administration/networking/))
   is that communication inside the cluster happens without NAT.
   To honor this requirement, there is an
   [ExceptionList](https://github.com/Microsoft/SDN/blob/master/Kubernetes/flannel/l2bridge/cni/config/cni.conf#L20)
   for all communication where you do not want outbound NAT to occur.
   This also means that the external IP you are trying to reach must not be covered
   by the `ExceptionList`. Only then is the traffic originating from your Windows Pods
   SNAT'ed correctly so that it can receive a response from the outside world.
   With that in mind, the `ExceptionList` in your `cni.conf` should look as follows:

   ```conf
   "ExceptionList": [
       "10.244.0.0/16",  # Cluster subnet
       "10.96.0.0/12",   # Service subnet
       "10.127.130.0/24" # Management (host) subnet
   ]
   ```
<!--
3. My Windows node cannot access `NodePort` type Services

   Local NodePort access from the node itself fails. This is a known
   limitation. NodePort access works from other nodes or external clients.

4. vNICs and HNS endpoints of containers are being deleted

   This issue can be caused when the `hostname-override` parameter is not passed to
   [kube-proxy](/docs/reference/command-line-tools-reference/kube-proxy/). To resolve
   it, users need to pass the hostname to kube-proxy as follows:

   ```powershell
   C:\k\kube-proxy.exe --hostname-override=$(hostname)
   ```
-->
3. My Windows node cannot access `NodePort` type Services

   Local NodePort access from the node itself fails. This is a known limitation.
   NodePort access works normally from other nodes or from external clients.

4. vNICs and HNS endpoints of containers are being deleted

   This issue can occur when the `hostname-override` parameter is not passed to
   [kube-proxy](/zh/docs/reference/command-line-tools-reference/kube-proxy/).
   To resolve it, pass the hostname to kube-proxy as follows:

   ```powershell
   C:\k\kube-proxy.exe --hostname-override=$(hostname)
   ```
<!--
5. My Windows node cannot access my services using the service IP

   This is a known limitation of the networking stack on Windows. However, Windows Pods can access the Service IP.

6. No network adapter is found when starting the kubelet

   The Windows networking stack needs a virtual adapter for Kubernetes networking to work.
   If the following commands return no results (in an admin shell),
   virtual network creation (a necessary prerequisite for the kubelet to work) has failed:

   ```powershell
   Get-HnsNetwork | ? Name -ieq "cbr0"
   Get-NetAdapter | ? Name -Like "vEthernet (Ethernet*"
   ```

   Often it is worthwhile to modify the [InterfaceName](https://github.com/microsoft/SDN/blob/master/Kubernetes/flannel/start.ps1#L7) parameter of the start.ps1 script,
   in cases where the host's network adapter isn't "Ethernet".
   Otherwise, consult the output of the `start-kubelet.ps1` script to see if there are errors during virtual network creation.
-->
5. My Windows node cannot access my Services using the Service IP

   This is a known limitation of the networking stack on Windows. Windows Pods, however, can access the Service IP.

6. No network adapter is found when starting the kubelet

   The Windows networking stack needs a virtual adapter for Kubernetes networking to work.
   If the following commands return no results (in an admin shell),
   virtual network creation has failed, and the kubelet cannot work without that virtual network:

   ```powershell
   Get-HnsNetwork | ? Name -ieq "cbr0"
   Get-NetAdapter | ? Name -Like "vEthernet (Ethernet*"
   ```

   If the host's network adapter is not named "Ethernet", it is often worthwhile to modify the
   [InterfaceName](https://github.com/microsoft/SDN/blob/master/Kubernetes/flannel/start.ps1#L7) parameter of the `start.ps1` script.
   Otherwise, consult the output of the `start-kubelet.ps1` script to see whether there were errors during virtual network creation.
<!--
7. DNS resolution is not properly working

   Check the DNS limitations for Windows in this [section](#dns-limitations).

8. `kubectl port-forward` fails with "unable to do port forwarding: wincat not found"

   This was implemented in Kubernetes 1.15 by including `wincat.exe` in the pause infrastructure container `mcr.microsoft.com/oss/kubernetes/pause:3.6`.
   Be sure to use a supported version of Kubernetes.
   If you would like to build your own pause infrastructure container be sure to include [wincat](https://github.com/kubernetes/kubernetes/tree/master/build/pause/windows/wincat).
-->
7. DNS resolution is not working properly

   Check the DNS limitations for Windows in this [section](#dns-limitations).

8. `kubectl port-forward` fails with "unable to do port forwarding: wincat not found"

   Port forwarding was implemented in Kubernetes 1.15 by including `wincat.exe` in the
   pause infrastructure container `mcr.microsoft.com/oss/kubernetes/pause:3.6`.
   Make sure that you use a supported version of Kubernetes. If you want to build your own
   pause infrastructure container, make sure that it includes
   [wincat](https://github.com/kubernetes/kubernetes/tree/master/build/pause/windows/wincat).
<!--
9. My Kubernetes installation is failing because my Windows Server node is behind a proxy

   If you are behind a proxy, the following PowerShell environment variables must be defined:

   ```PowerShell
   [Environment]::SetEnvironmentVariable("HTTP_PROXY", "http://proxy.example.com:80/", [EnvironmentVariableTarget]::Machine)
   [Environment]::SetEnvironmentVariable("HTTPS_PROXY", "http://proxy.example.com:443/", [EnvironmentVariableTarget]::Machine)
   ```
-->
9. My Kubernetes installation is failing because my Windows Server node is behind a proxy

   If you are behind a proxy, the following PowerShell environment variables must be defined:

   ```PowerShell
   [Environment]::SetEnvironmentVariable("HTTP_PROXY", "http://proxy.example.com:80/", [EnvironmentVariableTarget]::Machine)
   [Environment]::SetEnvironmentVariable("HTTPS_PROXY", "http://proxy.example.com:443/", [EnvironmentVariableTarget]::Machine)
   ```
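For the `ExceptionList` entry in this section (Windows Pods cannot ping external resources), the rule is that traffic to any address inside a listed subnet skips outbound NAT, so an external destination must fall outside every listed subnet. The following Python sketch checks that for a given address. The function name is hypothetical, and the subnets are simply the example values shown in the `cni.conf` snippet in this section; substitute the values from your own configuration.

```python
import ipaddress

# Example subnets copied from the cni.conf snippet in this section.
EXCEPTION_LIST = [
    "10.244.0.0/16",    # Cluster subnet
    "10.96.0.0/12",     # Service subnet
    "10.127.130.0/24",  # Management (host) subnet
]


def bypasses_outbound_nat(destination, exception_list=EXCEPTION_LIST):
    """Return True if `destination` falls inside any ExceptionList subnet.

    Addresses inside a listed subnet are reached without outbound NAT.
    An external address should therefore return False here.
    """
    address = ipaddress.ip_address(destination)
    return any(address in ipaddress.ip_network(cidr) for cidr in exception_list)


if __name__ == "__main__":
    for target in ("10.244.3.7", "10.100.0.1", "8.8.8.8"):
        state = "no outbound NAT" if bypasses_outbound_nat(target) else "SNAT applies"
        print(f"{target}: {state}")
```

If an external address you need to reach reports "no outbound NAT", one of the listed subnets is too broad for your environment.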
<!--
### Flannel troubleshooting

1. With Flannel, my nodes are having issues after rejoining a cluster

   Whenever a previously deleted node is being re-joined to the cluster, flannelD
   tries to assign a new pod subnet to the node. Users should remove the old pod
   subnet configuration files in the following paths:

   ```powershell
   Remove-Item C:\k\SourceVip.json
   Remove-Item C:\k\SourceVipRequest.json
   ```
-->
### Flannel troubleshooting {#flannel-troubleshooting}

1. With Flannel, my nodes have issues after rejoining a cluster

   Whenever a previously deleted node rejoins the cluster, flanneld tries to assign
   a new Pod subnet to the node. Remove the old Pod subnet configuration files
   at the following paths:

   ```powershell
   Remove-Item C:\k\SourceVip.json
   Remove-Item C:\k\SourceVipRequest.json
   ```
<!--
2. Flanneld is stuck in "Waiting for the Network to be created"

   There are numerous reports of this [issue](https://github.com/coreos/flannel/issues/1066);
   most likely it is a timing issue for when the management IP of the flannel network is set.
   A workaround is to relaunch `start.ps1` or relaunch it manually as follows:

   ```powershell
   [Environment]::SetEnvironmentVariable("NODE_NAME", "<Windows_Worker_Hostname>")
   C:\flannel\flanneld.exe --kubeconfig-file=c:\k\config --iface=<Windows_Worker_Node_IP> --ip-masq=1 --kube-subnet-mgr=1
   ```
-->
2. Flanneld is stuck in "Waiting for the Network to be created"

   There are numerous reports of this [issue](https://github.com/coreos/flannel/issues/1066);
   it is most likely a timing issue with when the management IP of the flannel network is set.
   A workaround is to relaunch `start.ps1`, or to relaunch flanneld manually as follows:

   ```powershell
   [Environment]::SetEnvironmentVariable("NODE_NAME", "<Windows_Worker_Hostname>")
   C:\flannel\flanneld.exe --kubeconfig-file=c:\k\config --iface=<Windows_Worker_Node_IP> --ip-masq=1 --kube-subnet-mgr=1
   ```
<!--
3. My Windows Pods cannot launch because of missing `/run/flannel/subnet.env`

   This indicates that Flannel didn't launch correctly. You can either try
   to restart `flanneld.exe` or you can copy the files over manually from
   `/run/flannel/subnet.env` on the Kubernetes master to `C:\run\flannel\subnet.env`
   on the Windows worker node and modify the `FLANNEL_SUBNET` row to a different
   number. For example, if node subnet 10.244.4.1/24 is desired:

   ```env
   FLANNEL_NETWORK=10.244.0.0/16
   FLANNEL_SUBNET=10.244.4.1/24
   FLANNEL_MTU=1500
   FLANNEL_IPMASQ=true
   ```
-->
3. My Windows Pods cannot launch because `/run/flannel/subnet.env` is missing

   This indicates that Flannel did not launch correctly. You can either try to restart
   `flanneld.exe`, or manually copy `/run/flannel/subnet.env` from the Kubernetes control plane node
   to `C:\run\flannel\subnet.env` on the Windows worker node and change the `FLANNEL_SUBNET` row
   to a different value. For example, if the node subnet 10.244.4.1/24 is desired:

   ```env
   FLANNEL_NETWORK=10.244.0.0/16
   FLANNEL_SUBNET=10.244.4.1/24
   FLANNEL_MTU=1500
   FLANNEL_IPMASQ=true
   ```
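After hand-editing a copied `subnet.env`, it can help to confirm that the new `FLANNEL_SUBNET` still lies inside `FLANNEL_NETWORK`. The Python sketch below does that with the standard `ipaddress` module. Both function names are hypothetical helpers written for this example, and the sample text mirrors the values shown in this section; neither is part of Flannel.

```python
import ipaddress


def parse_subnet_env(text):
    """Parse KEY=VALUE lines from a subnet.env file into a dict of strings."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blank lines, comments, and malformed rows
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def subnet_is_consistent(values):
    """Return True if FLANNEL_SUBNET lies inside FLANNEL_NETWORK."""
    network = ipaddress.ip_network(values["FLANNEL_NETWORK"])
    # FLANNEL_SUBNET is written as address/prefix with host bits set
    # (for example 10.244.4.1/24), so parse it as an interface first.
    subnet = ipaddress.ip_interface(values["FLANNEL_SUBNET"]).network
    return subnet.subnet_of(network)


if __name__ == "__main__":
    sample = """\
FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.244.4.1/24
FLANNEL_MTU=1500
FLANNEL_IPMASQ=true
"""
    parsed = parse_subnet_env(sample)
    print(parsed["FLANNEL_SUBNET"], "consistent:", subnet_is_consistent(parsed))
```

This only checks containment. It does not detect a subnet that is already assigned to another node, so still pick a value that no other node uses.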
<!--
### Further investigation

If these steps don't resolve your problem, you can get help running Windows containers on Windows nodes in Kubernetes through:

* StackOverflow [Windows Server Container](https://stackoverflow.com/questions/tagged/windows-server-container) topic
* Kubernetes Official Forum [discuss.kubernetes.io](https://discuss.kubernetes.io/)
* Kubernetes Slack [#SIG-Windows Channel](https://kubernetes.slack.com/messages/sig-windows)
-->
### Further investigation {#further-investigation}

If these steps do not resolve your problem, you can get help with running Windows containers in Kubernetes through:

* StackOverflow [Windows Server Container](https://stackoverflow.com/questions/tagged/windows-server-container) topic
* Kubernetes Official Forum [discuss.kubernetes.io](https://discuss.kubernetes.io/)
* Kubernetes Slack [#SIG-Windows Channel](https://kubernetes.slack.com/messages/sig-windows)
