@@ -448,31 +448,20 @@ kubectl describe pod etcd-with-grpc
448
448
```
449
449
450
450
<!--
451
- Before Kubernetes 1.23, gRPC health probes were often implemented using
452
- [grpc-health-probe](https://github.com/grpc-ecosystem/grpc-health-probe/),
453
- as described in the blog post
454
- [Health checking gRPC servers on Kubernetes](/blog/2018/10/01/health-checking-grpc-servers-on-kubernetes/).
455
- The built-in gRPC probe's behavior is similar to the one implemented by grpc-health-probe.
456
- When migrating from grpc-health-probe to built-in probes, remember the following differences:
457
- -->
458
- 在 Kubernetes 1.23 之前,gRPC 健康探测通常使用
459
- [grpc-health-probe](https://github.com/grpc-ecosystem/grpc-health-probe/)
460
- 来实现,如博客 [Health checking gRPC servers on Kubernetes(对 Kubernetes 上的 gRPC 服务器执行健康检查)](/blog/2018/10/01/health-checking-grpc-servers-on-kubernetes/)所描述。
461
- 内置的 gRPC 探针的行为与 `grpc-health-probe` 所实现的行为类似。
462
- 从 `grpc-health-probe` 迁移到内置探针时,请注意以下差异:
463
-
464
- <!--
465
- - Built-in probes run against the pod IP address, unlike grpc-health-probe that often runs against
466
- `127.0.0.1`. Be sure to configure your gRPC endpoint to listen on the Pod's IP address.
467
- - Built-in probes do not support any authentication parameters (like `-tls`).
451
+ When using a gRPC probe, there are some technical details to be aware of:
452
+
453
+ - The probes run against the pod IP address or its hostname.
454
+ Be sure to configure your gRPC endpoint to listen on the Pod's IP address.
455
+ - The probes do not support any authentication parameters (like `-tls`).
468
456
- There are no error codes for built-in probes. All errors are considered as probe failures.
469
457
- If `ExecProbeTimeout` feature gate is set to `false`, grpc-health-probe does **not**
470
458
respect the `timeoutSeconds` setting (which defaults to 1s), while built-in probe would fail on timeout.
471
459
-->
472
- - 内置探针运行时针对的是 Pod 的 IP 地址,不像 `grpc-health-probe`
473
- 那样通常针对 `127.0.0.1` 执行探测;
460
+ 当使用 gRPC 探针时,需要注意以下一些技术细节:
461
+
462
+ - 这些探针运行时针对的是 Pod 的 IP 地址或其主机名。
474
463
请一定配置你的 gRPC 端点使之监听于 Pod 的 IP 地址之上。
475
- - 内置探针不支持任何身份认证参数(例如 `-tls`)。
464
+ - 这些探针不支持任何身份认证参数(例如 `-tls`)。
476
465
- 对于内置的探针而言,不存在错误代码。所有错误都被视作探测失败。
477
466
- 如果 `ExecProbeTimeout` 特性门控被设置为 `false`,则 `grpc-health-probe`
478
467
不会考虑 `timeoutSeconds` 设置状态(默认值为 1s),
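
For reference, the constraints listed in this hunk can be read against a minimal gRPC probe stanza. This is a sketch only: the Pod name, image, and port number are assumptions and do not come from this diff. `grpc.port` is the probe field the kubelet dials on the Pod's IP address, and the server is assumed to implement the gRPC Health Checking Protocol.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: grpc-probe-example                    # hypothetical name
spec:
  containers:
  - name: server
    image: registry.example/grpc-server:1.0   # hypothetical image
    ports:
    - containerPort: 9090                     # hypothetical port
    livenessProbe:
      grpc:
        port: 9090          # numeric port; the server must listen on the Pod IP, not only 127.0.0.1
      timeoutSeconds: 1     # the default; exceeding it counts as a probe failure
      periodSeconds: 10
```

No TLS or other authentication settings appear in the stanza because the probe does not accept any.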
@@ -514,7 +503,7 @@ In such cases, it can be tricky to set up liveness probe parameters without
514
503
compromising the fast response to deadlocks that motivated such a probe.
515
504
The trick is to set up a startup probe with the same command, HTTP or TCP
516
505
check, with a `failureThreshold * periodSeconds` long enough to cover the
517
- worse case startup time.
506
+ worst case startup time.
518
507
519
508
So, the previous example would become:
520
509
-->
@@ -523,7 +512,7 @@ So, the previous example would become:
523
512
有时候,会有一些现有的应用在启动时需要较长的初始化时间。
524
513
在这种情况下,若要不影响对死锁作出快速响应的探测,设置存活探测参数是要技巧的。
525
514
技巧就是使用相同的命令来设置启动探测,针对 HTTP 或 TCP 检测,可以通过将
526
- `failureThreshold * periodSeconds` 参数设置为足够长的时间来应对糟糕情况下的启动时间。
515
+ `failureThreshold * periodSeconds` 参数设置为足够长的时间来应对最糟糕情况下的启动时间。
527
516
528
517
这样,前面的例子就变成了:
529
518
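
As a sketch of the arithmetic in this hunk: with the hypothetical values below, the startup probe allows up to `failureThreshold * periodSeconds` = 30 * 10 = 300 seconds for the application to start. Once the startup probe has succeeded, the liveness probe takes over and keeps its fast reaction to deadlocks. The path, port name, and thresholds are illustrative assumptions, and the fragment is meant to sit inside a container spec.

```yaml
ports:
- name: liveness-port
  containerPort: 8080

livenessProbe:
  httpGet:
    path: /healthz
    port: liveness-port
  failureThreshold: 1      # react quickly once the application is up
  periodSeconds: 10

startupProbe:
  httpGet:
    path: /healthz         # same check as the liveness probe
    port: liveness-port
  failureThreshold: 30     # 30 * 10s = 300s startup budget
  periodSeconds: 10
```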
@@ -697,42 +686,6 @@ liveness and readiness checks:
697
686
默认值是继承 Pod 级别的 `terminationGracePeriodSeconds` 值(如果不设置则为 30 秒),最小值为 1。
698
687
更多细节请参见[探针级别 `terminationGracePeriodSeconds`](#probe-level-terminationgraceperiodseconds)。
699
688
700
- {{< note >}}
701
- <!--
702
- Before Kubernetes 1.20, the field `timeoutSeconds` was not respected for exec probes:
703
- probes continued running indefinitely, even past their configured deadline,
704
- until a result was returned.
705
- -->
706
- 在 Kubernetes 1.20 版本之前,`exec` 探针会忽略 `timeoutSeconds`:
707
- 探针会无限期地持续运行,甚至可能超过所配置的限期,直到返回结果为止。
708
-
709
- <!--
710
- This defect was corrected in Kubernetes v1.20. You may have been relying on the previous behavior,
711
- even without realizing it, as the default timeout is 1 second.
712
- As a cluster administrator, you can disable the [feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
713
- `ExecProbeTimeout` (set it to `false`) on each kubelet to restore the behavior from older versions,
714
- then remove that override once all the exec probes in the cluster have a `timeoutSeconds` value set.
715
- If you have pods that are impacted from the default 1 second timeout, you should update their
716
- probe timeout so that you're ready for the eventual removal of that feature gate.
717
- -->
718
- 这一缺陷在 Kubernetes v1.20 版本中得到修复。你可能一直依赖于之前错误的探测行为,
719
- 甚至都没有觉察到这一问题的存在,因为默认的超时值是 1 秒钟。
720
- 作为集群管理员,你可以在所有的 kubelet 上禁用 `ExecProbeTimeout`
721
- [特性门控](/zh-cn/docs/reference/command-line-tools-reference/feature-gates/)
722
- (将其设置为 `false`),从而恢复之前版本中的运行行为。之后当集群中所有的
723
- exec 探针都设置了 `timeoutSeconds` 参数后,移除此标志重载。
724
- 如果你有 Pod 受到此默认 1 秒钟超时值的影响,你应该更新这些 Pod 对应的探针的超时值,
725
- 这样才能为最终去除该特性门控做好准备。
726
-
727
- <!--
728
- With the fix of the defect, for exec probes, on Kubernetes `1.20+` with the `dockershim` container runtime,
729
- the process inside the container may keep running even after probe returned failure because of the timeout.
730
- -->
731
- 当此缺陷被修复之后,在使用 `dockershim` 容器运行时的 Kubernetes `1.20+`
732
- 版本中,对于 exec 探针而言,容器中的进程可能会因为超时值的设置保持持续运行,
733
- 即使探针返回了失败状态。
734
- {{< /note >}}
735
-
736
689
{{< caution >}}
737
690
<!--
738
691
Incorrect implementation of readiness probes may result in an ever growing number
@@ -854,18 +807,6 @@ to resolve it.
854
807
855
808
{{< feature-state for_k8s_version="v1.28" state="stable" >}}
856
809
857
- <!--
858
- Prior to release 1.21, the Pod-level `terminationGracePeriodSeconds` was used
859
- for terminating a container that failed its liveness or startup probe. This
860
- coupling was unintended and may have resulted in failed containers taking an
861
- unusually long time to restart when a Pod-level `terminationGracePeriodSeconds`
862
- was set.
863
- -->
864
- 在 1.21 发行版之前,Pod 层面的 `terminationGracePeriodSeconds`
865
- 被用来终止存活探测或启动探测失败的容器。
866
- 这一行为上的关联不是我们想要的,可能导致 Pod 层面设置了 `terminationGracePeriodSeconds`
867
- 时容器要花非常长的时间才能重新启动。
868
-
869
810
<!--
870
811
In 1.25 and above, users can specify a probe-level `terminationGracePeriodSeconds`
871
812
as part of the probe specification. When both a pod- and probe-level
@@ -877,19 +818,14 @@ as part of the probe specification. When both a pod- and probe-level
877
818
都已设置,kubelet 将使用探针层面设置的值。
878
819
879
820
<!--
880
- Beginning in Kubernetes 1.25, the `ProbeTerminationGracePeriod` feature is enabled
881
- by default. For users choosing to disable this feature, please note the following:
821
+ When setting the `terminationGracePeriodSeconds`, please note the following:
882
822
883
- * The `ProbeTerminationGracePeriod` feature gate is only available on the API Server.
884
- The kubelet always honors the probe-level `terminationGracePeriodSeconds` field if
823
+ * The kubelet always honors the probe-level `terminationGracePeriodSeconds` field if
885
824
it is present on a Pod.
886
825
-->
887
- {{< note >}}
888
- 从 Kubernetes 1.25 开始,默认启用 `ProbeTerminationGracePeriod` 特性。
889
- 选择禁用此特性的用户,请注意以下事项:
826
+ 当设置 `terminationGracePeriodSeconds` 时,请注意以下事项:
890
827
891
- * `ProbeTerminationGracePeriod` 特性门控只能用在 API 服务器上。
892
- kubelet 始终优先选用探针级别 `terminationGracePeriodSeconds` 字段
828
+ * kubelet 始终优先选用探针级别 `terminationGracePeriodSeconds` 字段
893
829
(如果它存在于 Pod 上)。
894
830
895
831
<!--
@@ -900,17 +836,6 @@ by default. For users choosing to disable this feature, please note the followin
900
836
* 如果你已经为现有 Pod 设置了 `terminationGracePeriodSeconds`
901
837
字段并且不再希望使用针对每个探针的终止宽限期,则必须删除现有的这类 Pod。
902
838
903
- <!--
904
- * When you (or the control plane, or some other component) create replacement
905
- Pods, and the feature gate `ProbeTerminationGracePeriod` is disabled, then the
906
- API server ignores the Probe-level `terminationGracePeriodSeconds` field, even if
907
- a Pod or pod template specifies it.
908
- -->
909
- * 当你(或控制平面或某些其他组件)创建替换 Pod,并且特性门控 `ProbeTerminationGracePeriod`
910
- 被禁用时,即使 Pod 或 Pod 模板指定了 `terminationGracePeriodSeconds` 字段,
911
- API 服务器也会忽略探针级别的 `terminationGracePeriodSeconds` 字段设置。
912
- {{< /note >}}
913
-
914
839
<!--
915
840
For example:
916
841
-->
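
A minimal sketch of the precedence rule described in this hunk, using hypothetical names and values: when both a Pod-level and a probe-level `terminationGracePeriodSeconds` are set, the kubelet uses the probe-level value for a container that fails that probe.

```yaml
spec:
  terminationGracePeriodSeconds: 3600        # Pod-level value
  containers:
  - name: test                               # hypothetical container
    image: registry.example/app:1.0          # hypothetical image
    ports:
    - name: liveness-port
      containerPort: 8080
    livenessProbe:
      httpGet:
        path: /healthz
        port: liveness-port
      failureThreshold: 1
      periodSeconds: 60
      terminationGracePeriodSeconds: 60      # probe-level value; takes precedence over 3600
```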