@@ -62,13 +62,6 @@ During the registration, the device plugin needs to send:
[extended resource naming scheme](/docs/concepts/configuration/manage-resources-containers/#extended-resources)
as `vendor-domain/resourcetype`.
(For example, an NVIDIA GPU is advertised as `nvidia.com/gpu`.)
-
- Following a successful registration, the device plugin sends the kubelet the
- list of devices it manages, and the kubelet is then in charge of advertising those
- resources to the API server as part of the kubelet node status update.
- For example, after a device plugin registers `hardware-vendor.example/foo` with the kubelet
- and reports two healthy devices on a node, the node status is updated
- to advertise that the node has 2 "Foo" devices installed and available.
-->
A device plugin can register itself with the kubelet through this gRPC service. During registration, the device plugin needs to send the following:
@@ -78,6 +71,14 @@ to advertise that the node has 2 "Foo" devices installed and available.
must follow the [extended resource naming scheme](/zh-cn/docs/concepts/configuration/manage-resources-containers/#extended-resources),
in a form like `vendor-domain/resourcetype`. (For example, an NVIDIA GPU is advertised as `nvidia.com/gpu`.)

+ <!--
+ Following a successful registration, the device plugin sends the kubelet the
+ list of devices it manages, and the kubelet is then in charge of advertising those
+ resources to the API server as part of the kubelet node status update.
+ For example, after a device plugin registers `hardware-vendor.example/foo` with the kubelet
+ and reports two healthy devices on a node, the node status is updated
+ to advertise that the node has 2 "Foo" devices installed and available.
+ -->
After a successful registration, the device plugin sends the kubelet the list of devices it manages, and the kubelet
is then responsible for advertising those resources to the API server as part of the kubelet node status update.
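As a rough illustration of the bookkeeping described in this hunk, the sketch below counts the healthy devices a plugin reports and derives the number of available devices per resource name. It is a simplified Python model, not kubelet code: the `Device` class, the `healthy_device_counts` function, and the device IDs are hypothetical names chosen for this example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Hypothetical stand-in for one device reported by a device plugin."""
    resource_name: str  # e.g. "hardware-vendor.example/foo"
    device_id: str
    healthy: bool


def healthy_device_counts(devices):
    """Count healthy devices per resource name.

    Simplified model of the "available" quantity a node status update
    would advertise; the real kubelet tracks more state than this.
    """
    counts = Counter(d.resource_name for d in devices if d.healthy)
    return dict(counts)


# Assumed sample report: two healthy devices and one unhealthy device.
reported = [
    Device("hardware-vendor.example/foo", "foo-0", healthy=True),
    Device("hardware-vendor.example/foo", "foo-1", healthy=True),
    Device("hardware-vendor.example/foo", "foo-2", healthy=False),
]

print(healthy_device_counts(reported))
```

Under these assumptions, only the two healthy devices are counted, which matches the "2 Foo devices" example in the English source text.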
@@ -114,13 +115,27 @@ on certain nodes. Here is an example of a pod requesting this resource to run a
Here is an example of a Pod requesting this resource to run a workload:

<!--
+ ```yaml
+ ---
+ apiVersion: v1
+ kind: Pod
+ metadata:
+   name: demo-pod
+ spec:
+   containers:
+     - name: demo-container-1
+       image: registry.k8s.io/pause:3.8
+       resources:
+         limits:
+           hardware-vendor.example/foo: 2
#
# This Pod needs 2 of the hardware-vendor.example/foo devices
# and can only schedule onto a Node that's able to satisfy
# that need.
#
# If the Node has more than 2 of those devices available, the
# remainder would be available for other Pods to use.
+ ```
-->
```yaml
---
@@ -511,15 +526,17 @@ CPU ID、设备插件所报告的设备 ID 以及这些设备分配所处的 NUM
<!--
Starting from Kubernetes v1.27, the `List` endpoint can provide information on resources
- of running pods allocated in `ResourceClaims` by the `DynamicResourceAllocation` API. To enable
- this feature `kubelet` must be started with the following flags:
+ of running pods allocated in `ResourceClaims` by the `DynamicResourceAllocation` API.
+ Starting from Kubernetes v1.34, this feature is enabled by default.
+ To disable, `kubelet` must be started with the following flags:
-->
Starting from Kubernetes v1.27, the `List` endpoint can provide information on the resources of running Pods
allocated in `ResourceClaims` through the `DynamicResourceAllocation` API.
- To enable this feature, `kubelet` must be started with the following flags:
+ Starting from Kubernetes v1.34, this feature is enabled by default.
+ To disable this feature, `kubelet` must be started with the following flags:

```
- --feature-gates=DynamicResourceAllocation=true, KubeletPodResourcesDynamicResources=true
+ --feature-gates=KubeletPodResourcesDynamicResources=false
```

<!--
@@ -785,7 +802,7 @@ will continue working.
-->
### `Get` gRPC endpoint {#grpc-endpoint-get}

- {{< feature-state state="alpha" for_k8s_version="v1.27" >}}
+ {{< feature-state state="beta" for_k8s_version="v1.34" >}}

<!--
The `Get` endpoint provides information on resources of a running Pod. It exposes information
@@ -813,24 +830,26 @@ message GetPodResourcesRequest {
```

<!--
- To enable this feature, you must start your kubelet services with the following flag:
+ To disable this feature, you must start your kubelet services with the following flag:
-->
- To enable this feature, you must start your kubelet services with the following flag:
+ To disable this feature, you must start your kubelet services with the following flag:

```
- --feature-gates=KubeletPodResourcesGet=true
+ --feature-gates=KubeletPodResourcesGet=false
```

<!--
The `Get` endpoint can provide Pod information related to dynamic resources
- allocated by the dynamic resource allocation API. To enable this feature, you must
- ensure your kubelet services are started with the following flags:
+ allocated by the dynamic resource allocation API.
+ Starting from Kubernetes v1.34, this feature is enabled by default.
+ To disable, `kubelet` must be started with the following flags:
-->
The `Get` endpoint can provide Pod information related to dynamic resources allocated by the dynamic resource allocation API.
- To enable this feature, you must ensure your kubelet services are started with the following flags:
+ Starting from Kubernetes v1.34, this feature is enabled by default.
+ To disable this feature, you must ensure your kubelet services are started with the following flags:

```
- --feature-gates=KubeletPodResourcesGet=true,DynamicResourceAllocation=true, KubeletPodResourcesDynamicResources=true
+ --feature-gates=KubeletPodResourcesDynamicResources=false
```
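The `--feature-gates` value is a comma-separated list of `Name=bool` pairs, conventionally written without spaces because a space would split the argument in a shell. A small helper like the one below can assemble that value. This is only a sketch: `feature_gates_flag` is a hypothetical name, and the gate names are the ones mentioned in this diff.

```python
def feature_gates_flag(gates):
    """Build a --feature-gates argument from a {name: bool} mapping.

    Pairs are joined with commas and no whitespace, and booleans are
    rendered in lowercase ("true" / "false").
    """
    pairs = [f"{name}={str(enabled).lower()}" for name, enabled in gates.items()]
    return "--feature-gates=" + ",".join(pairs)


# Disable both pod-resources features discussed in this diff.
print(feature_gates_flag({
    "KubeletPodResourcesGet": False,
    "KubeletPodResourcesDynamicResources": False,
}))
```

Generating the value this way avoids stray whitespace between gates when several are set at once.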
<!--
@@ -919,11 +938,13 @@ Here are some examples of device plugin implementations:
* [Akri](https://github.com/project-akri/akri), which lets you easily expose heterogeneous leaf devices (such as IP cameras and USB devices).
* [AMD GPU device plugin](https://github.com/ROCm/k8s-device-plugin)
* [Generic device plugin](https://github.com/squat/generic-device-plugin) for generic Linux devices and USB devices
- * [HAMi](https://github.com/Project-HAMi/HAMi) for heterogeneous AI computing virtualization middleware (for example, NVIDIA, Cambricon, Hygon, Iluvatar, MThreads, Ascend, Metax devices)
+ * [HAMi](https://github.com/Project-HAMi/HAMi) for heterogeneous AI computing virtualization middleware
+   (for example, NVIDIA, Cambricon, Hygon, Iluvatar, MThreads, Ascend, Metax devices)
* [Intel device plugins](https://github.com/intel/intel-device-plugins-for-kubernetes) support
  Intel GPU, FPGA, QAT, VPU, SGX, DSA, DLB and IAA devices
* [KubeVirt device plugins](https://github.com/kubevirt/kubernetes-device-plugins) for hardware-assisted virtualization
- * [NVIDIA GPU device plugin](https://github.com/NVIDIA/k8s-device-plugin), NVIDIA's official device plugin, which advertises NVIDIA GPUs and monitors GPU health.
+ * [NVIDIA GPU device plugin](https://github.com/NVIDIA/k8s-device-plugin), NVIDIA's official device plugin,
+   which advertises NVIDIA GPUs and monitors GPU health.
* [NVIDIA GPU device plugin for Container-Optimized OS](https://github.com/GoogleCloudPlatform/container-engine-accelerators/tree/master/cmd/nvidia_gpu)
* [RDMA device plugin](https://github.com/hustcat/k8s-rdma-device-plugin)
* [SocketCAN device plugin](https://github.com/collabora/k8s-socketcan)
@@ -941,8 +962,10 @@ Here are some examples of device plugin implementations:
* Learn about the [Topology Manager](/docs/tasks/administer-cluster/topology-manager/)
* Read about using [hardware acceleration for TLS ingress](/blog/2019/04/24/hardware-accelerated-ssl/tls-termination-in-ingress-controllers-using-kubernetes-device-plugins-and-runtimeclass/)
  with Kubernetes
+ * Read more about [Extended Resource allocation by DRA](/docs/concepts/scheduling-eviction/dynamic-resource-allocation/#extended-resource)
-->
* See [Schedule GPUs](/zh-cn/docs/tasks/manage-gpus/scheduling-gpus/) to learn about using device plugins
* Learn how to [advertise extended resources](/zh-cn/docs/tasks/administer-cluster/extended-resource-node/) on a node
* Learn about the [Topology Manager](/zh-cn/docs/tasks/administer-cluster/topology-manager/)
* Read about using [hardware acceleration for TLS ingress](/zh-cn/blog/2019/04/24/hardware-accelerated-ssl/tls-termination-in-ingress-controllers-using-kubernetes-device-plugins-and-runtimeclass/) with Kubernetes
+ * Read more about [Extended Resource allocation by DRA](/zh-cn/docs/concepts/scheduling-eviction/dynamic-resource-allocation/#extended-resource)