@@ -145,87 +145,7 @@ At the moment, that controller can add labels for:
If you are using AMD GPUs, you can deploy
[Node Labeller](https://github.com/RadeonOpenCompute/k8s-device-plugin/tree/master/cmd/k8s-node-labeller),
which is a {{< glossary_tooltip text="controller" term_id="controller" >}}
- that automatically labels your nodes with GPU device properties. At the moment, it can add labels for:
-
- * Device ID (-device-id)
- * VRAM Size (-vram)
- * Number of SIMDs (-simd-count)
- * Number of Compute Units (-cu-count)
- * Firmware and Feature Versions (-firmware)
- * GPU Family, as a two-letter acronym (-family)
-   * SI - Southern Islands
-   * CI - Sea Islands
-   * KV - Kaveri
-   * VI - Volcanic Islands
-   * CZ - Carrizo
-   * AI - Arctic Islands
-   * RV - Raven
-
- ```shell
- kubectl describe node cluster-node-23
- ```
-
- ```
- Name:               cluster-node-23
- Roles:              <none>
- Labels:             beta.amd.com/gpu.cu-count.64=1
-                     beta.amd.com/gpu.device-id.6860=1
-                     beta.amd.com/gpu.family.AI=1
-                     beta.amd.com/gpu.simd-count.256=1
-                     beta.amd.com/gpu.vram.16G=1
-                     kubernetes.io/arch=amd64
-                     kubernetes.io/os=linux
-                     kubernetes.io/hostname=cluster-node-23
- Annotations:        node.alpha.kubernetes.io/ttl: 0
- …
- ```
-
- With the Node Labeller in use, you can specify the GPU type in the Pod spec:
-
- ```yaml
- apiVersion: v1
- kind: Pod
- metadata:
-   name: cuda-vector-add
- spec:
-   restartPolicy: OnFailure
-   containers:
-     - name: cuda-vector-add
-       # https://github.com/kubernetes/kubernetes/blob/v1.7.11/test/images/nvidia-cuda/Dockerfile
-       image: "registry.k8s.io/cuda-vector-add:v0.1"
-       resources:
-         limits:
-           nvidia.com/gpu: 1
-   affinity:
-     nodeAffinity:
-       requiredDuringSchedulingIgnoredDuringExecution:
-         nodeSelectorTerms:
-           - matchExpressions:
-               - key: beta.amd.com/gpu.family.AI # Arctic Islands GPU family
-                 operator: Exists
- ```
-
- This ensures that the Pod will be scheduled to a node that has the GPU type
- you specified.
+ that automatically labels your nodes with GPU device properties.
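+ As a quick check (a minimal sketch; the node name and the
+ `beta.amd.com/gpu.family.AI` label key are taken from the `kubectl describe`
+ example shown earlier on this page), you can confirm which nodes the Node
+ Labeller has tagged:
+
+ ```shell
+ # List nodes carrying one of the AMD GPU labels (Arctic Islands family)
+ kubectl get nodes -l beta.amd.com/gpu.family.AI=1
+
+ # Show every label on a specific node
+ kubectl get node cluster-node-23 --show-labels
+ ```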

+ For NVIDIA GPUs, [GPU feature discovery](https://github.com/NVIDIA/gpu-feature-discovery/blob/main/README.md)
+ provides a similar feature.
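+ A minimal sketch of how such a label might be consumed, assuming GPU feature
+ discovery publishes a `nvidia.com/gpu.product` label on the node (the label
+ key and the example value below are assumptions; verify the actual label set
+ on your nodes with `kubectl get node <name> --show-labels`):
+
+ ```yaml
+ apiVersion: v1
+ kind: Pod
+ metadata:
+   name: gpu-example        # hypothetical Pod name, for illustration only
+ spec:
+   restartPolicy: OnFailure
+   containers:
+     - name: gpu-example
+       image: "registry.k8s.io/cuda-vector-add:v0.1"
+       resources:
+         limits:
+           nvidia.com/gpu: 1
+   nodeSelector:
+     # Assumed label published by GPU feature discovery for a V100 card;
+     # substitute the value actually reported on your own nodes.
+     nvidia.com/gpu.product: Tesla-V100-SXM2-16GB
+ ```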