Commit 4ee7668

[zh] sync scheduling-gpus.md
Signed-off-by: SSmallMonster <[email protected]>
1 parent e9bfdf6 commit 4ee7668

1 file changed

+3
-83
lines changed

content/zh-cn/docs/tasks/manage-gpus/scheduling-gpus.md

Lines changed: 3 additions & 83 deletions
@@ -145,87 +145,7 @@ At the moment, that controller can add labels for:
145145
If you are using AMD GPUs, you can deploy
146146
[Node Labeller](https://github.com/RadeonOpenCompute/k8s-device-plugin/tree/master/cmd/k8s-node-labeller),
147147
which is a {{< glossary_tooltip text="controller" term_id="controller" >}}
148-
that automatically labels your nodes with GPU device properties. The properties currently supported are:
149-
150-
<!--
151-
* Device ID (-device-id)
152-
* VRAM Size (-vram)
153-
* Number of SIMD (-simd-count)
154-
* Number of Compute Unit (-cu-count)
155-
* Firmware and Feature Versions (-firmware)
156-
* GPU Family, in two letters acronym (-family)
157-
* SI - Southern Islands
158-
* CI - Sea Islands
159-
* KV - Kaveri
160-
* VI - Volcanic Islands
161-
* CZ - Carrizo
162-
* AI - Arctic Islands
163-
* RV - Raven
164-
--->
165-
* Device ID (-device-id)
166-
* VRAM size (-vram)
167-
* Number of SIMD units (-simd-count)
168-
* Number of compute units (-cu-count)
169-
* Firmware and feature versions (-firmware)
170-
* GPU family, as a two-letter acronym (-family)
171-
* SI - Southern Islands
172-
* CI - Sea Islands
173-
* KV - Kaveri
174-
* VI - Volcanic Islands
175-
* CZ - Carrizo
176-
* AI - Arctic Islands
177-
* RV - Raven
178-
179-
```shell
180-
kubectl describe node cluster-node-23
181-
```
182-
183-
```
184-
Name: cluster-node-23
185-
Roles: <none>
186-
Labels: beta.amd.com/gpu.cu-count.64=1
187-
beta.amd.com/gpu.device-id.6860=1
188-
beta.amd.com/gpu.family.AI=1
189-
beta.amd.com/gpu.simd-count.256=1
190-
beta.amd.com/gpu.vram.16G=1
191-
kubernetes.io/arch=amd64
192-
kubernetes.io/os=linux
193-
kubernetes.io/hostname=cluster-node-23
194-
Annotations: node.alpha.kubernetes.io/ttl: 0
195-
196-
```
197-
198-
<!--
199-
With the Node Labeller in use, you can specify the GPU type in the Pod spec:
200-
-->
201-
With the Node Labeller in use, you can specify the GPU type in the Pod spec:
202-
203-
```yaml
204-
apiVersion: v1
205-
kind: Pod
206-
metadata:
207-
name: cuda-vector-add
208-
spec:
209-
restartPolicy: OnFailure
210-
containers:
211-
- name: cuda-vector-add
212-
# https://github.com/kubernetes/kubernetes/blob/v1.7.11/test/images/nvidia-cuda/Dockerfile
213-
image: "registry.k8s.io/cuda-vector-add:v0.1"
214-
resources:
215-
limits:
216-
nvidia.com/gpu: 1
217-
affinity:
218-
nodeAffinity:
219-
requiredDuringSchedulingIgnoredDuringExecution:
220-
nodeSelectorTerms:
221-
- matchExpressions:
222-
- key: beta.amd.com/gpu.family.AI # Arctic Islands GPU family
223-
operator: Exists
224-
```
225-
226-
<!--
227-
This ensures that the Pod will be scheduled to a node that has the GPU type
228-
you specified.
229-
-->
230-
This ensures that the Pod is scheduled onto a node that has the GPU type you specified.
148+
that automatically labels your nodes with GPU device properties.
231149

150+
For NVIDIA GPUs, [GPU feature discovery](https://github.com/NVIDIA/gpu-feature-discovery/blob/main/README.md)
151+
provides similar functionality.
