-> -`iluvatar.ai/vcuda-memory` for memory allocation
-> -`iluvatar.ai/vcuda-core` for core allocation
->
-> You can customize these names using the parameters above.
+**Note:** The currently supported GPU models and resource names are defined in [device-configmap.yaml](https://github.com/Project-HAMi/HAMi/blob/master/charts/hami/templates/scheduler/device-configmap.yaml):
+```yaml
+iluvatars:
+  - chipName: MR-V100
+    commonWord: MR-V100
+    resourceCountName: iluvatar.ai/MR-V100-vgpu
+    resourceMemoryName: iluvatar.ai/MR-V100.vMem
+    resourceCoreName: iluvatar.ai/MR-V100.vCore
+  - chipName: MR-V50
+    commonWord: MR-V50
+    resourceCountName: iluvatar.ai/MR-V50-vgpu
+    resourceMemoryName: iluvatar.ai/MR-V50.vMem
+    resourceCoreName: iluvatar.ai/MR-V50.vCore
+  - chipName: BI-V150
+    commonWord: BI-V150
+    resourceCountName: iluvatar.ai/BI-V150-vgpu
+    resourceMemoryName: iluvatar.ai/BI-V150.vMem
+    resourceCoreName: iluvatar.ai/BI-V150.vCore
+  - chipName: BI-V100
+    commonWord: BI-V100
+    resourceCountName: iluvatar.ai/BI-V100-vgpu
+    resourceMemoryName: iluvatar.ai/BI-V100.vMem
+    resourceCoreName: iluvatar.ai/BI-V100.vCore
+```

 ## Device Granularity

 HAMi divides each Iluvatar GPU into 100 units for resource allocation. When you request a portion of a GPU, you're actually requesting a certain number of these units.

 ### Memory Allocation

-- Each unit of `iluvatar.ai/vcuda-memory` represents 256MB of device memory
+- Each unit of `iluvatar.ai/<card-type>.vMem` represents 256MB of device memory
 - If you don't specify a memory request, the system will default to using 100% of the available memory
 - Memory allocation is enforced with hard limits to ensure tasks don't exceed their allocated memory

 ### Core Allocation

-- Each unit of `iluvatar.ai/vcuda-core` represents 1% of the available compute cores
+- Each unit of `iluvatar.ai/<card-type>.vCore` represents 1% of the available compute cores
 - Core allocation is enforced with hard limits to ensure tasks don't exceed their allocated cores
 - When requesting multiple GPUs, the system will automatically set the core resources based on the number of GPUs requested

 ## Running Iluvatar jobs

 Iluvatar GPUs can now be requested by a container
-using the `iluvatar.ai/vgpu`, `iluvatar.ai/vcuda-memory` and `iluvatar.ai/vcuda-core` resource type:
+using the `iluvatar.ai/BI-V150-vgpu`, `iluvatar.ai/BI-V150.vMem` and `iluvatar.ai/BI-V150.vCore` resource type:
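
A minimal sketch of such a request, assuming a BI-V150 card: the Pod name, container name, image, and command are hypothetical placeholders, and the values 64 and 50 are illustrative rather than recommended. Only the three resource names come from the text above.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: iluvatar-demo                        # hypothetical Pod name
spec:
  restartPolicy: Never
  containers:
    - name: demo                             # hypothetical container name
      image: example.com/corex-workload:latest   # placeholder image, not a real registry path
      command: ["sleep", "infinity"]
      resources:
        limits:
          iluvatar.ai/BI-V150-vgpu: 1        # one device, so vGPU slicing applies
          iluvatar.ai/BI-V150.vMem: 64       # 64 units x 256MB = 16384MB of device memory
          iluvatar.ai/BI-V150.vCore: 50      # 50 units x 1% = 50% of the compute cores
```

Only `limits` is given here because Kubernetes defaults the request of an extended resource to its limit when the request is omitted.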
 > **NOTE:** The device ID format is `{node-name}-iluvatar-{index}`. You can find the available device IDs in the node status.
-
 ### Finding Device UUIDs

 You can find the UUIDs of Iluvatar GPUs on a node using the following command:
@@ -126,7 +139,7 @@ kubectl get pod <pod-name> -o yaml | grep -A 10 "hami.io/<card-type>-devices-all
 Or by examining the node annotations:

 ```bash
-kubectl get node <node-name> -o yaml | grep -A 10 "hami.io/node-register-<card-type>"
+kubectl get node <node-name> -o yaml | grep -A 10 "hami.io/node-<card-type>-register"
 ```

 Look for annotations containing device information in the node status.
@@ -144,6 +157,6 @@ Look for annotations containing device information in the node status.

 2. Virtualization takes effect only for containers that request one GPU (i.e., `iluvatar.ai/vgpu=1`). When requesting multiple GPUs, the system will automatically set the core resources based on the number of GPUs requested.

-3. The `iluvatar.ai/vcuda-memory` resource is only effective when `iluvatar.ai/vgpu=1`.
+3. The `iluvatar.ai/<card-type>.vMem` resource is only effective when `iluvatar.ai/<card-type>-vgpu=1`.

-4. Multi-device requests (`iluvatar.ai/vgpu > 1`) do not support vGPU mode.
+4. Multi-device requests (`iluvatar.ai/<card-type>-vgpu > 1`) do not support vGPU mode.
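
To illustrate the last two notes, the fragment below is a sketch of a multi-device request, assuming BI-V150 cards and an illustrative count of 2. It leaves out `vMem`, which per note 3 only takes effect for single-device requests.

```yaml
resources:
  limits:
    # Two whole devices: vGPU mode does not apply to multi-device requests,
    # and the notes above say core resources are then set automatically.
    iluvatar.ai/BI-V150-vgpu: 2
```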