Commit 119be77

Merge pull request #10 from OguzPastirmaci/main
Update NCCL manifests
2 parents 714b02e + 7200cbc commit 119be77

2 files changed: 6 additions, 0 deletions
manifests/a100-nccl-test.yaml

Lines changed: 3 additions & 0 deletions
@@ -43,6 +43,9 @@ spec:
         name: wait-for-workers
       serviceAccount: mpi-worker-view
       terminationGracePeriodSeconds: 2
+      tolerations:
+      - key: nvidia.com/gpu
+        operator: Exists
       containers:
       - command:
         - /bin/bash

manifests/h100-nccl-test.yaml

Lines changed: 3 additions & 0 deletions
@@ -43,6 +43,9 @@ spec:
         name: wait-for-workers
       serviceAccount: mpi-worker-view
       terminationGracePeriodSeconds: 2
+      tolerations:
+      - key: nvidia.com/gpu
+        operator: Exists
       containers:
       - command:
         - /bin/bash
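
Note on the added toleration: both manifests gain a toleration with key `nvidia.com/gpu` and operator `Exists`, which should let the pod schedule onto nodes carrying a taint with that key, whatever the taint's value or effect. The following is a minimal Python sketch of the matching rule as described in the Kubernetes taints and tolerations documentation, not the scheduler's actual code. The function name `tolerates` and the sample taints are hypothetical, chosen only for illustration.

```python
def tolerates(toleration: dict, taint: dict) -> bool:
    """Simplified check of whether a pod toleration matches a node taint."""
    # A toleration with no effect matches taints of every effect.
    effect = toleration.get("effect")
    if effect and effect != taint.get("effect"):
        return False

    key = toleration.get("key")
    operator = toleration.get("operator", "Equal")  # "Equal" is the default operator

    if operator == "Exists":
        # "Exists" ignores the taint value; an empty key matches any taint key.
        return not key or key == taint.get("key")
    if operator == "Equal":
        return (key == taint.get("key")
                and toleration.get("value", "") == taint.get("value", ""))
    return False


# The toleration added by this commit.
added = {"key": "nvidia.com/gpu", "operator": "Exists"}

# Hypothetical node taints (assumed for illustration, not taken from the commit).
gpu_taint = {"key": "nvidia.com/gpu", "value": "present", "effect": "NoSchedule"}
other_taint = {"key": "dedicated", "value": "infra", "effect": "NoSchedule"}

print(tolerates(added, gpu_taint))    # True
print(tolerates(added, other_taint))  # False
```

Because the toleration sets neither `value` nor `effect`, it matches a `nvidia.com/gpu` taint regardless of the value or effect the cluster assigns to it.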
