Commit fd6e050

Update README.md
1 parent f118709 commit fd6e050

1 file changed: +4 -4 lines changed

README.md

Lines changed: 4 additions & 4 deletions
@@ -159,7 +159,7 @@ curl -s -o ./topo.xml https://raw.githubusercontent.com/oracle-quickstart/oci-hp
 kubectl create configmap nccl-topology --from-file ./topo.xml
 ```
 
-### Confirm that the GPUs are VFs are correctly exposed
+### Confirm that the GPUs are Virtual Functions (VFs) are correctly exposed
 Once the Network Operator pods are deployed, the GPU nodes with RDMA NICs will start reporting `nvidia.com/sriov_rdma_vf` as an available resource. You can request that resource in your pod manifests for assigning RDMA VFs to pods.
 
 By default, we create one Virtual Function per Physical Function. So for the H100 and A100 bare metal shapes, you will see 16 VFs per node exposed as a resource.
@@ -173,8 +173,8 @@ NODE GPUs RDMA-VFs
 10.79.156.205 8 16
 ```
 
-### Requesting the Virtual Functions in manifests
-Network Operator exposes the RDMA Virtual Functions (VFs) as allocatable resources. In order to use them, you need to add the following annotation to your manifests. The next step in this guide has an example for running the NCCL test, you can use that manifest as an example.
+### Requesting VFs in manifests
+Network Operator exposes the RDMA Virtual Functions (VFs) as allocatable resources. To use them, you need to add the following annotation to your manifests. The next step in this guide has an example for running the NCCL test, you can use that manifest as an example.
 
 ```yaml
 template:
@@ -272,4 +272,4 @@ Warning: Permanently added 'nccl-allreduce-job0-mpiworker-1.nccl-allreduce-job0'
 # Out of bounds values : 0 OK
 # Avg bus bandwidth : 66.4834
 #
-```
+```
