Commit cc56cd0: Add 24.9.2 (parent d9fb65a)
3 files changed: +264 additions, -2 deletions
README.md

Lines changed: 4 additions & 2 deletions

```diff
@@ -25,12 +25,14 @@ You can use the below image for both CPU and GPU pools.
 #### Image to import and use for the H100 and A100 nodes
 You can use the instructions [here](https://docs.oracle.com/en-us/iaas/Content/Compute/Tasks/imageimportexport.htm#Importing) for importing the below image to your tenancy.
 
-[Image to import](https://objectstorage.ca-toronto-1.oraclecloud.com/p/oXC6BcCkB0lXhycxV-0UuDqGGnVtFWfLOkwuJWA5WbsBDb4FkHwnsOHa_ElRcfL2/n/hpc_limited_availability/b/images/o/Ubuntu-22-OCA-OFED-23.10-2.1.3.1-GPU-535-CUDA-12.2-2024.03.15-0)
+[GPU driver v535 with CUDA 12.2](https://objectstorage.ca-toronto-1.oraclecloud.com/p/KOcEZeDpEAASLSKzumODnVr42mFwM_p9n1_Nra2FsV_F6BcpAkoH66HZxN4cCtIb/n/hpc_limited_availability/b/images/o/Ubuntu-22-OCA-OFED-23.10-2.1.3.1-GPU-535-CUDA-12.2-2024.09.18-0)
+
+[GPU driver v550 with CUDA 12.4](https://objectstorage.ca-toronto-1.oraclecloud.com/p/EDngSWYfn3HjrN0xbfBSVCctRVKVvNf3NOW7DdInKMtgiZwiUqy7PsA_xifmI1oq/n/hpc_limited_availability/b/images/o/Ubuntu-22-OCA-OFED-23.10-2.1.3.1-GPU-550-CUDA-12.4-2024.09.18-0)
 
 ### Deploy the cluster using the Oracle Cloud Resource Manager template
 You can easily deploy the cluster using the **Deploy to Oracle Cloud** button below.
 
-[![Deploy to Oracle Cloud](https://oci-resourcemanager-plugin.plugins.oci.oraclecloud.com/latest/deploy-to-oracle-cloud.svg)](https://cloud.oracle.com/resourcemanager/stacks/create?zipUrl=https://github.com/oracle-quickstart/oci-hpc-oke/releases/download/v24.9.1/oke-rdma-quickstart-v24.9.1.zip)
+[![Deploy to Oracle Cloud](https://oci-resourcemanager-plugin.plugins.oci.oraclecloud.com/latest/deploy-to-oracle-cloud.svg)](https://cloud.oracle.com/resourcemanager/stacks/create?zipUrl=https://github.com/oracle-quickstart/oci-hpc-oke/releases/download/v24.9.2/oke-rdma-quickstart-v24.9.2.zip)
 
 For the image ID, use the ID of the image that you imported in the previous step.
```
docs/running-ib-write-bw-test.md

Lines changed: 219 additions & 0 deletions (new file)
## Running `ib_write_bw` test using RDMA CM between two nodes in OKE

### 1 - Deploy the RDMA test pods
Apply the following manifest to deploy two test pods (`rdma-test-pod-1` & `rdma-test-pod-2`).

> [!IMPORTANT]
> The manifest below assumes that all of your RDMA-enabled nodes are in the same cluster network. If you have multiple cluster networks, adjust the `nodeSelectorTerms` accordingly.
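One way to target a single cluster network is to select on a node label instead of (or in addition to) the instance type. The label key and value below are purely illustrative, not built-in labels; you would apply them to the nodes of the target cluster network yourself:

```yaml
# Illustrative nodeAffinity fragment. "example.com/cluster-network" is a
# hypothetical label that you apply to the nodes yourself, e.g.:
#   kubectl label node <node-name> example.com/cluster-network=cn-1
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: example.com/cluster-network
              operator: In
              values:
                - cn-1
```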
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: rdma-test-pod-1
  labels:
    app: rdma-test-pods
spec:
  hostNetwork: true
  tolerations:
    - key: nvidia.com/gpu
      operator: Exists
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: node.kubernetes.io/instance-type
                operator: In
                values:
                  - BM.GPU.A100-v2.8
                  - BM.GPU.B4.8
                  - BM.GPU4.8
                  - BM.GPU.H100.8
                  - BM.Optimized3.36
                  - BM.HPC.E5.144
                  - BM.HPC2.36
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: kubernetes.io/hostname
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          app: rdma-test-pods
  dnsPolicy: ClusterFirstWithHostNet
  volumes:
    - { name: devinf, hostPath: { path: /dev/infiniband }}
    - { name: shm, emptyDir: { medium: Memory, sizeLimit: 32Gi }}
  restartPolicy: OnFailure
  containers:
    - image: oguzpastirmaci/mofed-perftest:5.4-3.6.8.1-ubuntu20.04-amd64
      name: mofed-test-ctr
      securityContext:
        privileged: true
        capabilities:
          add: [ "IPC_LOCK" ]
      volumeMounts:
        - { mountPath: /dev/infiniband, name: devinf }
        - { mountPath: /dev/shm, name: shm }
      resources:
        requests:
          cpu: 8
          ephemeral-storage: 32Gi
          memory: 2Gi
      command:
        - sh
        - -c
        - |
          ls -l /dev/infiniband /sys/class/net
          sleep 1000000
---
apiVersion: v1
kind: Pod
metadata:
  name: rdma-test-pod-2
  labels:
    app: rdma-test-pods
spec:
  hostNetwork: true
  tolerations:
    - key: nvidia.com/gpu
      operator: Exists
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: node.kubernetes.io/instance-type
                operator: In
                values:
                  - BM.GPU.A100-v2.8
                  - BM.GPU.B4.8
                  - BM.GPU4.8
                  - BM.GPU.H100.8
                  - BM.Optimized3.36
                  - BM.HPC.E5.144
                  - BM.HPC2.36
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: kubernetes.io/hostname
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          app: rdma-test-pods
  dnsPolicy: ClusterFirstWithHostNet
  volumes:
    - { name: devinf, hostPath: { path: /dev/infiniband }}
    - { name: shm, emptyDir: { medium: Memory, sizeLimit: 32Gi }}
  restartPolicy: OnFailure
  containers:
    - image: oguzpastirmaci/mofed-perftest:5.4-3.6.8.1-ubuntu20.04-amd64
      name: mofed-test-ctr
      securityContext:
        privileged: true
        capabilities:
          add: [ "IPC_LOCK" ]
      volumeMounts:
        - { mountPath: /dev/infiniband, name: devinf }
        - { mountPath: /dev/shm, name: shm }
      resources:
        requests:
          cpu: 8
          ephemeral-storage: 32Gi
          memory: 2Gi
      command:
        - sh
        - -c
        - |
          ls -l /dev/infiniband /sys/class/net
          sleep 1000000
```

Check that both pods are `Running`:

```
kubectl get pods

NAME              READY   STATUS    RESTARTS   AGE
rdma-test-pod-1   1/1     Running   0          64m
rdma-test-pod-2   1/1     Running   0          64m
```

### 2 - Exec into the test pods in separate terminals
Exec into the test pods (e.g. `kubectl exec -it rdma-test-pod-1 -- bash` in one terminal and `kubectl exec -it rdma-test-pod-2 -- bash` in another), and run the following commands to run a test with `ib_write_bw` using RDMA CM.

#### rdma-test-pod-1
You will use this pod as the server for `ib_write_bw`.

Run the following commands. They will print the IP to use in `rdma-test-pod-2` and start the `ib_write_bw` server. The flags match the parameters echoed in the test output: GID index 3 (`-x 3`), RDMA CM connection setup (`-R`), TOS 41 (`-T 41`), and four queue pairs (`-q 4`).

```
MLX_DEVICE_NAME=$(ibdev2netdev | grep rdma0 | awk '{print $1}')
RDMA0_IP=$(ip -f inet addr show rdma0 | sed -En -e 's/.*inet ([0-9.]+).*/\1/p')

echo -e "\nThe IP of rdma0 to use in rdma-test-pod-2 is: $RDMA0_IP\n"

ib_write_bw -F -x 3 --report_gbits -R -T 41 -q 4 -d $MLX_DEVICE_NAME
```

Example output:
```
The IP of rdma0 to use in rdma-test-pod-2 is: 10.224.5.57

$ ib_write_bw -F -x 3 --report_gbits -R -T 41 -q 4 -d $MLX_DEVICE_NAME

************************************
* Waiting for client to connect... *
************************************
```
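As a side note, the `MLX_DEVICE_NAME` pipeline works because `ibdev2netdev` prints one `device port N ==> interface (state)` mapping per line. A minimal sketch against a captured line (the device name here is made up) shows what the `grep`/`awk` stage extracts:

```shell
# Hypothetical single line of ibdev2netdev output; a real node prints one
# such mapping per RDMA device.
sample_line="mlx5_5 port 1 ==> rdma0 (Up)"

# Same pipeline as in the instructions: keep the rdma0 line, print field 1.
MLX_DEVICE_NAME=$(echo "$sample_line" | grep rdma0 | awk '{print $1}')
echo "$MLX_DEVICE_NAME"   # mlx5_5
```

On a node with multiple RDMA interfaces, grep for the specific interface (`rdma0`, `rdma1`, ...) you want to test.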

#### rdma-test-pod-2
You will use this pod as the client for `ib_write_bw`.

Run the following commands to start the test. Make sure you replace the value in the first command with the IP from the step above.

```
RDMA0_IP_OF_POD1=<ENTER THE IP FROM THE PREVIOUS STEP>

MLX_DEVICE_NAME=$(ibdev2netdev | grep rdma0 | awk '{print $1}')

ib_write_bw -F -x 3 --report_gbits -R -T 41 -q 4 -d $MLX_DEVICE_NAME $RDMA0_IP_OF_POD1
```

Example output:
```
$ RDMA0_IP_OF_POD1=10.224.5.57
$ MLX_DEVICE_NAME=$(ibdev2netdev | grep rdma0 | awk '{print $1}')

$ ib_write_bw -F -x 3 --report_gbits -R -T 41 -q 4 -d $MLX_DEVICE_NAME $RDMA0_IP_OF_POD1
---------------------------------------------------------------------------------------
                    RDMA_Write BW Test
 Dual-port       : OFF          Device         : mlx5_5
 Number of qps   : 4            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 TX depth        : 128
 CQ Moderation   : 100
 Mtu             : 4096[B]
 Link type       : Ethernet
 GID index       : 3
 Max inline data : 0[B]
 rdma_cm QPs     : ON
 Data ex. method : rdma_cm      TOS            : 41
---------------------------------------------------------------------------------------
 local address: LID 0000 QPN 0x0093 PSN 0xbf9bfe
 GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:224:04:233
 local address: LID 0000 QPN 0x0094 PSN 0xab0910
 GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:224:04:233
 local address: LID 0000 QPN 0x0095 PSN 0x28bd1a
 GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:224:04:233
 local address: LID 0000 QPN 0x0096 PSN 0x5c7f61
 GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:224:04:233
 remote address: LID 0000 QPN 0x0093 PSN 0x62655e
 GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:224:05:57
 remote address: LID 0000 QPN 0x0094 PSN 0x6706f0
 GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:224:05:57
 remote address: LID 0000 QPN 0x0095 PSN 0xcb157a
 GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:224:05:57
 remote address: LID 0000 QPN 0x0096 PSN 0x626041
 GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:224:05:57
---------------------------------------------------------------------------------------
 #bytes     #iterations    BW peak[Gb/sec]    BW average[Gb/sec]    MsgRate[Mpps]
 65536      20000          98.01              98.01                 0.186932
---------------------------------------------------------------------------------------
```
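The `GID` lines in the output are IPv4-mapped RoCE GIDs: bytes 11 and 12 are the `255:255` mapping prefix, and the last four bytes are the interface's IPv4 address (here the remote GIDs decode to `10.224.5.57`, the server IP printed in the previous step). A small sketch of the decoding, using a GID string copied from the output above:

```shell
# GID taken from the example output (remote address, i.e. the server side).
gid="00:00:00:00:00:00:00:00:00:00:255:255:10:224:05:57"

# Fields 13-16 of the colon-separated GID are the IPv4 octets.
ip=$(echo "$gid" | awk -F: '{printf "%d.%d.%d.%d", $13, $14, $15, $16}')
echo "$ip"   # 10.224.5.57
```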

files/oke-ubuntu-cloud-init.sh

Lines changed: 41 additions & 0 deletions (new file)
```bash
#!/bin/bash
set -x

distrib_codename=$(lsb_release -c -s)
kubernetes_version=$1
oke_package_version="${kubernetes_version:1}"
oke_package_repo_version="${oke_package_version:0:4}"
oke_package_name="oci-oke-node-all-$oke_package_version"
oke_package_repo="https://odx-oke.objectstorage.us-sanjose-1.oci.customer-oci.com/n/odx-oke/b/okn-repositories/o/prod/ubuntu-$distrib_codename/kubernetes-$oke_package_repo_version"

# Add OKE Ubuntu package repo
add-apt-repository -y "deb [trusted=yes] $oke_package_repo stable main"

# Wait for apt lock and install the package
while fuser /var/{lib/{dpkg,apt/lists},cache/apt/archives}/lock >/dev/null 2>&1; do
  sleep 1
done

apt-get -y update

apt-get -y install $oke_package_name

# TEMPORARY REQUIREMENT: Edit storage.conf to use the first NVMe drive (if it exists) for container images
cat <<EOF > /etc/containers/storage.conf
[storage]
# Default storage driver
driver = "overlay"
# Temporary storage location
runroot = "/var/run/containers/storage"
# Primary read/write location of container storage
graphroot = "/var/lib/oke-crio"
EOF

# TEMPORARY REQUIREMENT: Edit registries.conf to add unqualified registries
cat <<EOF > /etc/containers/registries.conf
unqualified-search-registries = ["container-registry.oracle.com", "docker.io"]
short-name-mode = "permissive"
EOF

# OKE bootstrap
oke bootstrap --manage-gpu-services
```
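The version handling at the top of the script is plain Bash substring expansion: `${kubernetes_version:1}` strips the leading `v`, and `${oke_package_version:0:4}` keeps the first four characters as the `major.minor` repo version (an assumption that holds for versions like `1.29.x`). A quick sketch with a made-up version argument:

```shell
# Hypothetical value of $1 as passed to the cloud-init script.
kubernetes_version="v1.29.1"

oke_package_version="${kubernetes_version:1}"         # drop leading "v" -> 1.29.1
oke_package_repo_version="${oke_package_version:0:4}" # major.minor     -> 1.29

echo "oci-oke-node-all-$oke_package_version"  # package name the script installs
echo "kubernetes-$oke_package_repo_version"   # repo path suffix
```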
