README.md
Lines changed: 3 additions & 3 deletions
@@ -6,6 +6,6 @@ Please visit OKE documentation page for more information: https://docs.oracle.co
This repository will focus on two workload types using GPUs: RDMA workloads using OCI's high performance network with support for RDMA (e.g. training jobs) and non-RDMA workloads that don't need to use the RDMA network (e.g. inference jobs).

-### [Running RDMA workloads on OKE](./docs/running-rdma-workloads-on-oke.md)
-
-### [Running non-RDMA workloads on OKE](./docs/running-non-rdma-workloads-on-oke.md)
docs/running-rdma-workloads-on-oke-a100.md
Lines changed: 13 additions & 19 deletions
@@ -61,18 +61,6 @@ NAME STATUS ROLES AGE VERSION
10.0.96.81 Ready node 2d23h v1.25.6
```
-### Deploy the OCI RDMA Health Check daemonset
-> [!IMPORTANT]
-> Deploying this daemonset is important.
-> When a new node joins to the OKE cluster, it will report itself as ready. However, the RDMA network configuration of the nodes usually takes longer than the node joining the cluster. The health check daemonset checks the status of the RDMA interfaces, and removes the `oci.oraclecloud.com/oci-rdma-health-check` that is being added via cloud init.
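
For readers of this diff, the gating behaviour described in the removed note can be sketched roughly as follows. This is an illustrative model only, not the daemonset's actual code: the `Node` class, the `rdma0`/`rdma1` interface names, and the helper functions are all hypothetical, and because the note does not say whether the `oci.oraclecloud.com/oci-rdma-health-check` key is applied as a label or a taint, the sketch treats it as an opaque marker.

```python
from dataclasses import dataclass, field

# Key quoted in the removed note. Whether it is a label or a taint is not
# stated there, so this sketch models it as an opaque marker on the node.
HEALTH_CHECK_KEY = "oci.oraclecloud.com/oci-rdma-health-check"


@dataclass
class Node:
    """Hypothetical, simplified stand-in for a cluster node."""
    name: str
    # Markers currently set on the node (cloud-init is assumed to add the key).
    markers: set = field(default_factory=set)
    # Hypothetical map: RDMA interface name -> True once it is configured.
    rdma_interfaces: dict = field(default_factory=dict)


def rdma_ready(node: Node) -> bool:
    """True only if the node has RDMA interfaces and all are configured."""
    return bool(node.rdma_interfaces) and all(node.rdma_interfaces.values())


def health_check_pass(node: Node) -> bool:
    """Remove the marker once RDMA is ready; return True if it was removed."""
    if HEALTH_CHECK_KEY in node.markers and rdma_ready(node):
        node.markers.discard(HEALTH_CHECK_KEY)
        return True
    return False


def schedulable_for_rdma(node: Node) -> bool:
    """In this model, RDMA workloads may land only after the marker is gone."""
    return HEALTH_CHECK_KEY not in node.markers
```

In this model, a node that has joined the cluster but still has an unconfigured interface keeps the marker after `health_check_pass(node)`, and `schedulable_for_rdma(node)` becomes true only after a later pass finds every interface configured.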