README.md
+1 −1 (1 addition, 1 deletion)
@@ -52,7 +52,7 @@ After you install OCI AI Blueprints to an OKE cluster in your tenancy, you can d
 |[**Multi-node Inference with RDMA and vLLM**](./docs/multi_node_inference)| Deploy Llama-405B sized LLMs across multiple nodes with RDMA using H100 nodes with vLLM and LeaderWorkerSet. |
 |[**Scaled Inference with vLLM**](./docs/auto_scaling)| Serve LLMs with auto-scaling using KEDA, which scales to multiple GPUs and nodes using application metrics like inference latency. |
 |[**LLM Inference with MIG**](./docs/mig_multi_instance_gpu)| Deploy LLMs to a fraction of a GPU with Nvidia’s multi-instance GPUs and serve them with vLLM. |
-|[**Health Check**](./docs/sample_blueprints/gpu-health-check)| Comprehensive evaluation of GPU performance to ensure optimal hardware readiness before initiating any intensive computational workload. |
+|[**Job Queuing**](./docs/sample_blueprints/teams)| Take advantage of job queuing and enforce resource quotas and fair sharing between teams. |
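
To illustrate the kind of per-team quota the added Job Queuing row describes, here is a minimal sketch using a plain Kubernetes ResourceQuota. It assumes quotas are enforced per team namespace; the namespace `team-a` and the limits are hypothetical, and the blueprint's actual configuration format is whatever is documented under ./docs/sample_blueprints/teams.

```yaml
# Illustrative only: caps GPU requests for a hypothetical "team-a" namespace.
# The Job Queuing blueprint may layer a queueing system on top of this; refer
# to docs/sample_blueprints/teams for the supported configuration.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-gpu-quota
  namespace: team-a
spec:
  hard:
    requests.nvidia.com/gpu: "8"   # at most 8 GPUs requested across team-a workloads
    pods: "50"                     # cap on concurrently running/queued pods
```

With one such quota per team namespace, jobs that exceed a team's share stay pending instead of starving other teams, which is the fair-sharing behavior the blueprint description refers to.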