* full clean up of repo
* fix all broken links
* change name of version folder
* fix links in multi node and rdma readmes
* changes to vllm blueprints
* autoscaling readme changes
* lora_finetuning changes
* cpu_inference
* existing oke cluster
* gpu-health-check
* mig
* model storage
* multinode inference
* shared node pools
* teams
* using rdma enabled node pools
* main README
* Add new blueprints for various AI workloads including autoscaling, CPU inference, GPU health checks, multi-node inference, and shared node pools. Introduced RDMA-enabled node pools for enhanced performance and resource management. Updated documentation for each blueprint to provide comprehensive usage instructions.
* Update documentation for AI blueprints: corrected links for LLM Inference and improved formatting in the features section.
* Update CPU Inference Blueprint documentation: simplified title and clarified purpose for better readability.
* Update sample blueprints documentation: changed 'recipe' to 'blueprint' for consistency and clarity in usage instructions.
**GETTING_STARTED_README.md** (4 additions, 4 deletions)
@@ -15,8 +15,8 @@ This guide helps you install and use **OCI AI Blueprints** for the first time. Y

## Step 1: Set Up Policies in Your Tenancy

- 1. If you are **not** a tenancy administrator, ask your admin to set up the required policies in the **root compartment**. These policies are listed [here](docs/iam_policies/README.md).
- 2. If you **are** a tenancy administrator, Resource Manager will typically deploy the minimal required policies automatically, but you can reference the same [IAM policies doc](docs/iam_policies/README.md) for advanced or custom configurations if needed.
+ 1. If you are **not** a tenancy administrator, ask your admin to set up the required policies in the **root compartment**. These policies are listed [here](docs/iam_policies.md).
+ 2. If you **are** a tenancy administrator, Resource Manager will typically deploy the minimal required policies automatically, but you can reference the same [IAM policies doc](docs/iam_policies.md) for advanced or custom configurations if needed.

---

@@ -70,7 +70,7 @@ Now that your cluster is ready, follow these steps to install OCI AI Blueprints

## Step 5: Access the AI Blueprints API

- 1. Follow the instruction to access the AI Blueprints API via web and/or CURL/Postman: [Ways to Access OCI AI Blueprints](./docs/api_documentation/accessing_oci_ai_blueprints/README.md#ways-to-access-oci-ai-blueprints)
+ 1. Follow the instruction to access the AI Blueprints API via web and/or CURL/Postman: [Ways to Access OCI AI Blueprints](docs/usage_guide.md)

---

@@ -95,5 +95,5 @@ Following this order ensures you do not have leftover services or dependencies i

## Need Help?

- - Check out [Known Issues & Solutions](docs/known_issues/README.md) for troubleshooting common problems.
+ - Check out [Known Issues & Solutions](docs/known_issues.md) for troubleshooting common problems.
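Step 5 above ("Access the AI Blueprints API") mentions CURL/Postman access. As a rough, illustrative sketch only: the API URL placeholder, the admin credentials, and the `deployment` route below come from the stack's Application information page and the steps later in this changeset, not from anything this diff defines.

```bash
# Illustrative only: query the Blueprints API from the command line.
# API_URL comes from the stack's "Application information" tab; -k tolerates
# the temporary self-signed certificate mentioned in the install steps.
API_URL="https://<your-blueprints-api-url>"   # placeholder
curl -k -u "$ADMIN_USERNAME:$ADMIN_PASSWORD" "$API_URL/deployment/"
```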
**INSTALLING_ONTO_EXISTING_CLUSTER_README.md** (34 additions, 21 deletions)
@@ -22,7 +22,7 @@ Rather than installing blueprints onto a new cluster, a user may want to leverag

## Step 1: Set Up Policies in Your Tenancy

- Some or all of these policies may be in place as required by OKE. Please review the required policies listed [here](docs/iam_policies/README.md) and add any required policies which are missing.
+ Some or all of these policies may be in place as required by OKE. Please review the required policies listed [here](docs/iam_policies.md) and add any required policies which are missing.

1. If you are **not** a tenancy administrator, ask your admin to add additional required policies in the **root compartment**.
2. If you **are** a tenancy administrator, you can either manually add the additional policies to an existing dynamic group, or let the resource manager deploy the required policies during stack creation.
@@ -45,6 +45,7 @@ Some or all of these policies may be in place as required by OKE. Please review
   - Under the section "OCI AI Blueprints IAM", click the checkbox to create the policies. (If you do not see this, ensure you've selected the correct choices for the questions above.)

   - Otherwise, create the policies if you are an admin, or have your admin create the policies.
4. Select "YES" for all other options.
5. Fill out additional fields for username and password, as well as Home Region.
6. Under "OKE Cluster & VCN", select the cluster name and vcn name you found in step 2.
@@ -64,8 +65,8 @@ Some or all of these policies may be in place as required by OKE. Please review
   ```
9. After you've added all the relevant tooling namespaces, apply the stack by hitting "Next", then click the "run apply" box.

## Step 4: Add Existing Nodes to Cluster (optional)

If you have existing node pools in your original OKE cluster that you'd like Blueprints to be able to use, follow these steps after the stack is finished:

1. Find the private IP address of the node you'd like to add.
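One convenient way to look up candidate private IPs, assuming you have `kubectl` access to the existing cluster (a generic Kubernetes command, not something this guide prescribes):

```bash
# The INTERNAL-IP column in this output is the node's private IP address
kubectl get nodes -o wide
```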
@@ -82,49 +83,55 @@ If you have existing node pools in your original OKE cluster that you'd like Blu
   - If you get a warning about security, sometimes it takes a bit for the certificates to get signed. This will go away once that process completes on the OKE side.
3. Login with the `Admin Username` and `Admin Password` in the Application information tab.
4. Click the link next to "deployment" which will take you to a page with "Deployment List", and a content box.
- 5. Paste in the sample blueprint json found [here](./docs/sample_blueprints/add_node_to_control_plane.json).
+ 5. Paste in the sample blueprint json found [here](docs/sample_blueprints/exisiting_cluster_installation/add_node_to_control_plane.json).
6. Modify the "recipe_node_name" field to the private IP address you found in step 1 above.
7. Click "POST". This is a fast operation.
8. Wait about 20 seconds and refresh the page. It should look like:

```json
[
  {
    "mode": "update",
    "recipe_id": null,
    "creation_date": "2025-03-28 11:12 AM UTC",
    "deployment_uuid": "750a________cc0bfd",
    "deployment_name": "startupaddnode",
    "deployment_status": "completed",
    "deployment_directive": "commission"
  }
]
```
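Steps 5-7 above use the portal's content box, but the same blueprint can also be submitted from the command line. A hedged sketch, where the API URL, credentials, and local file path are placeholders and only the `recipe_node_name` field is named by this guide:

```bash
# Illustrative sketch: POST the edited add-node blueprint to the same
# "deployment" endpoint the portal uses.
curl -k -u "$ADMIN_USERNAME:$ADMIN_PASSWORD" \
  -H "Content-Type: application/json" \
  -X POST "https://<your-blueprints-api-url>/deployment/" \
  -d @add_node_to_control_plane.json   # with "recipe_node_name" set to the node's private IP
```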
## Step 5: Deploy a sample recipe

2. Go to the stack and click "Application information". Click the API Url.
   - If you get a warning about security, sometimes it takes a bit for the certificates to get signed. This will go away once that process completes on the OKE side.
3. Login with the `Admin Username` and `Admin Password` in the Application information tab.
4. Click the link next to "deployment" which will take you to a page with "Deployment List", and a content box.
- 5. If you added a node from [Step 4](./INSTALLING_ONTO_EXISTING_CLUSTER_README.md#step-4-add-existing-nodes-to-cluster-optional), use the following shared node pool [blueprint](./docs/sample_blueprints/vllm_inference_sample_shared_pool_blueprint.json).
+ 5. If you added a node from [Step 4](./INSTALLING_ONTO_EXISTING_CLUSTER_README.md#step-4-add-existing-nodes-to-cluster-optional), use the following shared node pool [blueprint](./docs/sample_blueprints/shared_node_pools/vllm_inference_sample_shared_pool_blueprint.json).
   - Depending on the node shape, you will need to change:
     `"recipe_node_shape": "BM.GPU.A10.4"` to match your shape.
- 6. If you did not add a node, or just want to deploy a fresh node, use the following [blueprint](./docs/sample_blueprints/vllm_inference_sample_blueprint.json).
+ 6. If you did not add a node, or just want to deploy a fresh node, use the following [blueprint](docs/sample_blueprints/llm_inference_with_vllm/vllm-open-hf-model.json).
7. Paste the blueprint you selected into context box on the deployment page and click "POST"
8. To monitor the deployment, go back to "Api Root" and click "deployment_logs".
   - If you are deploying without a shared node pool, it can take 10-30 minutes to bring up a node, depending on shape and whether it is bare-metal or virtual.
   - If you are deploying with a shared node pool, the blueprint will deploy much more quickly.
   - It is common for a recipe to report "unhealthy" while it is deploying. This is caused by "Warnings" in the pod events when deploying to kubernetes. You only need to be alarmed when an "error" is reported.
9. Wait for the following steps to complete:
   - Affinity / selection of node -> Directive / commission -> Command / initializing -> Canonical / name assignment -> Service -> Deployment -> Ingress -> Monitor / nominal.
10. When you see the step "Monitor / nominal", you have an inference server running on your node.

## Step 6: Test your deployment

1. Upon completion of [Step 5](./INSTALLING_ONTO_EXISTING_CLUSTER_README.md#step-5-deploy-a-sample-recipe), test the deployment endpoint.
2. Go to Api Root, then click "deployment_digests". Find the "service_endpoint_domain" on this page.
   - This is <deployment-name>.<base-url>.nip.io for those who let us deploy the endpoint. If you use the default recipes above, an example of this would be:
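Once the `service_endpoint_domain` is known, a quick smoke test could look like the sketch below. The `/v1/completions` path and payload assume vLLM's OpenAI-compatible API and are not spelled out in this diff, so adjust them to whatever the blueprint actually serves:

```bash
# Hypothetical smoke test against the deployed vLLM endpoint
ENDPOINT="https://<deployment-name>.<base-url>.nip.io"   # from deployment_digests
curl -k "$ENDPOINT/v1/completions" \
  -H "Content-Type: application/json" \
  -d '{"model": "<model-name>", "prompt": "Hello", "max_tokens": 16}'
```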
Destroying the OCI AI Blueprints stack will not destroy any resources which were created or destroyed outside of the stack such as node pools or helm installs. Only things created by the stack will be destroyed for the stack. To destroy the stack:

1. Go to the console and navigate to Developer Services -> Resource Manager -> Stacks -> Your OCI AI Blueprints stack
2. Click "Destroy" at the top
## Multi-Instance GPU Setup

If you have the nvidia gpu operator already installed, and would like to reconfigure it because you plan on using Multi-Instance GPUs (MIG) with your H100 nodes, you will need to manually update / reconfigure your cluster with helm.

This can be done like below:

```bash
# Get the deployment name
helm list -n gpu-operator

NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
```
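The snippet above is cut off after the `helm list` output; the follow-up reconfiguration is typically a `helm upgrade` of the listed release. A hedged sketch, where the release name, chart reference, and the `mig.strategy` value are assumptions rather than something shown in this diff:

```bash
# Assumed follow-up: upgrade the existing gpu-operator release with a MIG
# strategy set, keeping all other chart values as they are.
helm upgrade <release-name-from-helm-list> nvidia/gpu-operator \
  -n gpu-operator \
  --reuse-values \
  --set mig.strategy=mixed   # or "single", depending on your MIG layout
```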
**README.md** (31 additions, 23 deletions)
@@ -1,58 +1,66 @@
# OCI AI Blueprints

**Deploy, scale, and monitor AI workloads with the OCI AI Blueprints platform, and reduce your GPU onboarding time from weeks to minutes.**

OCI AI Blueprints is a streamlined, no-code solution for deploying and managing Generative AI workloads on Kubernetes Engine (OKE). By providing opinionated hardware recommendations, pre-packaged software stacks, and out-of-the-box observability tooling, OCI AI Blueprints helps you get your AI applications running quickly and efficiently—without wrestling with the complexities of infrastructure decisions, software compatibility, and MLOps best practices.

[](./GETTING_STARTED_README.md)

## Table of Contents

**Getting Started**

-[Install AI Blueprints](./GETTING_STARTED_README.md)
- -[Access AI Blueprints Portal and API](./docs/api_documentation/accessing_oci_ai_blueprints/README.md)
+ -[Access AI Blueprints Portal and API](docs/usage_guide.md)

**About OCI AI Blueprints**
- -[What is OCI AI Blueprints?](./docs/about/README.md#what-is-oci-ai-blueprints)
- -[Why use OCI AI Blueprints?](./docs/about/README.md#why-use-oci-ai-blueprints)
Install OCI AI Blueprints by clicking on the button below:

[](./GETTING_STARTED_README.md)

## Blueprints

Blueprints go beyond basic Terraform templates. Each blueprint:
- |[**LLM & VLM Inference with vLLM**](./docs/sample_blueprints/vllm-inference)| Deploy Llama 2/3/3.1 7B/8B models using NVIDIA GPU shapes and the vLLM inference engine with auto-scaling. |
- |[**Fine-Tuning Benchmarking**](./docs/sample_blueprints/lora-benchmarking)| Run MLCommons quantized Llama-2 70B LoRA finetuning on A100 for performance benchmarking. |
- |[**LoRA Fine-Tuning**](./docs/sample_blueprints/lora-fine-tuning)| LoRA fine-tuning of custom or HuggingFace models using any dataset. Includes flexible hyperparameter tuning. |
- |[**Health Check**](./docs/sample_blueprints/gpu-health-check)| Comprehensive evaluation of GPU performance to ensure optimal hardware readiness before initiating any intensive computational workload.|
- |[**CPU Inference**](./docs/sample_blueprints/cpu-inference)| Leverage Ollama to test CPU-based inference with models like Mistral, Gemma, and more. |
- |[**Multi-node Inference with RDMA and vLLM**](./docs/multi_node_inference)| Deploy Llama-405B sized LLMs across multiple nodes with RDMA using H100 nodes with vLLM and LeaderWorkerSet.|
- |[**Scaled Inference with vLLM**](./docs/auto_scaling)| Serve LLMs with auto-scaling using KEDA, which scales to multiple GPUs and nodes using application metrics like inference latency.|
- |[**LLM Inference with MIG**](./docs/mig_multi_instance_gpu)| Deploy LLMs to a fraction of a GPU with Nvidia’s multi-instance GPUs and serve them with vLLM.|
- |[**Job Queuing**](./docs/sample_blueprints/teams)| Take advantage of job queuing and enforce resource quotas and fair sharing between teams. |
+ |[**LLM & VLM Inference with vLLM**](docs/sample_blueprints/llm_inference_with_vllm/README.md)| Deploy Llama 2/3/3.1 7B/8B models using NVIDIA GPU shapes and the vLLM inference engine with auto-scaling.|
+ |[**Fine-Tuning Benchmarking**](./docs/sample_blueprints/lora-benchmarking)| Run MLCommons quantized Llama-2 70B LoRA finetuning on A100 for performance benchmarking.|
+ |[**LoRA Fine-Tuning**](./docs/sample_blueprints/lora-fine-tuning)| LoRA fine-tuning of custom or HuggingFace models using any dataset. Includes flexible hyperparameter tuning.|
+ |[**Health Check**](./docs/sample_blueprints/gpu-health-check)| Comprehensive evaluation of GPU performance to ensure optimal hardware readiness before initiating any intensive computational workload.|
+ |[**CPU Inference**](./docs/sample_blueprints/cpu-inference)| Leverage Ollama to test CPU-based inference with models like Mistral, Gemma, and more.|
+ |[**Multi-node Inference with RDMA and vLLM**](./docs/sample_blueprints/multi-node-inference/)| Deploy Llama-405B sized LLMs across multiple nodes with RDMA using H100 nodes with vLLM and LeaderWorkerSet. |
+ |[**Autoscaling Inference with vLLM**](./docs/sample_blueprints/auto_scaling/)| Serve LLMs with auto-scaling using KEDA, which scales to multiple GPUs and nodes using application metrics like inference latency.|
+ |[**LLM Inference with MIG**](./docs/sample_blueprints/mig_multi_instance_gpu/)| Deploy LLMs to a fraction of a GPU with Nvidia’s multi-instance GPUs and serve them with vLLM. |
+ |[**Job Queuing**](./docs/sample_blueprints/teams)| Take advantage of job queuing and enforce resource quotas and fair sharing between teams.|