INSTALLING_ONTO_EXISTING_CLUSTER_README.md: 116 additions & 3 deletions
@@ -6,6 +6,8 @@ This guide helps you install and use **OCI AI Blueprints** for the first time on
 2. Retrieve the existing OKE cluster and VCN names from the console.
 3. Deploy the **OCI AI Blueprints** application onto the existing cluster.
 4. Learn how to add existing nodes in the cluster to be used by blueprints.
+5. Deploy a sample recipe to that node.
+6. Test your deployment, then undeploy it.

 ---

@@ -58,7 +60,118 @@ Some or all of these policies may be in place as required by OKE. Please review
 ## Step 4: Add Existing Nodes to Cluster (optional)
 If you have existing node pools in your original OKE cluster that you'd like Blueprints to be able to use, follow these steps after the stack is finished:

-1. Go to the stack and click "Application information". Click the API Url.
+1. Find the private IP address of the node you'd like to add.
+   - Console:
+     - Go to the OKE cluster in the console, as you did above.
+     - Click "Node pools".
+     - Click the pool that contains the node you want to add.
+     - Find the node's private IP address under "Nodes" on that page.
+   - Command line with `kubectl` (assumes cluster access is set up):
+     - Run `kubectl get nodes`.
+     - Run `kubectl describe node <nodename>` on each node until you find the node you want to add.
+     - The private IP appears under the `NAME` column of the `kubectl get nodes` output.
+2. Go to the stack and click "Application information". Click the API Url.
    - If you get a security warning, the certificates may not be signed yet. The warning goes away once signing completes on the OKE side.
-2. Login with the `Admin Username` and `Admin Password` in the Application information tab.
-3.
+3. Log in with the `Admin Username` and `Admin Password` from the "Application information" tab.
+4. Click the link next to "deployment", which takes you to a page with a "Deployment List" and a content box.
+5. Paste in the sample blueprint JSON found [here](./docs/sample_blueprints/add_node_to_control_plane.json).
+6. Change the "recipe_node_name" field to the private IP address you found in step 1 above.
+7. Click "POST". This is a fast operation.
+8. Wait about 20 seconds and refresh the page. It should look like:
+   ```json
+   [
+     {
+       "mode": "update",
+       "recipe_id": null,
+       "creation_date": "2025-03-28 11:12 AM UTC",
+       "deployment_uuid": "750a________cc0bfd",
+       "deployment_name": "startupaddnode",
+       "deployment_status": "completed",
+       "deployment_directive": "commission"
+     }
+   ]
+   ```
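Steps 6 and 8 above can be sketched in Python. This is a minimal sketch, not part of the Blueprints tooling: the helper names `set_node_name` and `commission_completed` are hypothetical, the IP address is a made-up example, and the blueprint dict is a placeholder that holds only the `recipe_node_name` field named in step 6 (the real sample file contains more fields). The status check assumes the response has the shape shown above.

```python
import ipaddress
import json


def set_node_name(blueprint: dict, private_ip: str) -> dict:
    """Return a copy of the blueprint with recipe_node_name set to private_ip."""
    # ip_address raises ValueError if the string is not a valid IP address.
    address = ipaddress.ip_address(private_ip)
    if not address.is_private:
        raise ValueError(f"{private_ip} is not a private IP address")
    updated = dict(blueprint)
    updated["recipe_node_name"] = private_ip
    return updated


def commission_completed(deployments: list, name: str) -> bool:
    """True if a 'commission' deployment with this name reports 'completed'."""
    return any(
        d.get("deployment_name") == name
        and d.get("deployment_directive") == "commission"
        and d.get("deployment_status") == "completed"
        for d in deployments
    )


# Placeholder blueprint and a made-up private IP, for illustration only.
blueprint = set_node_name({"recipe_node_name": "<private-ip>"}, "10.0.10.42")
print(json.dumps(blueprint, indent=2))

# Response body in the shape shown above, as copied from the deployment page.
response = json.loads("""
[
  {
    "mode": "update",
    "recipe_id": null,
    "deployment_name": "startupaddnode",
    "deployment_status": "completed",
    "deployment_directive": "commission"
  }
]
""")
print(commission_completed(response, "startupaddnode"))
```

Validating the address before pasting the blueprint catches the most likely mistake in step 6, which is entering a hostname or public IP instead of the node's private IP.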
+
+## Step 5: Deploy a sample recipe
+1. Go to the stack and click "Application information". Click the API Url.
+   - If you get a security warning, the certificates may not be signed yet. The warning goes away once signing completes on the OKE side.
+2. Log in with the `Admin Username` and `Admin Password` from the "Application information" tab.
+3. Click the link next to "deployment", which takes you to a page with a "Deployment List" and a content box.
+4. If you added a node in [Step 4](./INSTALLING_ONTO_EXISTING_CLUSTER_README.md#step-4-add-existing-nodes-to-cluster-optional), use the following shared node pool [blueprint](./docs/sample_blueprints/vllm_inference_sample_shared_pool_blueprint.json).
+   - Depending on the node shape, you will need to change:
+     `"recipe_node_shape": "BM.GPU.A10.4"` to match your shape.
+5. If you did not add a node, or just want to deploy a fresh node, use the following [blueprint](./docs/sample_blueprints/vllm_inference_sample_blueprint.json).
+6. Paste the blueprint you selected into the content box on the deployment page and click "POST".
+7. To monitor the deployment, go back to "Api Root" and click "deployment_logs".
+   - If you are deploying without a shared node pool, it can take 10-30 minutes to bring up a node, depending on the shape and whether it is bare-metal or virtual.
+   - If you are deploying with a shared node pool, the blueprint deploys much more quickly.
+   - It is common for a recipe to report "unhealthy" while it is deploying. This is caused by "Warnings" in the pod events when deploying to Kubernetes. You only need to be concerned when an "error" is reported.
+8. Wait for the following steps to complete:
+   - Affinity / selection of node -> Directive / commission -> Command / initializing -> Canonical / name assignment -> Service -> Deployment -> Ingress -> Monitor / nominal.
+9. When you see the step "Monitor / nominal", you have an inference server running on your node.
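The step sequence above can be tracked with a small helper. This is a sketch only: `STAGES` copies the step names from the list above, while `next_stage`, `is_ready`, and the idea of feeding them step names read off the "deployment_logs" page are assumptions, since the log format is not specified here.

```python
# Ordered stages, copied from the list above.
STAGES = [
    "Affinity / selection of node",
    "Directive / commission",
    "Command / initializing",
    "Canonical / name assignment",
    "Service",
    "Deployment",
    "Ingress",
    "Monitor / nominal",
]


def next_stage(seen):
    """Return the first stage not yet seen, or None once all have appeared."""
    seen_set = set(seen)
    for stage in STAGES:
        if stage not in seen_set:
            return stage
    return None


def is_ready(seen):
    """The inference server is up once 'Monitor / nominal' has been reported."""
    return STAGES[-1] in set(seen)


# Hypothetical step names read from the log page partway through a deployment.
seen_so_far = STAGES[:3]
print(next_stage(seen_so_far))  # Canonical / name assignment
print(is_ready(seen_so_far))    # False
```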
+
+## Step 6: Test your deployment
+1. Upon completion of [Step 5](./INSTALLING_ONTO_EXISTING_CLUSTER_README.md#step-5-deploy-a-sample-recipe), test the deployment endpoint.
+2. Go to Api Root, then click "deployment_digests". Find the "service_endpoint_domain" on this page.
+   - This is `<deployment-name>.<base-url>.nip.io` for those who let us deploy the endpoint. If you use the default recipes above, an example of this would be:
[diff lines 117-145 not shown]
+           "content": "I'm doing well, thank you for asking! I'm a helpful assistant, so I'm always ready to assist you with any questions or tasks you may have. How about you? How's your day going so far?",
+           "tool_calls": []
+         },
+         "logprobs": null,
+         "finish_reason": "stop",
+         "stop_reason": null
+       }
+     ],
+     "usage": {
+       "prompt_tokens": 27,
+       "total_tokens": 73,
+       "completion_tokens": 46,
+       "prompt_tokens_details": null
+     },
+     "prompt_logprobs": null
+   }
+   ```
+5. When completed, undeploy the recipe:
+   - Go to Api Root -> deployment.
+   - Copy the whole `deployment_uuid` field for your deployment, for example:
+     - "deployment_uuid": "asdfjklafjdskl"
+   - Go to Api Root -> undeploy.
+   - Paste the `deployment_uuid` field into the content box and wrap it in curly braces {}:
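Two small pieces of this step can be sketched in Python. This is an illustrative sketch under stated assumptions: the function names are hypothetical, the deployment name and base URL in the usage lines are made-up placeholders, and the payload shape follows the instruction above to wrap the `deployment_uuid` field in curly braces.

```python
import json


def service_endpoint_domain(deployment_name: str, base_url: str) -> str:
    """Build <deployment-name>.<base-url>.nip.io as described in step 2."""
    return f"{deployment_name}.{base_url}.nip.io"


def undeploy_payload(deployment_uuid: str) -> str:
    """Wrap the deployment_uuid field in curly braces for the undeploy page."""
    if not deployment_uuid:
        raise ValueError("deployment_uuid must not be empty")
    return json.dumps({"deployment_uuid": deployment_uuid})


# Placeholder values for illustration only; use the values from your own stack.
print(service_endpoint_domain("my-deployment", "203-0-113-10"))
print(undeploy_payload("asdfjklafjdskl"))  # {"deployment_uuid": "asdfjklafjdskl"}
```

Building the payload with `json.dumps` rather than by hand avoids quoting mistakes when the UUID is pasted into the content box.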
README.md: 1 addition & 0 deletions
@@ -16,6 +16,7 @@ Looking to install and use OCI AI Blueprints right away? **[Click here](./GETTIN

 We recommend following the Getting Started guide if this is your first time.

+If you are looking to install OCI AI Blueprints onto an existing OKE cluster that already has running workloads and node pools, visit [this doc](./INSTALLING_ONTO_EXISTING_CLUSTER_README.md).