# Deploy NVIDIA NeMo microservices on Oracle Kubernetes Engine (OKE)

**Summary:** This tutorial walks you through the steps required to deploy and configure [NVIDIA NeMo Microservices](https://www.nvidia.com/en-us/ai-data-science/products/nemo/) on OCI. The deployment uses OKE (managed Kubernetes) and Oracle Database 23ai as both the structured data and vector data store.

<u>Requirements</u>

* An [NVIDIA NGC account](https://org.ngc.nvidia.com/setup/personal-keys) where you can provision an API key.
* An Oracle Cloud Infrastructure (OCI) paid account with access to GPU shapes. NVIDIA A10 will be sufficient.
* A general understanding of Python and Jupyter notebooks.

## Task 1: Collect and configure prerequisites

1. Generate an NGC API Key via the NVIDIA portal.

    

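    >Note: You will need this API key later when pulling NeMo microservice images and Helm charts from the NVIDIA registry (nvcr.io). The snippet below is an optional, minimal sketch of how you might keep the key handy and pre-create an image pull secret once `kubectl` access is configured (step 13); the variable and secret names (`NGC_API_KEY`, `ngc-registry-secret`) are conventions used here, not values defined by NVIDIA.

    ```bash
    <copy>
    # Keep the key in an environment variable for later steps (variable name is arbitrary)
    export NGC_API_KEY="<your NGC API key>"

    # Optional: create an image pull secret for nvcr.io; NGC expects the literal username $oauthtoken
    kubectl create secret docker-registry ngc-registry-secret \
      --docker-server=nvcr.io \
      --docker-username='$oauthtoken' \
      --docker-password="${NGC_API_KEY}"
    </copy>
    ```
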
2. Log into your [Oracle Cloud](https://cloud.oracle.com) account.

3. Using the menu in the top left corner, navigate to **`Developer Services`** -> **`Kubernetes Clusters (OKE)`**.

4. Click **`[Create cluster]`** and choose the **Quick create** option. Click **`[Submit]`**.

    

5. Provide the following configuration details for your cluster:

    * Name
    * Kubernetes Endpoint: Public endpoint
    * Node type: Managed
    * Kubernetes worker nodes: Private workers
    * Shape: VM.Standard.E3.Flex (or E4/E5, depending on your available capacity)
    * Select the number of OCPUs: 2 or more
    * Node count: 1

    >Note: After the cluster is online, we'll provision a second node pool with GPU shapes. The E3/E4/E5 flex shapes will be used for cluster operations and the Oracle Database 23ai deployment.

6. Click **`[Next]`**, validate the settings, then click **`[Create cluster]`**.

    >Note: The cluster creation process will take around 15 minutes.

7. Once the cluster is **Active**, click the cluster name to view its details. Use the navigation menu in the left pane to locate and click **Node pools**.

8. You should see **pool1**, which was automatically provisioned with the cluster. Click **`[Add node pool]`**.

9. Provide the following configuration parameters:

    * Name
    * Node Placement Configuration:
        * Availability domain: select at least 1
        * Worker node subnet: select the *node* subnet
    * Node shape: an NVIDIA GPU shape. VM.GPU.A10.1 will work.
    * Node count: 3
    * Click **Specify a custom boot volume size** and change the value to 250 (GB).
    * Click the very last **Show advanced options**, found just above the **`[Add]`** button. Under **Initialization script**, choose **Paste Cloud-Init Script** and enter the following:

    ```bash
    <copy>
    #!/bin/bash
    # Run the standard OKE worker node bootstrap script, fetched from the instance metadata service
    curl --fail -H "Authorization: Bearer Oracle" -L0 http://169.254.169.254/opc/v2/instance/metadata/oke_init_script | base64 --decode >/var/run/oke-init.sh
    bash /var/run/oke-init.sh
    # Grow the root filesystem to use the full custom boot volume, then restart the kubelet to pick up the change
    bash /usr/libexec/oci-growfs -y
    systemctl restart kubelet.service
    </copy>
    ```

    >Note: This deployment requires 3 GPUs to function properly. You can either deploy 3 separate single-GPU nodes, or a single node with 4+ GPUs.

10. Click **`[Add]`** to create the new node pool.

11. While that is creating, return to the **Cluster details** page and click the **`[Access Cluster]`** button at the top of the page.

12. In the dialog that opens, click the button to **`[Launch Cloud Shell]`**, then copy the command found in step 2 of the dialog. When Cloud Shell becomes available, paste and run the command.

    

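    For reference, the kubeconfig command from the dialog generally has the shape shown below. Your cluster OCID, region, and endpoint flag will differ, so treat this as an illustration only and copy the exact command from the dialog.

    ```bash
    oci ce cluster create-kubeconfig --cluster-id <your-cluster-ocid> --file $HOME/.kube/config --region <your-region> --token-version 2.0.0 --kube-endpoint PUBLIC_ENDPOINT
    ```
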
13. The command you just executed creates your kubeconfig file. To test it, run the following:

    ```bash
    <copy>
    kubectl cluster-info
    kubectl get nodes -o wide
    </copy>
    ```

    >Note: The GPU nodes may still be provisioning and might not show up just yet. Each node's name is its private IP address.

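    Once the GPU nodes have joined the cluster, you can optionally confirm that they advertise their GPUs to Kubernetes. OKE's GPU worker images include the NVIDIA device plugin, which should expose the `nvidia.com/gpu` resource on each GPU node:

    ```bash
    <copy>
    # List node names alongside any advertised nvidia.com/gpu capacity
    kubectl describe nodes | grep -E 'Name:|nvidia.com/gpu'
    </copy>
    ```
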
14. Finally, on the **Cluster details** page, locate the **Add-ons** link and click it. Click **`[Manage add-ons]`** and enable the following:

    * Certificate Manager
    * Database Operator
    * Metrics Server

    >Note: Enable them one at a time by clicking the box, checking the **Enable** option, and saving the changes.

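    Before moving on, you can optionally check that the add-ons have rolled out. The exact pod names and namespaces depend on the add-on versions OKE deploys, so treat this as a rough sanity check rather than an authoritative command:

    ```bash
    <copy>
    # Pod and namespace names may vary between add-on versions
    kubectl get pods -A | grep -Ei 'cert-manager|metrics-server|database-operator'
    </copy>
    ```
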
## Task 2: Install JupyterHub

1. Return to Cloud Shell. Create a new file called **jh-values.yaml** and paste the following:

    ```
    <copy>
    # default configuration
    singleuser:
      cloudMetadata:
        blockWithIptables: false
      # optional: if you want to spawn GPU-based user notebooks, remove the comment character from the following lines.
      #profileList:
      #  - display_name: "GPU Server"
      #    description: "Spawns a notebook server with access to a GPU"
      #    kubespawner_override:
      #      extra_resource_limits:
      #        nvidia.com/gpu: "1"
    </copy>
    ```

    >Note: In this tutorial we use Jupyter notebooks to interact with the GPU-driven NVIDIA microservices. You will not need to enable GPU-based user notebooks to complete the tasks herein.

2. Add the Helm repo.

    ```bash
    <copy>
    helm repo add jupyterhub https://hub.jupyter.org/helm-chart/ && helm repo update
    </copy>
    ```

3. Perform the install using Helm, and reference the values file created in step 1.

    ```bash
    <copy>
    helm upgrade --cleanup-on-fail --install jupyter-hub jupyterhub/jupyterhub --namespace k8s-jupyter --create-namespace --values jh-values.yaml
    </copy>
    ```
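
    The install can take a few minutes while images are pulled. If you want to watch the hub and proxy pods come up before moving on, something like the following works (the `k8s-jupyter` namespace matches the Helm command above):

    ```bash
    <copy>
    # Watch the JupyterHub pods until they reach the Running state (Ctrl+C to stop)
    kubectl get pods -n k8s-jupyter --watch
    </copy>
    ```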

4. Once the deployment is complete, the Kubernetes service that gets created will provision an OCI Load Balancer for public access. Locate the public IP address of the load balancer and store it for later.

    ```bash
    <copy>
    kubectl get svc -n k8s-jupyter
    </copy>
    ```

    Output:

    ```bash
    NAMESPACE     NAME           TYPE           CLUSTER-IP    EXTERNAL-IP    PORT(S)
    k8s-jupyter   proxy-public   LoadBalancer   10.96.177.9   129.213.1.77   80:30141/TCP
    ```

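    If you prefer to grab just the IP address, a jsonpath query like the one below avoids copying it by hand. The `proxy-public` service name comes from the JupyterHub chart:

    ```bash
    <copy>
    # Print only the load balancer's public IP (may be empty for a minute or two while the load balancer provisions)
    kubectl get svc proxy-public -n k8s-jupyter -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
    </copy>
    ```
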
5. Browse to the load balancer's public IP address to access the JupyterHub UI. The first time you access it, you will be prompted for a username and password. Specify values of your choosing, but make sure you save them for future use. After logging in, you'll need to click the button to start the server. The startup process will take 5-7 minutes.

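    If the UI does not come up right away, a quick check from Cloud Shell can confirm the proxy is reachable; substitute your load balancer's public IP address. A healthy deployment should answer with an HTTP redirect to the hub login page:

    ```bash
    <copy>
    curl -sI http://<EXTERNAL-IP>/ | head -n 5
    </copy>
    ```
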
## Task 3: Deploy the Oracle Database 23ai pod

1.