
Commit db0175a (1 parent: d3e2819)

made multiple fixes to the AI PODs workshop

File tree

14 files changed: +64 −20 lines


content/en/ninja-workshops/14-cisco-ai-pods/6-deploy-llm.md

Lines changed: 27 additions & 6 deletions
Large diffs are not rendered by default.

content/en/ninja-workshops/14-cisco-ai-pods/8-deploy-vector-db.md

Lines changed: 9 additions & 1 deletion
@@ -61,6 +61,14 @@ Let's create a new namespace:
 oc create namespace weaviate
 ```
 
+Run the following command to allow Weaviate to run a privileged container:
+
+> Note: this approach is not recommended for production
+
+``` bash
+oc adm policy add-scc-to-user privileged -z default -n weaviate
+```
+
 Then deploy Weaviate:
 
 ``` bash
@@ -196,7 +204,7 @@ A job is used rather than a pod to ensure that this process runs only once:
 
 ``` bash
 oc create namespace llm-app
-oc apply -f k8s-job.yaml
+oc apply -f ./load-embeddings/k8s-job.yaml
 ```
 
 > Note: to build a Docker image for the Python application that loads the embeddings
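The job above populates Weaviate with embeddings exactly once before the app is deployed. As an illustration only (the real job uses the NVIDIA embeddings NIM and the Weaviate client; the functions below are toy stand-ins), a sketch of the chunk → embed → store flow it performs:

```python
import hashlib

def toy_embed(text: str) -> list[float]:
    # Stand-in for the NVIDIA embeddings model: derive a small
    # deterministic vector from a hash of the text (illustration only).
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:4]]

def load_embeddings(chunks: list[str]) -> list[dict]:
    # Stand-in for writing objects into Weaviate: each record keeps the
    # raw text (under the "text" key the app later reads) plus its vector.
    return [{"text": c, "vector": toy_embed(c)} for c in chunks]

store = load_embeddings(["sample document chunk"])
print(len(store), len(store[0]["vector"]))  # 1 4
```

Because the store is populated out-of-band, running the loader twice would duplicate objects, which is the reason the workshop uses a run-once Job rather than a Pod.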

content/en/ninja-workshops/14-cisco-ai-pods/9-deploy-llm-app.md

Lines changed: 2 additions & 2 deletions
@@ -43,7 +43,7 @@ Then run the following command to send a question to the LLM:
 
 ``` bash
 curl -X "POST" \
-  'http://llm-app.llm-app:8080/askquestion"' \
+  'http://llm-app.llm-app.svc.cluster.local:8080/askquestion' \
   -H 'Accept: application/json' \
   -H 'Content-Type: application/json' \
   -d '{
@@ -55,7 +55,7 @@ curl -X "POST" \
 {{% tab title="Example Output" %}}
 
 ``` bash
-TBD
+The NVIDIA H200 graphics card has 5536 MB of GDDR6 memory.
 ```
 
 {{% /tab %}}
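The corrected curl command targets the service's fully qualified cluster DNS name (`<service>.<namespace>.svc.cluster.local`). A minimal Python sketch of the same request; note the JSON body is truncated in the diff, so the `question` field name is an assumption, and actually sending the request only works from inside the cluster:

```python
import json
import urllib.request

URL = "http://llm-app.llm-app.svc.cluster.local:8080/askquestion"
payload = {"question": "How much memory does the NVIDIA H200 have?"}  # field name assumed

req = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Accept": "application/json", "Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req) would send it; the hostname only
# resolves inside the cluster, so we just inspect the request here.
print(req.get_method(), req.full_url)
```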

workshop/cisco-ai-pods/llm-app/app.py

Lines changed: 6 additions & 3 deletions
@@ -3,6 +3,7 @@
 import openlit
 
 from flask import Flask, request
+from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings
 from langchain_nvidia_ai_endpoints import ChatNVIDIA
 from langchain_core.prompts import ChatPromptTemplate
 from langchain_core.runnables import RunnablePassthrough
@@ -29,7 +30,7 @@
         "You are a helpful and friendly AI!"
         "Your responses should be concise and no longer than two sentences."
         "Do not hallucinate. Say you don't know if you don't have this information."
-        # "Answer the question using only the context"
+        "Answer the question using only the context"
         "\n\nQuestion: {question}\n\nContext: {context}"
     ),
     ("user", "{question}")
@@ -52,9 +53,11 @@ def ask_question():
     )
 
     # connect with the vector store that was populated earlier
-    vector_store = Weaviate(
+    vector_store = WeaviateVectorStore(
         client=weaviate_client,
-        embedding=embeddings_model
+        embedding=embeddings_model,
+        index_name=None,
+        text_key="text"
     )
 
     chain = (
workshop/cisco-ai-pods/llm-app/k8s-manifest.yaml

Lines changed: 3 additions & 3 deletions
@@ -46,11 +46,11 @@ spec:
         - name: EMBEDDINGS_MODEL_URL
           value: "http://llama-32-nv-embedqa-1b-v2.nim-service:8000/v1"
         - name: WEAVIATE_HTTP_HOST
-          value: "weaviate.weaviate.svc.cluster.local"
+          value: "weaviate-headless.weaviate.svc.cluster.local"
         - name: WEAVIATE_HTTP_PORT
-          value: "80"
+          value: "8080"
         - name: WEAVIATE_GRPC_HOST
-          value: "weaviate.weaviate.svc.cluster.local"
+          value: "weaviate-headless.weaviate.svc.cluster.local"
         - name: WEAVIATE_GRPC_PORT
           value: "50051"
         resources: {}

workshop/cisco-ai-pods/load-embeddings/k8s-job.yaml

Lines changed: 3 additions & 3 deletions
@@ -17,11 +17,11 @@ spec:
         - name: EMBEDDINGS_MODEL_URL
           value: "http://llama-32-nv-embedqa-1b-v2.nim-service:8000/v1"
         - name: WEAVIATE_HTTP_HOST
-          value: "weaviate.weaviate.svc.cluster.local"
+          value: "weaviate-headless.weaviate.svc.cluster.local"
         - name: WEAVIATE_HTTP_PORT
-          value: "80"
+          value: "8080"
         - name: WEAVIATE_GRPC_HOST
-          value: "weaviate.weaviate.svc.cluster.local"
+          value: "weaviate-headless.weaviate.svc.cluster.local"
         - name: WEAVIATE_GRPC_PORT
           value: "50051"
       restartPolicy: Never # Ensure the job only runs once
Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 #!/bin/bash
 
-oc get csv -n nvidia-gpu-operator gpu-operator-certified.v25.3.0 -ojsonpath={.metadata.annotations.alm-examples} | jq .[0] > clusterpolicy.json
+oc get csv -n nvidia-gpu-operator gpu-operator-certified.v25.3.4 -ojsonpath={.metadata.annotations.alm-examples} | jq .[0] > clusterpolicy.json
 oc apply -f clusterpolicy.json
 
workshop/cisco-ai-pods/nvidia/create-nfd-cr.sh

File mode changed: 100644 → 100755.

workshop/cisco-ai-pods/nvidia/install-nfd-operator.sh

File mode changed: 100644 → 100755.

workshop/cisco-ai-pods/nvidia/install-nim-operator.sh

File mode changed: 100644 → 100755.
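The three mode changes above flip the scripts from 100644 to 100755. In git, the trailing three octal digits are ordinary Unix permission bits, so 755 adds the execute bits that let the scripts be run directly (`./install-nfd-operator.sh`). A quick sketch using the stat module:

```python
import stat

def is_executable(mode: int) -> bool:
    # Keep the permission portion of the git file mode and
    # check the owner-execute bit.
    return bool(stat.S_IMODE(mode) & stat.S_IXUSR)

print(is_executable(0o100644), is_executable(0o100755))  # False True
```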
