Merge pull request #1552 from JoeStech/rag-axion-lp-edits

pareenaverma · web-flow · commit 0bb626ae71fe · 2025-01-22T16:20:37.000-05:00
For the Axion RAG LP, change Graviton references to Axion and show how to set up network rules
diff --git a/content/learning-paths/servers-and-cloud-computing/rag/_index.md b/content/learning-paths/servers-and-cloud-computing/rag/_index.md
@@ -1,5 +1,5 @@
 ---
-title: Deploy a RAG-based Chatbot with llama-cpp-python using KleidiAI on Arm Servers
+title: Deploy a RAG-based Chatbot with llama-cpp-python using KleidiAI on Google Axion processors
 
 minutes_to_complete: 45
 
@@ -13,6 +13,7 @@ learning_objectives:
     - Monitor and analyze inference performance metrics.
 
 prerequisites:
+    - A Google Cloud Axion (or other Arm) compute instance with at least 16 cores, 8GB of RAM, and 32GB disk space.
     - Basic understanding of Python and ML concepts.
     - Familiarity with REST APIs and web services.
     - Basic knowledge of vector databases.
@@ -34,6 +35,7 @@ operatingsystems:
 tools_software_languages:
     - Python
     - Streamlit
+    - Google Axion
 
 ### FIXED, DO NOT MODIFY
 # ================================================================================
diff --git a/content/learning-paths/servers-and-cloud-computing/rag/chatbot.md b/content/learning-paths/servers-and-cloud-computing/rag/chatbot.md
@@ -7,16 +7,28 @@ layout: learningpathall
 
 ## Access the Web Application
 
-Open the web application in your browser using either the local URL or the external URL:
+Open the web application in your browser using the external URL:
 
 ```bash
-http://localhost:8501 or http://75.101.253.177:8501
+http://[your instance ip]:8501
 ```
 
 {{% notice Note %}}
 
 To access the links you may need to allow inbound TCP traffic in your instance's security rules. Always review these permissions with caution as they may introduce security vulnerabilities.
 
+For an Axion instance, you can create the firewall rule from the gcloud CLI as follows:
+
+gcloud compute firewall-rules create allow-my-ip \
+    --direction=INGRESS \
+    --network=default \
+    --action=ALLOW \
+    --rules=tcp:8501 \
+    --source-ranges=[your IP]/32 \
+    --target-tags=allow-my-ip
+
+The rule applies only to instances that carry the `allow-my-ip` network tag, so make sure that tag is present on your Axion instance.
+
 {{% /notice %}}
 ## Upload a PDF File and Create a New Index
 
diff --git a/content/learning-paths/servers-and-cloud-computing/rag/rag_llm.md b/content/learning-paths/servers-and-cloud-computing/rag/rag_llm.md
@@ -10,7 +10,7 @@ layout: "learningpathall"
 
 ## Before you begin
 
-This learning path demonstrates how to build and deploy a Retrieval Augmented Generation (RAG) enabled chatbot using open-source Large Language Models (LLMs) optimized for Arm architecture. The chatbot processes documents, stores them in a vector database, and generates contextually-relevant responses by combining the LLM's capabilities with retrieved information. The instructions in this Learning Path have been designed for Arm servers running Ubuntu 22.04 LTS. You need an Arm server instance with at least 16 cores and 8GB of RAM to run this example. Configure disk storage up to at least 32GB. The instructions have been tested on an AWS Graviton4 r8g.16xlarge instance.
+This Learning Path demonstrates how to build and deploy a Retrieval Augmented Generation (RAG) enabled chatbot using open-source Large Language Models (LLMs) optimized for Arm architecture. The chatbot processes documents, stores them in a vector database, and generates contextually relevant responses by combining the LLM's capabilities with retrieved information. The instructions in this Learning Path have been designed for Arm servers running Ubuntu 22.04 LTS. You need an Arm server instance with at least 16 cores, 8GB of RAM, and a 32GB disk to run this example. The instructions have been tested on a GCP c4a-standard-64 instance.
 
 ## Overview