Commit 6c0b70f

Zach Lasiuk authored and committed
finalized RAG example
1 parent 6fee8a3 commit 6c0b70f

10 files changed: +48 -518 lines changed

content/learning-paths/servers-and-cloud-computing/aaaaaaRAGexample/_index.md

Lines changed: 0 additions & 37 deletions
This file was deleted.

content/learning-paths/servers-and-cloud-computing/aaaaaaRAGexample/llama-chatbot.md

Lines changed: 0 additions & 279 deletions
This file was deleted.

content/learning-paths/servers-and-cloud-computing/aaaaaaRAGexample/llama-server.md

Lines changed: 0 additions & 145 deletions
This file was deleted.

content/learning-paths/servers-and-cloud-computing/rag/_demo.md

Lines changed: 11 additions & 6 deletions
@@ -2,13 +2,17 @@
 title: Run a llama.cpp chatbot powered by Arm Kleidi technology

 overview: |
-    Some description of this sucker.
+    This Arm learning path shows how to use a single c4a-standard-64 Google Axion instance -- powered by an Arm Neoverse CPU -- to build a simple "Token as a Service" RAG-enabled server, used below to provide a chatbot to serve a small number of concurrent users.
+
+    This architecture would be suitable for businesses looking to deploy the latest Generative AI technologies with RAG capabilities using their existing CPU compute capacity and deployment pipelines. It enables semantic search over chunked documents using FAISS vector store. The demo uses the open source llama.cpp framework, which Arm has enhanced by contributing the latest Arm Kleidi technologies. Further optimizations are achieved by using the smaller 8 billion parameter Llama 3.1 model, which has been quantized to optimize memory usage.
+
+    Chat with the Llama-3.1-8B RAG-enabled LLM below to see the performance for yourself, then follow the learning path to build your own Generative AI service on Arm Neoverse.


 demo_steps:
     - Type & send a message to the chatbot.
-    - Receive the chatbot's reply.
-    - View stats showing how well AWS Graviton runs LLMs.
+    - Receive the chatbot's reply, including references from RAG data.
+    - View stats showing how well Google Axion runs LLMs.

 diagram: config-diagram-dark.png
 diagram_blowup: config-diagram.png
@@ -18,9 +22,10 @@ terms_and_conditions: demo-terms-and-conditions.txt
 prismjs: true # enable prismjs rendering of code snippets

 example_user_prompts:
-    - Do Hyperscan and Snort3 work on Graviton4?
-    - How can I easily build multi-architecture Docker images?
-
+    - How can I build multi-architecture Docker images?
+    - How do I test Java performance on Google Axion instances?
+
+
 rag_data_cutoff_date: 2025/01/17

 title_chatbot_area: Arm RAG Demo
Lines changed: 1 addition & 5 deletions
@@ -1,6 +1,6 @@
 ---
 next_step_guidance: >
-    Thank you for completing this Learning path on how to run a LLM chatbot on an Arm-based server. You might be interested in learning how to run a NLP sentiment analysis model on an Arm-based server.
+    Thank you for completing this Learning path on how to run a RAG-enabled LLM chatbot on an Arm-based server. You might be interested in learning how to run a NLP sentiment analysis model on an Arm-based server.

 recommended_path: "/learning-paths/servers-and-cloud-computing/nlp-hugging-face/"

@@ -17,10 +17,6 @@ further_reading:
         title: Democratizing Generative AI with CPU-based inference
         link: https://blogs.oracle.com/ai-and-datascience/post/democratizing-generative-ai-with-cpu-based-inference
         type: blog
-    - resource:
-        title: Llama-2-7B-Chat-GGUF
-        link: https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF
-        type: website


 # ================================================================================

content/learning-paths/servers-and-cloud-computing/rag/_review.md

Lines changed: 0 additions & 45 deletions
This file was deleted.
224 Bytes
300 Bytes

themes/arm-design-system-hugo-theme/layouts/partials/demo-components/config-rag.html

Lines changed: 4 additions & 1 deletion
@@ -22,7 +22,10 @@
 <div class="c-row u-gap-1/2 u-flex-nowrap u-padding-top-0">
     <div class="c-col">
         <h2>RAG Vector Store Details</h2>
-        <p>This app uses all data on this site, <a href="https://www.learn.arm.com">learn.arm.com</a>, as the RAG data set. The Markdown formatted content across Learning Paths and Install Guides was segmented into labeled chunks, and vector embeddings were generated. FAISS is used for the embedded similarity search. The LLM demo below references this vector store for your query.</p>
+        <p>This application uses all data on <a href="https://www.learn.arm.com">learn.arm.com</a>
+        as the RAG dataset. The content across Learning Paths and Install Guides is segmented into labeled chunks,
+        and vector embeddings are generated.
+        This LLM demo references the FAISS vector store to answer your query.</p>
         <p><b>Note:</b> Data was sourced on {{.Params.rag_data_cutoff_date}}.</p>
     </div>
 </div>
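
Reviewer note: the paragraph in this hunk describes labeled chunks, vector embeddings, and a similarity lookup against a vector store. The commit does not show that pipeline, so the following is only a minimal sketch of the general idea, using a brute-force cosine-similarity search in plain JavaScript rather than FAISS itself. The chunk labels, the three-dimensional toy vectors, and the function names `cosineSimilarity` and `topK` are all hypothetical; real embeddings come from an embedding model and have far more dimensions.

```javascript
// Toy vector store: each labeled chunk carries a hand-written "embedding".
// ASSUMPTION: labels and vectors are made up for illustration only.
const chunks = [
  { label: 'docker-multi-arch', vector: [0.9, 0.1, 0.0] },
  { label: 'java-performance', vector: [0.1, 0.9, 0.1] },
  { label: 'llama-cpp-setup', vector: [0.0, 0.2, 0.9] },
];

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk against the query vector and keep the k best matches.
// This is an exhaustive scan, not an indexed search as a vector library would do.
function topK(queryVector, store, k) {
  return store
    .map((chunk) => ({
      label: chunk.label,
      score: cosineSimilarity(queryVector, chunk.vector),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// A query vector that sits closest to the 'docker-multi-arch' chunk.
const results = topK([0.8, 0.2, 0.1], chunks, 2);
console.log(results.map((r) => r.label));
```

In a RAG flow, the text of the top-ranked chunks would typically be added to the LLM prompt as context, and their labels could back the references shown in the reply.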

themes/arm-design-system-hugo-theme/layouts/partials/demo-components/llm-chatbot/javascript--llm-chatbot.html

Lines changed: 32 additions & 0 deletions
@@ -232,6 +232,38 @@

 const renderer = new marked.Renderer();

+renderer.link = (link) => {
+    // Extract the link parts
+    const href = link.href;
+    const text = link.text;
+    const title = link.title;
+
+    // Escape href to prevent XSS attacks
+    const escapedHref = href
+        .replace(/&/g, '&amp;')
+        .replace(/</g, '&lt;')
+        .replace(/>/g, '&gt;')
+        .replace(/"/g, '&quot;')
+        .replace(/'/g, '&#39;');
+
+    // Escape title if it exists
+    const escapedTitle = title
+        ? title
+            .replace(/&/g, '&amp;')
+            .replace(/</g, '&lt;')
+            .replace(/>/g, '&gt;')
+            .replace(/"/g, '&quot;')
+            .replace(/'/g, '&#39;')
+        : '';
+
+    // Create the link element with target="_blank"
+    return `
+        <a href="${escapedHref}"${escapedTitle ? ` title="${escapedTitle}"` : ''} target="_blank" rel="noopener noreferrer">
+            ${text}
+        </a>
+    `.replace(/\n\s+/g, ''); // Remove unnecessary newlines and spaces
+};
+
 // Customize the code block rendering
 renderer.code = (code, language) => {
     var language = code['lang'];
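
Reviewer note: the added `renderer.link` repeats the same five-step escaping chain for `href` and `title`. One possible refactor, shown here as a stand-alone sketch rather than as the committed code, pulls the chain into a helper. The names `escapeHtml` and `renderExternalLink` are hypothetical and not part of the commit. As in the commit, `text` is inserted without escaping, on the assumption that the caller passes already-safe link text.

```javascript
// Hypothetical helper (not in the commit): escapes the five characters that
// are significant in HTML text and quoted attribute values.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical stand-alone equivalent of the link renderer: builds an anchor
// that opens in a new tab. The title attribute is emitted only when a title
// is present. ASSUMPTION: `text` is already safe to insert as HTML.
function renderExternalLink({ href, text, title }) {
  const titleAttr = title ? ` title="${escapeHtml(title)}"` : '';
  return `<a href="${escapeHtml(href)}"${titleAttr} target="_blank" rel="noopener noreferrer">${text}</a>`;
}

// Usage: the ampersand in the query string is escaped inside the attribute.
console.log(
  renderExternalLink({
    href: 'https://learn.arm.com/?a=1&b=2',
    text: 'Arm Learning Paths',
    title: null,
  })
);
```

Building the string on one line also removes the need for the trailing `.replace(/\n\s+/g, '')` cleanup. Note that escaping keeps a value from breaking out of the attribute, but it does not by itself reject URL schemes such as `javascript:`; whether that matters here depends on how the surrounding code sanitizes model output, which this hunk does not show.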
