Commit e332fe3

Merge pull request #2097 from zachlasiuk/main
TEMPLATE: Added demo for phi onnx server & cloud lp
2 parents 08bf20a + bced6fe commit e332fe3

9 files changed

+75 -3 lines changed
Lines changed: 61 additions & 0 deletions
@@ -0,0 +1,61 @@
+---
+title: Run a Phi-4-mini chatbot powered by ONNX Runtime
+weight: 2
+
+overview: |
+  This Learning Path shows you how to use a 32-core Azure Dpls_v6 instance powered by an Arm Neoverse-N2 CPU to build a simple chatbot server that you can then use to provide a chatbot to serve a small number of concurrent users.
+
+  This architecture is suitable for businesses looking to deploy the latest Generative AI technologies with RAG capabilities using their existing CPU compute capacity and deployment pipelines.
+
+  The demo uses the ONNX runtime, which Arm has enhanced with its own Kleidi technologies. Further optimizations are achieved by using the smaller Phi-4-mini model, which has been optimized at INT4 quantization to minimize memory usage.
+
+  Chat with the chatbot LLM below to see the performance for yourself, and then follow the Learning Path to build your own Generative AI service on Arm Neoverse.
+
+
+demo_steps:
+  - Type and send a message to the chatbot.
+  - Receive the chatbot's reply.
+  - View performance statistics demonstrating how well Azure Cobalt 100 instances run LLMs.
+
+diagram: config-diagram-dark.png
+diagram_blowup: config-diagram.png
+
+terms_and_conditions: demo-terms-and-conditions.txt
+
+prismjs: true  # enable prismjs rendering of code snippets
+
+example_user_prompts:
+  - Prompt 1?
+  - Prompt 2?
+
+
+rag_data_cutoff_date: 2025/01/17
+
+title_chatbot_area: Phi-4-mini Chatbot Demo
+
+prismjs: true
+
+
+
+### Specific details to this demo
+# ================================================================================
+tps_max: 30   # sets stat visuals for tps
+tps_ranges:
+  - name: Low
+    context: Around the average human reading rate of 3-5 words per second.
+    color: var(--arm-green)
+    min: 0
+    max: 5
+  - name: High
+    context: This is significantly higher than the average human reading rate of 5 words per second, delivering a stable and usable user chatbot experience from the Phi-4-mini LLM using the ONNX runtime.
+    color: var(--arm-green)
+    min: 5
+    max: 1000
+
+### FIXED, DO NOT MODIFY
+# ================================================================================
+demo_template_name: phi_onnx_chatbot_demo  # allows the 'demo.html' partial to route to the correct Configuration and Demo/Stats sub partials for page render.
+weight: 2                       # _index.md always has weight of 1 to order correctly
+layout: "learningpathall"       # All files under learning paths have this same wrapper
+learning_path_main_page: "yes"  # This should be surfaced when looking for related content. Only set for _index.md of learning path content.
+---
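
The `tps_ranges` entries in the front matter above pair a label with a `min`/`max` band of tokens per second. The theme code that consumes them is not part of this commit, so the following is only a minimal Python sketch of how a measured rate might be mapped to a band. It assumes half-open `[min, max)` intervals, and the function name `classify_tps` is hypothetical.

```python
# Bands copied from the tps_ranges front matter in this commit.
TPS_RANGES = [
    {"name": "Low", "min": 0, "max": 5},
    {"name": "High", "min": 5, "max": 1000},
]


def classify_tps(tps, ranges=TPS_RANGES):
    """Return the name of the first band containing tps, or None.

    Assumption: each band is half-open, [min, max), so a value that sits
    exactly on a shared boundary (5 here) falls into the higher band.
    """
    for band in ranges:
        if band["min"] <= tps < band["max"]:
            return band["name"]
    return None


if __name__ == "__main__":
    for rate in (3.2, 5, 27.5):
        print(rate, classify_tps(rate))
```

With half-open bands, the shared boundary value of 5 is assigned to exactly one band. If the real front end treats both ends as inclusive, the first matching entry would win instead.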

content/learning-paths/servers-and-cloud-computing/onnx/analysis.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 ---
 title: Interact with the Phi-4-mini Chatbot
-weight: 4
+weight: 5
 
 layout: learningpathall
 ---
1011 Bytes

content/learning-paths/servers-and-cloud-computing/onnx/chatbot.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 ---
 title: Run the Chatbot Server
-weight: 3
+weight: 4
 
 layout: learningpathall
 ---
26.3 KB
27.2 KB

content/learning-paths/servers-and-cloud-computing/onnx/setup.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 # User change
 title: "Build ONNX Runtime and set up the Phi-4-mini Model"
 
-weight: 2
+weight: 3
 
 # Do not modify these elements
 layout: "learningpathall"
themes/arm-design-system-hugo-theme/layouts/partials/demo-components/llm-chatbot/javascript--llm-chatbot.html

Lines changed: 4 additions & 0 deletions
@@ -440,6 +440,7 @@
 all_messages_div.removeChild(all_messages_div.firstChild);
 }
 {{ else if eq .Params.demo_template_name "llm_chatbot_first_demo" }}
+{{ else if eq .Params.demo_template_name "phi_onnx_chatbot_demo" }}
 {{ else }}
 {{ end }}
 
@@ -629,6 +630,9 @@
 {{ else if eq .Params.demo_template_name "llm_chatbot_first_demo" }}
 {{ $server_location = getenv "HUGO_LLM_API" | base64Encode }}
 console.log('Using LLM API.');
+{{ else if eq .Params.demo_template_name "phi_onnx_chatbot_demo" }}
+{{ $server_location = getenv "HUGO_PHI_ONNX_LLM_API" | base64Encode }}
+console.log('Using HUGO_PHI_ONNX_LLM_API.');
 {{ else }}
 console.log('No server location provided.');
 {{ end }}
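
The second hunk above picks an environment variable based on `demo_template_name` and base64-encodes its value at build time. The Python sketch below mirrors only the two branches visible in this hunk; branches outside the hunk are not modelled. The names `API_ENV_BY_TEMPLATE` and `encoded_server_location` are hypothetical, and an unset variable is treated as an empty string, which is an assumption about how the template behaves.

```python
import base64
import os

# Template name -> environment variable, as shown in the visible branches.
API_ENV_BY_TEMPLATE = {
    "llm_chatbot_first_demo": "HUGO_LLM_API",
    "phi_onnx_chatbot_demo": "HUGO_PHI_ONNX_LLM_API",
}


def encoded_server_location(template_name, environ=None):
    """Return the base64-encoded server location for a demo template.

    Returns None when the template has no visible branch, which corresponds
    to the 'No server location provided.' fallback in the hunk.
    """
    environ = os.environ if environ is None else environ
    variable = API_ENV_BY_TEMPLATE.get(template_name)
    if variable is None:
        return None
    # Assumption: an unset variable behaves like an empty string.
    value = environ.get(variable, "")
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


if __name__ == "__main__":
    # Placeholder endpoint for illustration only.
    fake_env = {"HUGO_PHI_ONNX_LLM_API": "https://example.invalid/api"}
    print(encoded_server_location("phi_onnx_chatbot_demo", fake_env))
```

Note that base64 is an encoding, not encryption: anyone who views the rendered page can decode the endpoint.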

themes/arm-design-system-hugo-theme/layouts/partials/learning-paths/demo.html

Lines changed: 7 additions & 0 deletions
@@ -24,6 +24,9 @@
 {{else if eq .Params.demo_template_name "whisper_audio_demo"}}
 {{/* {{partial "demo-components/config-params-only.html" .}} */}}
 
+{{else if eq .Params.demo_template_name "phi_onnx_chatbot_demo"}}
+{{/* {{partial "demo-components/config-params-only.html" .}} */}}
+
 {{else if eq .Params.demo_template_name "kubectl_demo"}}
 {{partial "demo-components/config-param-and-file.html" .}}
 
@@ -42,6 +45,10 @@
 {{partial "demo-components/llm-voice-transcriber/demo-stats--llm-voice-transcriber.html" .}}
 {{partial "demo-components/llm-voice-transcriber/javascript--llm-voice-transcriber.html" .}}
 
+{{else if eq .Params.demo_template_name "phi_onnx_chatbot_demo"}}
+{{partial "demo-components/llm-chatbot/demo-stats--llm-chatbot.html" .}}
+{{partial "demo-components/llm-chatbot/javascript--llm-chatbot.html" .}}
+
 {{else if eq .Params.demo_template_name "kubectl_demo"}}
 {{partial "demo-components/demo--kubectl.html" .}}
