
Commit ba24e84

Merge pull request #2295 from ArmDeveloperEcosystem/main
Production update
2 parents 09def7a + 343d0a1

48 files changed: +1279 -706 lines

Some content is hidden: large commits have some content hidden by default, so only part of the 48-file diff appears below.

.wordlist.txt

Lines changed: 56 additions & 1 deletion
@@ -4666,5 +4666,60 @@ crosh
 Sommelier
 chromeos
 linuxcontainers
-
+XPS
+NIC's
+offlines
+passthrough
+SLOs
+Ker
+Rui
+SmartNICs
+selectedalt
+UIalt
+lpprojectubuntuarm
+RDV
+chiplet
+BMC
+upstreams
+rdv
+Initrd
+handoff
+ACPI
+PCRs
+MHU
+Handoff
+userland
+CXL
+DDR
+PHYs
+UCIe
+handoffs
+CCG
+CML
+Codespaces
+Cheng
+GDM
+LPI
+nsec
+shortcode
+BSON
+joedog
+Seige
+Antonov
+jwt
+kbs
+Nfpl
+ZjnAMjLk
+hCpeYsarnnGv
+kbs
+rvps
+xcbTMTBX
+CDH
+RVPS
+Attester
+attester
+ATtestation
+CoCo
+procedureS
+NIC’s
 

content/learning-paths/cross-platform/_example-learning-path/appendix-1-formatting.md

Lines changed: 20 additions & 6 deletions
@@ -83,12 +83,26 @@ Specify that line_numbers are true in the following way:
 \`\`\`bash { line_numbers = "true" } \
 echo 'hello world' \
 echo ‘I am line two’ \
-\`\`\`
+\`\`\`
+
+```bash { line_numbers = "true" }
+echo ‘hello world’
+echo ‘I am line two’
+```
+
+In some cases, the line numbering should not start from one but from another
+value, e.g. if the code excerpt is extracted from a larger file. Use the
+`line_start` attribute to achieve this:
 
-```bash { line_numbers = "true" }
-echo ‘hello world’
-echo ‘I am line two’
-```
+\`\`\`bash { line_numbers = "true" line_start = "10" } \
+echo 'hello world' \
+echo ‘I am line two’ \
+\`\`\`
+
+```bash { line_numbers = "true" line_start = "10" }
+echo ‘hello world’
+echo ‘I am line eleven’
+```
 
 ### Output Lines
 

@@ -100,7 +114,7 @@ There are three ways you can specify command outputs in code:
 {{% notice Note %}}
 In each of the three situations, code marked as 'output' will:
 - not be copied when clicking the 'copy' button
-- not be highlightable by a cursor
+- not be highlighted by a cursor
 - appear slightly darker
 {{% /notice %}}
 
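The new `line_start` examples above start at 10, but the attribute takes any value. As a further illustration (not part of this commit), a fragment lifted from line 100 of a larger script would be written as:

\`\`\`bash { line_numbers = "true" line_start = "100" } \
tar -xzf sources.tar.gz \
make -j4 \
\`\`\`

and would render with its two lines numbered 100 and 101.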
content/learning-paths/cross-platform/floating-point-rounding-errors/_index.md

Lines changed: 4 additions & 0 deletions
@@ -1,6 +1,10 @@
 ---
 title: Explore floating-point differences between x86 and Arm
 
+draft: true
+cascade:
+  draft: true
+
 minutes_to_complete: 30
 
 who_is_this_for: This is an introductory topic for developers who are porting applications from x86 to Arm and want to understand how floating-point behavior differs between these architectures - particularly in the context of numerical consistency, performance, and debugging subtle bugs.

content/learning-paths/embedded-and-microcontrollers/training-inference-pytorch/_index.md

Lines changed: 10 additions & 10 deletions
@@ -7,41 +7,41 @@ cascade:
 
 minutes_to_complete: 90
 
-who_is_this_for: This topic is for machine learning engineers, embedded AI developers, and researchers interested in deploying TinyML models for NLP on Arm-based edge devices using PyTorch and ExecuTorch. 
+who_is_this_for: This topic is for machine learning engineers, embedded AI developers, and researchers interested in deploying TinyML models for NLP on Arm-based edge devices using PyTorch and ExecuTorch.
 
-learning_objectives: 
+learning_objectives:
 - Train a custom CNN-based sentiment classification model implemented in PyTorch.
 - Optimize and convert the model using ExecuTorch for Arm-based edge devices.
 - Deploy and run inference on the Corstone-320 FVP.
 
 prerequisites:
-- Basic knowledge of machine learning concepts. 
-- It is advised to complete The Learning Path, [Introduction to TinyML on Arm using PyTorch and ExecuTorch](/learning-paths/embedded-and-microcontrollers/introduction-to-tinyml-on-arm) before starting this learning path. 
+- Basic knowledge of machine learning concepts.
+- It is advised to complete The Learning Path, [Introduction to TinyML on Arm using PyTorch and ExecuTorch](/learning-paths/embedded-and-microcontrollers/introduction-to-tinyml-on-arm) before starting this learning path.
 - Familiarity with Python and PyTorch.
 - A Linux host machine or VM running Ubuntu 22.04 or higher.
-- An Arm license to run the examples on the Corstone-320 Fixed Virtual Platform (FVP), for hands-on deployment. 
+- An Arm license to run the examples on the Corstone-320 Fixed Virtual Platform (FVP), for hands-on deployment.
 
 
 author: Dominica Abena O. Amanfo
 
 ### Tags
-skilllevels: Intermediate
+skilllevels: Introductory
 subjects: ML
 armips:
 - Cortex-A
 tools_software_languages:
-- tinyML 
-- CNN 
+- tinyML
+- CNN
 - PyTorch
 - ExecuTorch
- 
+
 operatingsystems:
 - Linux
 
 
 further_reading:
 - resource:
-    title: Run Llama 3 on a Raspberry Pi 5 using ExecuTorch 
+    title: Run Llama 3 on a Raspberry Pi 5 using ExecuTorch
     link: /learning-paths/embedded-and-microcontrollers/rpi-llama3
     type: website
 - resource:

content/learning-paths/embedded-and-microcontrollers/uvprojx-conversion/_index.md

Lines changed: 3 additions & 3 deletions
@@ -5,7 +5,7 @@ minutes_to_complete: 10
 
 who_is_this_for: This is a topic for users of µVision who want to migrate to the new project format (csolution) required by CMSIS-Toolbox.
 
-learning_objectives: 
+learning_objectives:
 - Import, convert, and build uvprojx-based projects in Keil Studio.
 - Convert uvprojx-based projects in µVision.
 - Convert and build uvprojx-based projects on the command line.

@@ -19,7 +19,7 @@ prerequisites:
 author: Christopher Seidl
 
 ### Tags
-skilllevels: Intermediate
+skilllevels: Advanced
 subjects: Performance and Architecture
 armips:
 - Cortex-M

@@ -43,7 +43,7 @@ further_reading:
     link: https://community.arm.com/arm-community-blogs/b/internet-of-things-blog/posts/keil-mdk-version-6
     type: blog
 - resource:
-    title: keil.arm.com 
+    title: keil.arm.com
     link: https://keil.arm.com
     type: website
 

content/learning-paths/laptops-and-desktops/self_hosted_cicd_github/_index.md

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ prerequisites:
 author: Dawid Borycki
 
 ### Tags
-skilllevels: Intermediate
+skilllevels: Introductory
 subjects: Migration to Arm
 armips:
 - Cortex-A

content/learning-paths/mobile-graphics-and-gaming/afrc/_index.md

Lines changed: 2 additions & 2 deletions
@@ -5,7 +5,7 @@ minutes_to_complete: 25
 
 who_is_this_for: Software developers of Android applications and mobile games who are interested in learning how to enable Arm Fixed Rate Compression (AFRC) to improve performance.
 
-learning_objectives: 
+learning_objectives:
 - Query for fixed-rate compression support.
 - Specify what compression to use.
 - Verify that compression is applied.

@@ -18,7 +18,7 @@ prerequisites:
 author: Jose-Emilio Munoz-Lopez
 
 ### Tags
-skilllevels: Intermediate
+skilllevels: Advanced
 subjects: Graphics
 armips:
 - Mali

content/learning-paths/mobile-graphics-and-gaming/build-llama3-chat-android-app-using-executorch-and-xnnpack/5-run-benchmark-on-android.md

Lines changed: 46 additions & 37 deletions
@@ -45,6 +45,7 @@ cmake -DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake \
     -DEXECUTORCH_BUILD_KERNELS_CUSTOM=ON \
     -DEXECUTORCH_BUILD_KERNELS_LLM=ON \
     -DEXECUTORCH_BUILD_EXTENSION_LLM_RUNNER=ON \
+    -DEXECUTORCH_BUILD_EXTENSION_LLM=ON \
     -DEXECUTORCH_BUILD_EXTENSION_RUNNER_UTIL=ON \
     -DEXECUTORCH_XNNPACK_ENABLE_KLEIDI=ON \
     -DXNNPACK_ENABLE_ARM_BF16=OFF \
@@ -82,6 +83,10 @@ cmake --build cmake-out-android/examples/models/llama -j16 --config Release
 
 You should now have `llama_main` available for Android.
 
+{{% notice Note %}}
+If you notice that Gradle cannot find the Android SDK, add the sdk.dir path to executorch/extension/android/local.properties.
+{{% /notice %}}
+
 ## Run on Android via adb shell
 You will need an Arm-powered smartphone with the i8mm feature running Android, with 16GB of RAM. The following steps were tested on a Google Pixel 8 Pro phone.
 
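The `sdk.dir` entry named in the new note holds the absolute path of the Android SDK. A minimal sketch of `executorch/extension/android/local.properties` (the path shown is a placeholder; point it at your own SDK installation):

```
# executorch/extension/android/local.properties
# Placeholder location -- substitute your actual Android SDK path.
sdk.dir=/home/user/Android/Sdk
```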
@@ -103,7 +108,7 @@ You should see your device listed to confirm it is connected.
 
 ``` bash
 adb shell mkdir -p /data/local/tmp/llama
-adb push llama3_1B_kv_sdpa_xnn_qe_4_128_1024_embedding_4bit.pte /data/local/tmp/llama/
+adb push llama3_1B_kv_sdpa_xnn_qe_4_64_1024_embedding_4bit.pte /data/local/tmp/llama/
 adb push $HOME/.llama/checkpoints/Llama3.2-1B-Instruct/tokenizer.model /data/local/tmp/llama/
 adb push cmake-out-android/examples/models/llama/llama_main /data/local/tmp/llama/
 ```
114119
Use the Llama runner to execute the model on the phone with the `adb` command:
115120

116121
``` bash
117-
adb shell "cd /data/local/tmp/llama && ./llama_main --model_path llama3_1B_kv_sdpa_xnn_qe_4_64_1024_embedding_4bit.pte --tokenizer_path tokenizer.model --prompt "<|start_header_id|>system<|end_header_id|>\nYour name is Cookie. you are helpful, polite, precise, concise, honest, good at writing. You always give precise and brief answers up to 32 words<|eot_id|><|start_header_id|>user<|end_header_id|>\nHey Cookie! how are you today?<|eot_id|><|start_header_id|>assistant<|end_header_id|>" --warmup=1 --cpu_threads=5
122+
adb shell "cd /data/local/tmp/llama && ./llama_main --model_path llama3_1B_kv_sdpa_xnn_qe_4_64_1024_embedding_4bit.pte --tokenizer_path tokenizer.model --prompt "<|start_header_id|>system<|end_header_id|>\nYour name is Cookie. you are helpful, polite, precise, concise, honest, good at writing. You always give precise and brief answers up to 32 words<|eot_id|><|start_header_id|>user<|end_header_id|>\nHey Cookie! how are you today?<|eot_id|><|start_header_id|>assistant<|end_header_id|>" --warmup=1 --cpu_threads=5"
118123
```
119124

120125
The output should look something like this.
121126

122127
```
123-
I 00:00:00.003316 executorch:main.cpp:69] Resetting threadpool with num threads = 5
124-
I 00:00:00.009329 executorch:runner.cpp:59] Creating LLaMa runner: model_path=llama3_1B_kv_sdpa_xnn_qe_4_64_1024_embedding_4bit.pte, tokenizer_path=tokenizer.model
125-
I 00:00:03.569399 executorch:runner.cpp:88] Reading metadata from model
126-
I 00:00:03.569451 executorch:runner.cpp:113] Metadata: use_sdpa_with_kv_cache = 1
127-
I 00:00:03.569455 executorch:runner.cpp:113] Metadata: use_kv_cache = 1
128-
I 00:00:03.569459 executorch:runner.cpp:113] Metadata: get_vocab_size = 128256
129-
I 00:00:03.569461 executorch:runner.cpp:113] Metadata: get_bos_id = 128000
130-
I 00:00:03.569464 executorch:runner.cpp:113] Metadata: get_max_seq_len = 1024
131-
I 00:00:03.569466 executorch:runner.cpp:113] Metadata: enable_dynamic_shape = 1
132-
I 00:00:03.569469 executorch:runner.cpp:120] eos_id = 128009
133-
I 00:00:03.569470 executorch:runner.cpp:120] eos_id = 128001
134-
I 00:00:03.569471 executorch:runner.cpp:120] eos_id = 128006
135-
I 00:00:03.569473 executorch:runner.cpp:120] eos_id = 128007
136-
I 00:00:03.569475 executorch:runner.cpp:168] Doing a warmup run...
137-
I 00:00:03.838634 executorch:text_prefiller.cpp:53] Prefill token result numel(): 128256
138-
139-
I 00:00:03.892268 executorch:text_token_generator.h:118]
128+
I tokenizers:regex.cpp:27] Registering override fallback regex
129+
I 00:00:00.003288 executorch:main.cpp:87] Resetting threadpool with num threads = 5
130+
I 00:00:00.006393 executorch:runner.cpp:44] Creating LLaMa runner: model_path=llama3_1B_kv_sdpa_xnn_qe_4_64_1024_embedding_4bit.pte, tokenizer_path=tokenizer.model
131+
E tokenizers:hf_tokenizer.cpp:60] Error parsing json file: [json.exception.parse_error.101] parse error at line 1, column 1: syntax error while parsing value - invalid literal; last read: 'I'
132+
I 00:00:00.131486 executorch:llm_runner_helper.cpp:57] Loaded TikToken tokenizer
133+
I 00:00:00.131525 executorch:llm_runner_helper.cpp:167] Reading metadata from model
134+
I 00:00:00.186538 executorch:llm_runner_helper.cpp:110] Metadata: use_sdpa_with_kv_cache = 1
135+
I 00:00:00.186574 executorch:llm_runner_helper.cpp:110] Metadata: use_kv_cache = 1
136+
I 00:00:00.186578 executorch:llm_runner_helper.cpp:110] Metadata: get_max_context_len = 1024
137+
I 00:00:00.186584 executorch:llm_runner_helper.cpp:110] Metadata: get_max_seq_len = 1024
138+
I 00:00:00.186588 executorch:llm_runner_helper.cpp:110] Metadata: enable_dynamic_shape = 1
139+
I 00:00:00.186596 executorch:llm_runner_helper.cpp:140] eos_id = 128009
140+
I 00:00:00.186597 executorch:llm_runner_helper.cpp:140] eos_id = 128001
141+
I 00:00:00.186599 executorch:llm_runner_helper.cpp:140] eos_id = 128006
142+
I 00:00:00.186600 executorch:llm_runner_helper.cpp:140] eos_id = 128007
143+
I 00:00:01.086570 executorch:text_llm_runner.cpp:89] Doing a warmup run...
144+
I 00:00:01.087836 executorch:text_llm_runner.cpp:152] Max new tokens resolved: 128, given start_pos 0, num_prompt_tokens 54, max_context_len 1024
145+
I 00:00:01.292740 executorch:text_prefiller.cpp:93] Prefill token result numel(): 128256
146+
147+
I 00:00:02.264371 executorch:text_token_generator.h:123]
140148
Reached to the end of generation
141-
I 00:00:03.892281 executorch:runner.cpp:267] Warmup run finished!
142-
I 00:00:03.892286 executorch:runner.cpp:174] RSS after loading model: 1269.445312 MiB (0 if unsupported)
143-
<|start_header_id|>system<|end_header_id|>\nYour name is Cookie. you are helpful, polite, precise, concise, honest, good at writing. You always give precise and brief answers up to 32 words<|eot_id|><|start_header_id|>user<|end_header_id|>\nHey Cookie! how are you today?<|eot_id|><|start_header_id|>assistant<|end_header_id|>I 00:00:04.076905 executorch:text_prefiller.cpp:53] Prefill token result numel(): 128256
144-
145-
146-
I 00:00:04.078027 executorch:runner.cpp:243] RSS after prompt prefill: 1269.445312 MiB (0 if unsupported)
147-
I'm doing great, thanks! I'm always happy to help, communicate, and provide helpful responses. I'm a bit of a cookie (heh) when it comes to delivering concise and precise answers. What can I help you with today?<|eot_id|>
148-
I 00:00:05.399304 executorch:text_token_generator.h:118]
149+
I 00:00:02.264379 executorch:text_llm_runner.cpp:209] Warmup run finished!
150+
I 00:00:02.264384 executorch:text_llm_runner.cpp:95] RSS after loading model: 1122.187500 MiB (0 if unsupported)
151+
I 00:00:02.264624 executorch:text_llm_runner.cpp:152] Max new tokens resolved: 74, given start_pos 0, num_prompt_tokens 54, max_context_len 1024
152+
<|start_header_id|>system<|end_header_id|>\nYour name is Cookie. you are helpful, polite, precise, concise, honest, good at writing. You always give precise and brief answers up to 32 words<|eot_id|><|start_header_id|>user<|end_header_id|>\nHey Cookie! how are you today?<|eot_id|><|start_header_id|>assistant<|end_header_id|>I 00:00:02.394162 executorch:text_prefiller.cpp:93] Prefill token result numel(): 128256
153+
154+
155+
I 00:00:02.394373 executorch:text_llm_runner.cpp:179] RSS after prompt prefill: 1122.187500 MiB (0 if unsupported)
156+
I'm doing great, thanks for asking! I'm always ready to help, whether it's answering a question or providing a solution. What can I help you with today?<|eot_id|>
157+
I 00:00:03.072966 executorch:text_token_generator.h:123]
149158
Reached to the end of generation
150-
151-
I 00:00:05.399314 executorch:runner.cpp:257] RSS after finishing text generation: 1269.445312 MiB (0 if unsupported)
152-
PyTorchObserver {"prompt_tokens":54,"generated_tokens":51,"model_load_start_ms":1710296339487,"model_load_end_ms":1710296343047,"inference_start_ms":1710296343370,"inference_end_ms":1710296344877,"prompt_eval_end_ms":1710296343556,"first_token_ms":1710296343556,"aggregate_sampling_time_ms":49,"SCALING_FACTOR_UNITS_PER_SECOND":1000}
153-
I 00:00:04.530945 executorch:stats.h:108] Prompt Tokens: 54 Generated Tokens: 69
154-
I 00:00:04.530947 executorch:stats.h:114] Model Load Time: 1.196000 (seconds)
155-
I 00:00:04.530949 executorch:stats.h:124] Total inference time: 1.934000 (seconds) Rate: 35.677353 (tokens/second)
156-
I 00:00:04.530952 executorch:stats.h:132] Prompt evaluation: 0.176000 (seconds) Rate: 306.818182 (tokens/second)
157-
I 00:00:04.530954 executorch:stats.h:143] Generated 69 tokens: 1.758000 (seconds) Rate: 39.249147 (tokens/second)
158-
I 00:00:04.530956 executorch:stats.h:151] Time to first generated token: 0.176000 (seconds)
159-
I 00:00:04.530959 executorch:stats.h:158] Sampling time over 123 tokens: 0.067000 (seconds)
159+
160+
I 00:00:03.072972 executorch:text_llm_runner.cpp:199] RSS after finishing text generation: 1122.187500 MiB (0 if unsupported)
161+
PyTorchObserver {"prompt_tokens":54,"generated_tokens":36,"model_load_start_ms":1756473387815,"model_load_end_ms":1756473388715,"inference_start_ms":1756473389893,"inference_end_ms":1756473390702,"prompt_eval_end_ms":1756473390023,"first_token_ms":1756473390023,"aggregate_sampling_time_ms":22,"SCALING_FACTOR_UNITS_PER_SECOND":1000}
162+
I 00:00:03.072993 executorch:stats.h:108] Prompt Tokens: 54 Generated Tokens: 36
163+
I 00:00:03.072995 executorch:stats.h:114] Model Load Time: 0.900000 (seconds)
164+
I 00:00:03.072996 executorch:stats.h:124] Total inference time: 0.809000 (seconds) Rate: 44.499382 (tokens/second)
165+
I 00:00:03.072998 executorch:stats.h:132] Prompt evaluation: 0.130000 (seconds) Rate: 415.384615 (tokens/second)
166+
I 00:00:03.073000 executorch:stats.h:143] Generated 36 tokens: 0.679000 (seconds) Rate: 53.019146 (tokens/second)
167+
I 00:00:03.073002 executorch:stats.h:151] Time to first generated token: 0.130000 (seconds)
168+
I 00:00:03.073004 executorch:stats.h:158] Sampling time over 90 tokens: 0.022000 (seconds)
160169
```
161170

162171
You have successfully run the Llama 3.1 1B Instruct model on your Android smartphone with ExecuTorch using KleidiAI kernels.
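The throughput figures in the new log follow directly from the PyTorchObserver timestamps: prompt evaluation is prompt tokens divided by prefill time (prompt_eval_end_ms - inference_start_ms = 130 ms), and decode rate is generated tokens divided by the remaining inference time (inference_end_ms - prompt_eval_end_ms = 679 ms). A quick check of the logged rates:

```bash
# 54 prompt tokens prefilled in 0.130 s -> 415.384615 tokens/second
awk 'BEGIN { printf "prompt eval: %.6f tokens/s\n", 54 / 0.130 }'
# 36 tokens generated in 0.679 s -> 53.019146 tokens/second
awk 'BEGIN { printf "decode:      %.6f tokens/s\n", 36 / 0.679 }'
```

Both values match the Rate fields reported by stats.h above.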
