Skip to content

Commit 0989aa2

Browse files
pytorchbotmcr229
andauthored
Update llama README to reflect kleidiai is default (#12905)
KleidiAI is now on by default for the XNNPACK Backend, we update the README in models/llama to reflect this Co-authored-by: Max Ren <[email protected]>
1 parent 6b2ff1f commit 0989aa2

File tree

1 file changed

+1
-1
lines changed

1 file changed

+1
-1
lines changed

examples/models/llama/README.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -65,7 +65,7 @@ Please see the [Llama 3.2 model card](https://github.com/meta-llama/llama-models
6565

6666
### Performance
6767

68-
Llama 3.2 1B and 3B performance was measured on Android OnePlus 12 device. The performance measurement is expressed in terms of tokens per second using an [adb binary-based approach](#step-4-run-benchmark-on-android-phone) with prompt length of 64. It is measured with KleidiAI library. KleidiAI is not enabled by default yet. Use `-DEXECUTORCH_XNNPACK_ENABLE_KLEIDI=ON` to enable it in the build.
68+
Llama 3.2 1B and 3B performance was measured on Android OnePlus 12 device. The performance measurement is expressed in terms of tokens per second using an [adb binary-based approach](#step-4-run-benchmark-on-android-phone) with prompt length of 64. It is measured with KleidiAI library. KleidiAI is enabled by default on the XNNPACK Backend for all ARM devices.
6969

7070
|Model | Decode (tokens/s) | Time-to-first-token (sec) | Prefill (tokens/s) | Model size (PTE file size in MiB) | Memory size (RSS in MiB) |
7171
|-------|------------------:|--------------------------:| ------------------:|----------------------------------:| ------------------------:|

0 commit comments

Comments
 (0)