pytorch · mcr229 · Jul 28, 2025 · Jul 25, 2025
@@ -65,7 +65,7 @@ Please see the [Llama 3.2 model card](https://github.com/meta-llama/llama-models
 
 ### Performance
 
-Llama 3.2 1B and 3B performance was measured on Android OnePlus 12 device. The performance measurement is expressed in terms of tokens per second using an [adb binary-based approach](#step-4-run-benchmark-on-android-phone) with prompt length of 64. It is measured with KleidiAI library. KleidiAI is not enabled by default yet. Use `-DEXECUTORCH_XNNPACK_ENABLE_KLEIDI=ON` to enable it in the build.
+Llama 3.2 1B and 3B performance was measured on Android OnePlus 12 device. The performance measurement is expressed in terms of tokens per second using an [adb binary-based approach](#step-4-run-benchmark-on-android-phone) with prompt length of 64. It is measured with KleidiAI library. KleidiAI is enabled by default on the XNNPACK Backend for all ARM devices.
 
 |Model  | Decode (tokens/s) | Time-to-first-token (sec) | Prefill (tokens/s) | Model size (PTE file size in MiB) | Memory size (RSS in MiB) |
 |-------|------------------:|--------------------------:| ------------------:|----------------------------------:| ------------------------:|