Commit 36ee06d

docs : add build instructions for KleidiAI (#12563)
Signed-off-by: Dan Johansson <[email protected]>
1 parent 3cd3a39 commit 36ee06d

1 file changed: +20 -0 lines changed

docs/build.md

Lines changed: 20 additions & 0 deletions
@@ -435,6 +435,26 @@ llama_new_context_with_model: CANN compute buffer size = 1260.81 MiB

For detailed information, such as supported models and devices and CANN installation, please refer to [llama.cpp for CANN](./backend/CANN.md).

## Arm® KleidiAI™
KleidiAI is a library of optimized microkernels for AI workloads, designed specifically for Arm CPUs. These microkernels improve performance and can be enabled for use by the CPU backend.

To enable KleidiAI, go to the llama.cpp directory and build with CMake:
```bash
cmake -B build -DGGML_CPU_KLEIDIAI=ON
cmake --build build --config Release
```
You can verify that KleidiAI is being used by running:
```bash
./build/bin/llama-cli -m PATH_TO_MODEL -p "What is a car?"
```
If KleidiAI is enabled, the output will contain a line similar to:
```
load_tensors: CPU_KLEIDIAI model buffer size = 3474.00 MiB
```
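To check for that line without reading the whole log, you could save the output to a file and search it. The sketch below is only an illustration: `kleidiai_in_log` and `run.log` are hypothetical names, not part of llama.cpp, and it assumes the `CPU_KLEIDIAI` buffer name shown above appears in the captured output.

```bash
# Hypothetical helper (not part of llama.cpp): succeeds if the saved log
# passed as the first argument mentions the CPU_KLEIDIAI buffer.
kleidiai_in_log() {
    grep -q "CPU_KLEIDIAI" "$1"
}

# Example use, assuming the output was captured in run.log:
#   ./build/bin/llama-cli -m PATH_TO_MODEL -p "What is a car?" > run.log 2>&1
#   kleidiai_in_log run.log && echo "KleidiAI buffer found"
```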
KleidiAI's microkernels implement optimized tensor operations using Arm CPU features such as dotprod, int8mm and SME. llama.cpp selects the most efficient kernel based on runtime CPU feature detection. However, on platforms that support SME, you must manually enable the SME microkernels by setting the environment variable `GGML_KLEIDIAI_SME=1`.

Depending on your build target, other higher-priority backends may be enabled by default. To ensure the CPU backend is used, you must disable the higher-priority backends either at compile time, e.g. `-DGGML_METAL=OFF`, or at run time using the command-line option `--device none`.
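
The two run-time settings above can be combined in a small wrapper. This is a minimal sketch, not something llama.cpp provides: `run_kleidiai_sme` is a hypothetical name, and it assumes you are on an SME-capable platform and want to bypass the other backends at run time rather than rebuild with an option such as `-DGGML_METAL=OFF`.

```bash
# Hypothetical wrapper (not part of llama.cpp): runs the given command with
# the SME microkernels enabled and the higher-priority backends bypassed.
run_kleidiai_sme() {
    # GGML_KLEIDIAI_SME=1 is set only for this one command;
    # --device none is appended to the arguments you pass in.
    GGML_KLEIDIAI_SME=1 "$@" --device none
}

# Example use:
#   run_kleidiai_sme ./build/bin/llama-cli -m PATH_TO_MODEL -p "What is a car?"
```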
## Android

For instructions on building for Android, see the [Android build documentation](./android.md).
