# llama.cpp for IBM zDNN Accelerator

## Background

IBM zDNN (Z Deep Neural Network) is a hardware acceleration library designed specifically to leverage the IBM NNPA (Neural Network Processing Assist) accelerator located within IBM Telum I and II processors. It provides significant performance improvements for neural network inference operations.
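
On Linux on IBM Z, the kernel reports the NNPA facility as a CPU feature flag, so you can check for the accelerator before building anything. A minimal sketch, assuming a kernel new enough to expose the flag:

```sh
# Look for the "nnpa" feature flag on s390x; no output means the
# accelerator is unavailable (or the kernel does not report it)
grep -ow nnpa /proc/cpuinfo | sort -u
```
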
### Llama.cpp + IBM zDNN

The llama.cpp zDNN backend is designed to enable llama.cpp on IBM z17 and later systems via the IBM zDNN hardware acceleration library.

## Software & Hardware Support

| Hardware Level       | Status        | Verified                   |
| -------------------- | ------------- | -------------------------- |
| IBM z17 / LinuxONE 5 | Supported     | RHEL 9.6, IBM z17, 40 IFLs |
| IBM z16 / LinuxONE 4 | Not Supported |                            |
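
If you are unsure which machine generation you are running on, `/proc/cpuinfo` also reports the machine type. A hedged check (the numeric type printed varies by model):

```sh
# Print the first processor line, which includes "machine = <type>"
grep -m1 machine /proc/cpuinfo
```
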
## Data Types Supported

| Data Type | Status    |
| --------- | --------- |
| F32       | Supported |
| F16       | Supported |
| BF16      | Supported |

## CMake Options

The following CMake options control the behaviour of the IBM zDNN backend.

| CMake Option | Default Value | Description                         |
| ------------ | ------------- | ----------------------------------- |
| `GGML_ZDNN`  | `OFF`         | Compile llama.cpp with zDNN support |
| `ZDNN_ROOT`  | `""`          | Override zDNN library lookup        |
## 1. Install zDNN Library

Note: the zDNN library provided via `apt` or `yum` may not work correctly, as reported in [#15772](https://github.com/ggml-org/llama.cpp/issues/15772). Compiling from source is recommended.
```sh
git clone --recurse-submodules https://github.com/IBM/zDNN
cd zDNN

# Generate the configure script and install under /opt/zdnn-libs
autoreconf .
./configure --prefix=/opt/zdnn-libs

make build
sudo make install
```
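
Since `/opt/zdnn-libs` is usually not on the default linker search path, it can help to confirm the install and make the shared library discoverable at runtime. A hedged sketch (paths assume the `--prefix` used above; the exact layout may differ on your distribution):

```sh
# Inspect the install tree; expect an include/ directory with zdnn.h
# and a lib directory containing libzdnn.so
ls -R /opt/zdnn-libs

# Make the shared library visible to the dynamic linker for this shell
# (adjust lib to lib64 if that is what the install created)
export LD_LIBRARY_PATH=/opt/zdnn-libs/lib:$LD_LIBRARY_PATH
```
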
## 2. Build llama.cpp

```sh
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp

cmake -S . -G Ninja -B build \
    -DCMAKE_BUILD_TYPE=Release \
    -DGGML_ZDNN=ON \
    -DZDNN_ROOT=/opt/zdnn-libs
cmake --build build --config Release -j$(nproc)
```
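
Once the build finishes, you can verify that the zDNN backend was compiled in and runs. A minimal sketch; the model path below is a placeholder for any GGUF model you already have:

```sh
# Print the devices llama.cpp can see, to confirm zDNN is available
build/bin/llama-cli --list-devices

# Short smoke-test generation
build/bin/llama-cli -m /path/to/model.gguf -p "Hello" -n 32
```
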