Commit 8efa4aa

docs: add zDNN docs

Signed-off-by: Aaron Teo <[email protected]>
1 parent 7505ca8

File tree: 2 files changed (+62, -0 lines)

README.md

Lines changed: 1 addition & 0 deletions
```diff
@@ -274,6 +274,7 @@ Instructions for adding support for new models: [HOWTO-add-model.md](docs/develo
 | [Vulkan](docs/build.md#vulkan) | GPU |
 | [CANN](docs/build.md#cann) | Ascend NPU |
 | [OpenCL](docs/backend/OPENCL.md) | Adreno GPU |
+| [IBM zDNN](docs/backend/zDNN.md) | IBM Z & LinuxONE |
 | [WebGPU [In Progress]](docs/build.md#webgpu) | All |
 | [RPC](https://github.com/ggml-org/llama.cpp/tree/master/tools/rpc) | All |
```

docs/backend/zDNN.md

Lines changed: 61 additions & 0 deletions
# llama.cpp for IBM zDNN Accelerator

## Background

IBM zDNN (Z Deep Neural Network) is a hardware acceleration library designed specifically to leverage the IBM NNPA (Neural Network Processing Assist) accelerator located within IBM Telum I and II processors. It provides significant performance improvements for neural network inference operations.
### Llama.cpp + IBM zDNN

The llama.cpp zDNN backend is designed to enable llama.cpp on IBM z17 and later systems via the IBM zDNN hardware acceleration library.
## Software & Hardware Support

| Hardware Level       | Status        | Verified                   |
| -------------------- | ------------- | -------------------------- |
| IBM z17 / LinuxONE 5 | Supported     | RHEL 9.6, IBM z17, 40 IFLs |
| IBM z16 / LinuxONE 4 | Not Supported |                            |
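
Before building, you can sanity-check that the host actually exposes the NNPA facility. This check is not part of the official docs; it assumes Linux on IBM Z, where the `nnpa` flag appears in the `features` line of `/proc/cpuinfo` on Telum machines.

```sh
# Unofficial sanity check: look for the NNPA feature flag (s390x only).
# A match means the processor advertises the Neural Network Processing Assist.
grep -o nnpa /proc/cpuinfo | head -n 1
```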
## Data Types Supported

| Data Type | Status    |
| --------- | --------- |
| F32       | Supported |
| F16       | Supported |
| BF16      | Supported |
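
Because only these float types are accelerated, it can help to export your GGUF in one of them explicitly. A minimal sketch using llama.cpp's own `convert_hf_to_gguf.py` converter; the model path and output filename are placeholders:

```sh
# Convert a Hugging Face model to GGUF in BF16, one of the types
# the zDNN backend supports (paths are illustrative).
python convert_hf_to_gguf.py /path/to/hf-model --outtype bf16 --outfile model-bf16.gguf
```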
## CMake Options

The IBM zDNN backend exposes the following CMake options to control its behaviour; the build invocation in step 2 below shows how they are passed.

| CMake Option | Default Value | Description                         |
| ------------ | ------------- | ----------------------------------- |
| `GGML_ZDNN`  | `OFF`         | Compile llama.cpp with zDNN support |
| `ZDNN_ROOT`  | `""`          | Override zDNN library lookup        |
## 1. Install zDNN Library

Note: the zDNN library packaged for `apt` or `yum` may not work correctly, as reported in [#15772](https://github.com/ggml-org/llama.cpp/issues/15772). Building from source is recommended.
```sh
# Fetch zDNN together with its submodules
git clone --recurse-submodules https://github.com/IBM/zDNN
cd zDNN

# Generate the build scripts and install under /opt/zdnn-libs
autoreconf .
./configure --prefix=/opt/zdnn-libs

make build
sudo make install
```
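
As a quick check that the install landed where the next step expects, list the installed header and libraries; this assumes the standard `include/` and `lib/` layout under the prefix chosen above:

```sh
# The llama.cpp build below resolves zDNN via ZDNN_ROOT=/opt/zdnn-libs,
# so the public header and the libraries should be present there.
ls /opt/zdnn-libs/include/zdnn.h /opt/zdnn-libs/lib
```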
## 2. Build llama.cpp

```sh
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp

# ZDNN_ROOT must point at the prefix zDNN was installed to above
cmake -S . -G Ninja -B build \
    -DCMAKE_BUILD_TYPE=Release \
    -DGGML_ZDNN=ON \
    -DZDNN_ROOT=/opt/zdnn-libs
cmake --build build --config Release -j$(nproc)
```
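
Once the build finishes, the binaries land in `build/bin/`. A minimal smoke test, assuming you already have a GGUF model in one of the supported float types (the model path is a placeholder):

```sh
# Short generation run; the zDNN backend is compiled in via -DGGML_ZDNN=ON.
build/bin/llama-cli -m /path/to/model-bf16.gguf -p "Hello" -n 32
```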
