Building the program with BLAS support may lead to some performance improvements in prompt processing when using batch sizes higher than 32 (the default is 512). Using BLAS doesn't affect generation performance. There are currently several BLAS implementations available to build and use:

### Accelerate Framework

This is only available on macOS, and it's enabled by default. You can just build using the normal instructions.
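
A minimal sketch of that default build, assuming the standard CMake flow this guide uses elsewhere:

```bash
# Accelerate is picked up automatically on macOS; no extra flags are needed.
cmake -B build
cmake --build build --config Release
```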

### OpenBLAS

This provides BLAS acceleration using only the CPU. Make sure to have OpenBLAS installed on your machine.
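
For example, a sketch of an OpenBLAS build, assuming the `GGML_BLAS` switch that pairs with the `GGML_BLAS_VENDOR` option described below:

```bash
# GGML_BLAS and GGML_BLAS_VENDOR are assumed CMake options; the vendor selects OpenBLAS.
cmake -B build -DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS
cmake --build build --config Release
```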

### BLIS

Check [BLIS.md](./backend/BLIS.md) for more information.

### Intel oneMKL

Building through oneAPI compilers will make the avx_vnni instruction set available for Intel processors that do not support avx512 and avx512_vnni. Please note that this build config **does not support Intel GPU**. For Intel GPU support, please refer to [llama.cpp for SYCL](./backend/SYCL.md).
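
A sketch of a oneMKL-backed build under those compilers; `Intel10_64lp` is CMake's FindBLAS vendor name for MKL, and the `setvars.sh` path assumes a default oneAPI install:

```bash
# Assumes an installed oneAPI toolkit; icx/icpx are the oneAPI compilers.
source /opt/intel/oneapi/setvars.sh
cmake -B build -DGGML_BLAS=ON -DGGML_BLAS_VENDOR=Intel10_64lp -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx
cmake --build build --config Release
```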

Check [Optimizing and Running LLaMA2 on Intel® CPU](https://www.intel.com/content/www/us/en/content-details/791610/optimizing-and-running-llama2-on-intel-cpu.html) for more information.

### Other BLAS libraries

Any other BLAS library can be used by setting the `GGML_BLAS_VENDOR` option. See the [CMake documentation](https://cmake.org/cmake/help/latest/module/FindBLAS.html#blas-lapack-vendors) for a list of supported vendors.
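
For instance, a sketch of building against BLIS through CMake's generic BLAS detection (`FLAME` is FindBLAS's vendor name for BLIS):

```bash
# Any vendor from the FindBLAS list can be substituted for FLAME.
cmake -B build -DGGML_BLAS=ON -DGGML_BLAS_VENDOR=FLAME
cmake --build build --config Release
```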

## SYCL

SYCL is a higher-level programming model to improve programming productivity on various hardware accelerators.

The SYCL build of llama.cpp is used to **support Intel GPUs** (Data Center Max series, Flex series, Arc series, built-in GPUs, and iGPUs).

For detailed info, please refer to [llama.cpp for SYCL](./backend/SYCL.md).
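
As a rough sketch, a SYCL build with the oneAPI compilers might look like the following; the `GGML_SYCL` option and compiler choices are assumptions drawn from that document's approach:

```bash
# Assumes an installed oneAPI toolkit and the GGML_SYCL CMake option.
source /opt/intel/oneapi/setvars.sh
cmake -B build -DGGML_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx
cmake --build build --config Release
```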

## CUDA

This provides GPU acceleration using an NVIDIA GPU. Make sure to have the CUDA toolkit installed. You can download it from your Linux distro's package manager (e.g. `apt install nvidia-cuda-toolkit`) or from here: [CUDA Toolkit](https://developer.nvidia.com/cuda-downloads).
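
A sketch of the corresponding build, assuming the `GGML_CUDA` option used by recent versions of the project:

```bash
# GGML_CUDA is assumed here; verify the option name against your checkout.
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release
```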