sycl: Add option to set the SYCL architecture for all targets #10266
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -310,12 +310,14 @@ export CPLUS_INCLUDE_DIR=/path/to/oneMKL/buildWithCublas/include:$CPLUS_INCLUDE_ | |||||
| export CPLUS_INCLUDE_DIR=/path/to/oneMKL/include:$CPLUS_INCLUDE_DIR | ||||||
|
|
||||||
| # Build LLAMA with Nvidia BLAS acceleration through SYCL | ||||||
| # Setting GGML_SYCL_ARCH is optional but can improve performance | ||||||
| GGML_SYCL_ARCH=sm_80 # Example architecture | ||||||
Rbiessy marked this conversation as resolved.
|
||||||
|
|
||||||
| # Option 1: Use FP32 (recommended for better performance in most cases) | ||||||
| cmake -B build -DGGML_SYCL=ON -DGGML_SYCL_TARGET=NVIDIA -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx | ||||||
| cmake -B build -DGGML_SYCL=ON -DGGML_SYCL_TARGET=NVIDIA -DGGML_SYCL_ARCH=${GGML_SYCL_ARCH} -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx | ||||||
|
|
||||||
| # Option 2: Use FP16 | ||||||
| cmake -B build -DGGML_SYCL=ON -DGGML_SYCL_TARGET=NVIDIA -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DGGML_SYCL_F16=ON | ||||||
| cmake -B build -DGGML_SYCL=ON -DGGML_SYCL_TARGET=NVIDIA -DGGML_SYCL_ARCH=${GGML_SYCL_ARCH} -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DGGML_SYCL_F16=ON | ||||||
|
|
||||||
| # Build all binaries | ||||||
| cmake --build build --config Release -j -v | ||||||
|
|
@@ -333,8 +335,9 @@ export CPLUS_INCLUDE_DIR=/path/to/oneMKL/buildWithrocBLAS/include:$CPLUS_INCLUDE | |||||
|
|
||||||
| ## AMD | ||||||
| # Use FP32, FP16 is not supported | ||||||
| # Find your GGML_SYCL_HIP_TARGET with rocminfo, under the key 'Name:' | ||||||
| cmake -B build -DGGML_SYCL=ON -DGGML_SYCL_TARGET=AMD -DGGML_SYCL_HIP_TARGET=${GGML_SYCL_HIP_TARGET} -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx | ||||||
| # Find your GGML_SYCL_ARCH with rocminfo, under the key 'Name:' | ||||||
| GGML_SYCL_ARCH=gfx90a # Example architecture | ||||||
| cmake -B build -DGGML_SYCL=ON -DGGML_SYCL_TARGET=AMD -DGGML_SYCL_ARCH=${GGML_SYCL_ARCH} -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx | ||||||
|
|
||||||
| # Build all binaries | ||||||
| cmake --build build --config Release -j -v | ||||||
|
|
@@ -644,6 +647,7 @@ use 1 SYCL GPUs: [0] with Max compute units:512 | |||||
| |--------------------|---------------------------------------|---------------------------------------------| | ||||||
| | GGML_SYCL | ON (mandatory) | Enable build with SYCL code path.<br>FP32 path - recommended for better performance than FP16 on quantized models| | ||||||
| | GGML_SYCL_TARGET | INTEL *(default)* \| NVIDIA \| AMD | Set the SYCL target device type. | | ||||||
| | GGML_SYCL_ARCH | "" | Set the SYCL target architecture, optional except for AMD. Setting the architecture can improve the performance. See the table [here](https://github.com/intel/llvm/blob/sycl/sycl/doc/design/OffloadDesign.md#--offload-arch) for a list of valid architectures. | | ||||||
|
||||||
| | GGML_SYCL_ARCH | "" | Set the SYCL target architecture, optional except for AMD. Setting the architecture can improve the performance. See the table [here](https://github.com/intel/llvm/blob/sycl/sycl/doc/design/OffloadDesign.md#--offload-arch) for a list of valid architectures. | | |
| | GGML_SYCL_DEVICE_ARCH | Optional (except for AMD) | Set the SYCL device architecture, optional except for AMD. Setting the device architecture can improve the performance. See the table [--offload-arch](https://github.com/intel/llvm/blob/sycl/sycl/doc/design/OffloadDesign.md#--offload-arch) for a list of valid architectures. | |
Sure, that's done in 5430726
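For anyone following the `rocminfo` hint in the AMD instructions, a minimal shell sketch of deriving the architecture string might look like the one below. The helper names `detect_gfx_arch` and `cc_to_sm` are hypothetical and not part of this PR or of llama.cpp. The NVIDIA path additionally assumes a driver recent enough that `nvidia-smi` supports the `compute_cap` query field.

```shell
#!/bin/sh
# Hypothetical helpers (not part of llama.cpp) for picking an architecture string.

# Read rocminfo-style text on stdin and print the first gfx name found,
# e.g. a line such as "  Name:   gfx90a" yields "gfx90a".
detect_gfx_arch() {
    grep -oE 'gfx[0-9a-f]+' | head -n 1
}

# Convert a CUDA compute capability such as "8.0" into the "sm_80" form
# used in the NVIDIA example above.
cc_to_sm() {
    printf 'sm_%s\n' "$(printf '%s' "$1" | tr -d '.')"
}

# Example usage (assumes the vendor tools are installed and on PATH):
#   GGML_SYCL_ARCH=$(rocminfo | detect_gfx_arch)
#   GGML_SYCL_ARCH=$(cc_to_sm "$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)")
```

The resulting value can then be passed to `cmake` through `-DGGML_SYCL_ARCH=...` as in this commit, or through `-DGGML_SYCL_DEVICE_ARCH=...` if the rename suggested in the review is applied. On machines with several different GPUs only the first match is used, so the output is worth checking by hand.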
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -166,6 +166,7 @@ option(GGML_SYCL "ggml: use SYCL" | |||||
| option(GGML_SYCL_F16 "ggml: use 16 bit floats for sycl calculations" OFF) | ||||||
| set (GGML_SYCL_TARGET "INTEL" CACHE STRING | ||||||
| "ggml: sycl target device") | ||||||
| set (GGML_SYCL_ARCH "" CACHE STRING "ggml: sycl architecture") | ||||||
|
||||||
| set (GGML_SYCL_ARCH "" CACHE STRING "ggml: sycl architecture") | |
| set (GGML_SYCL_DEVICE_ARCH "" CACHE STRING "ggml: sycl device architecture") |
Done in 5430726