Commit 0478174

[SYCL] Updated SYCL device filtering (ggml-org#8901)
* Updated device filter to depend on default_selector (fixes non-intel device issues)
* Small related update to example/sycl Readme
1 parent a8dbc6f commit 0478174
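
The fallback this commit describes (choosing the backend filter from the default selector's platform when `ONEAPI_DEVICE_SELECTOR` is unset) can be sketched without a SYCL runtime. This is a minimal illustration, not the code in `helper.hpp`: `filter_from_default_platform` is a hypothetical helper, and its `is_cpu` parameter stands in for `default_device.is_cpu()`.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper (not part of helper.hpp): maps the name of the platform
// picked by the default selector to a backend filter string.
// `is_cpu` is an assumption standing in for sycl::device::is_cpu().
std::string filter_from_default_platform(const std::string& platform_name, bool is_cpu) {
    // A Level-Zero platform, or a CPU default device, selects "level-zero".
    if (std::strstr(platform_name.c_str(), "Level-Zero") || is_cpu) {
        return "level-zero";
    }
    if (std::strstr(platform_name.c_str(), "CUDA")) {
        return "cuda";
    }
    if (std::strstr(platform_name.c_str(), "HIP")) {
        return "hip";
    }
    // No match: the filter stays empty, as it does in the commit.
    return "";
}
```

With an illustrative platform name such as `"NVIDIA CUDA BACKEND"` and `is_cpu == false`, the helper returns `"cuda"`; an unrecognized name on a non-CPU device yields an empty filter.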

2 files changed: +25 -18 lines changed

examples/sycl/README.md

Lines changed: 9 additions & 15 deletions
@@ -12,9 +12,9 @@ This example program provides the tools for llama.cpp for SYCL on Intel GPU.
 
 List all SYCL devices with ID, compute capability, max work group size, ect.
 
-1. Build the llama.cpp for SYCL for all targets.
+1. Build the llama.cpp for SYCL for the specified target *(using GGML_SYCL_TARGET)*.
 
-2. Enable oneAPI running environment
+2. Enable oneAPI running environment *(if GGML_SYCL_TARGET is set to INTEL -default-)*
 
 ```
 source /opt/intel/oneapi/setvars.sh
@@ -29,19 +29,13 @@ source /opt/intel/oneapi/setvars.sh
 Check the ID in startup log, like:
 
 ```
-found 4 SYCL devices:
-Device 0: Intel(R) Arc(TM) A770 Graphics, compute capability 1.3,
-max compute_units 512, max work group size 1024, max sub group size 32, global mem size 16225243136
-Device 1: Intel(R) FPGA Emulation Device, compute capability 1.2,
-max compute_units 24, max work group size 67108864, max sub group size 64, global mem size 67065057280
-Device 2: 13th Gen Intel(R) Core(TM) i7-13700K, compute capability 3.0,
-max compute_units 24, max work group size 8192, max sub group size 64, global mem size 67065057280
-Device 3: Intel(R) Arc(TM) A770 Graphics, compute capability 3.0,
-max compute_units 512, max work group size 1024, max sub group size 32, global mem size 16225243136
+found 2 SYCL devices:
+|  |                   |                                       |       |Max    |        |Max  |Global |                     |
+|  |                   |                                       |       |compute|Max work|sub  |mem    |                     |
+|ID|        Device Type|                                   Name|Version|units  |group   |group|size   |       Driver version|
+|--|-------------------|---------------------------------------|-------|-------|--------|-----|-------|---------------------|
+| 0| [level_zero:gpu:0]|                Intel Arc A770 Graphics|    1.3|    512|    1024|   32| 16225M|            1.3.29138|
+| 1| [level_zero:gpu:1]|                 Intel UHD Graphics 750|    1.3|     32|     512|   32| 62631M|            1.3.29138|
 
 ```
 
-|Attribute|Note|
-|-|-|
-|compute capability 1.3|Level-zero running time, recommended |
-|compute capability 3.0|OpenCL running time, slower than level-zero in most cases|

ggml/src/ggml-sycl/dpct/helper.hpp

Lines changed: 16 additions & 3 deletions
@@ -874,7 +874,7 @@ namespace dpct
     inline std::string get_preferred_gpu_platform_name() {
         std::string result;
 
-        std::string filter = "level-zero";
+        std::string filter = "";
         char* env = getenv("ONEAPI_DEVICE_SELECTOR");
         if (env) {
             if (std::strstr(env, "level_zero")) {
@@ -892,11 +892,24 @@ namespace dpct
             else {
                 throw std::runtime_error("invalid device filter: " + std::string(env));
             }
+        } else {
+            auto default_device = sycl::device(sycl::default_selector_v);
+            auto default_platform_name = default_device.get_platform().get_info<sycl::info::platform::name>();
+
+            if (std::strstr(default_platform_name.c_str(), "Level-Zero") || default_device.is_cpu()) {
+                filter = "level-zero";
+            }
+            else if (std::strstr(default_platform_name.c_str(), "CUDA")) {
+                filter = "cuda";
+            }
+            else if (std::strstr(default_platform_name.c_str(), "HIP")) {
+                filter = "hip";
+            }
         }
 
-        auto plaform_list = sycl::platform::get_platforms();
+        auto platform_list = sycl::platform::get_platforms();
 
-        for (const auto& platform : plaform_list) {
+        for (const auto& platform : platform_list) {
             auto devices = platform.get_devices();
             auto gpu_dev = std::find_if(devices.begin(), devices.end(), [](const sycl::device& d) {
                 return d.is_gpu();
