 - [SYCL](https://www.khronos.org/sycl/) (supported implementations are [DPC++](https://github.com/intel/llvm) and [AdaptiveCpp](https://github.com/AdaptiveCpp/AdaptiveCpp) (formerly known as hipSYCL); specifically the versions [sycl-nightly/20231201](https://github.com/intel/llvm/tree/sycl-nightly/20230110) and AdaptiveCpp release [v24.06.0](https://github.com/AdaptiveCpp/AdaptiveCpp/releases/tag/v23.10.0))
+- [Kokkos](https://github.com/kokkos/kokkos) (all execution spaces supported except `OpenMPTarget` and `OpenACC`); specifically the version [4.5.00](https://github.com/kokkos/kokkos/releases/tag/4.5.00)
 3. Six different kernel functions to be able to classify a large variety of different problems:
@@ -128,6 +129,10 @@ Additional dependencies for the SYCL backend:
 
 - the code must be compiled with a SYCL capable compiler; currently supported are [DPC++](https://github.com/intel/llvm) and [AdaptiveCpp](https://github.com/AdaptiveCpp/AdaptiveCpp)
 
+Additional dependencies for the Kokkos backend:
+
+- a Kokkos installation with the respective execution spaces enabled; currently all execution spaces are supported except `OpenMPTarget` and `OpenACC`
+
 Additional dependencies for the stdpar backend:
 
 - the code must be compiled with a stdpar capable compiler; currently supported are [nvc++](https://developer.nvidia.com/hpc-sdk), [roc-stdpar](https://github.com/ROCm/roc-stdpar), [icpx](https://www.intel.com/content/www/us/en/developer/tools/oneapi/dpc-compiler.html), [AdaptiveCpp](https://github.com/AdaptiveCpp/AdaptiveCpp), and [GNU GCC](https://gcc.gnu.org/)
@@ -243,6 +248,11 @@ The `[optional_options]` can be one or multiple of:
 - `AUTO`: check for the OpenMP backend but **do not** fail if not available
+- `ON`: check for the Kokkos backend and fail if not available
+- `AUTO`: check for the Kokkos backend but **do not** fail if not available
+- `OFF`: do not check for the Kokkos backend
+
 **Attention:** at least one backend must be enabled and available!
 
 - `PLSSVM_ENABLE_FAST_MATH=ON|OFF` (default depending on `CMAKE_BUILD_TYPE`: `ON` for Release or RelWithDebInfo, `OFF` otherwise): enable `fast-math` compiler flags for all backends
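The `ON`/`AUTO`/`OFF` semantics listed above, together with the requirement that at least one backend is enabled and available, can be modelled in a few lines. The following is only an illustrative Python sketch, not PLSSVM's CMake logic; the helper names `resolve_backend` and `resolve_all` and the exception types are hypothetical.

```python
def resolve_backend(name, option, found):
    """Decide whether a backend gets enabled (hypothetical helper).

    option: "ON", "AUTO", or "OFF"; found: whether the backend was detected.
    """
    if option == "OFF":
        return False  # OFF: the backend is not checked at all
    if option == "AUTO":
        return found  # AUTO: checked, but silently skipped if missing
    if option == "ON":
        if not found:
            # ON: a missing backend is a hard error
            raise RuntimeError(f"{name} backend requested but not available")
        return True
    raise ValueError(f"unknown option value: {option!r}")


def resolve_all(options, found):
    """Resolve every backend and enforce the 'at least one backend' rule."""
    enabled = [
        name
        for name, option in options.items()
        if resolve_backend(name, option, found.get(name, False))
    ]
    if not enabled:
        raise RuntimeError("at least one backend must be enabled and available")
    return enabled
```

For example, `resolve_all({"openmp": "AUTO", "kokkos": "ON"}, {"kokkos": True})` enables only Kokkos, while setting every backend to `OFF` raises an error.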
@@ -344,6 +359,10 @@ If more than one SYCL implementation is available the environment variables `PLS
 
 - `PLSSVM_SYCL_BACKEND_PREFERRED_IMPLEMENTATION` (`dpcpp`|`adaptivecpp`): specify the preferred SYCL implementation if the `sycl_implementation_type` option is set to `automatic`; additionally, the specified SYCL implementation is used in the `plssvm::sycl` namespace, while the other implementations are available in the `plssvm::dpcpp` and `plssvm::adaptivecpp` namespaces, respectively
 
+If the Kokkos backend is available, the following additional option is available (**note**: this option only takes effect if the Kokkos SYCL execution space is available):
+
+- `PLSSVM_KOKKOS_BACKEND_INTEL_LLVM_ENABLE_AOT` (default: `ON`): enable Ahead-of-Time (AOT) compilation for the specified target platforms
+
 If the stdpar backend is available, an additional option can be set.
 
 - `PLSSVM_STDPAR_BACKEND_IMPLEMENTATION` (default: `AUTO`): explicitly specify the used stdpar implementation; must be one of: `AUTO`, `NVHPC`, `roc-stdpar`, `IntelLLVM`, `ACPP`, `GNU_TBB`.
"all_python" - All available backends + Python bindings
+"all_test" - All available backends tests
 ```
 
 With these presets, building and testing, e.g., our CUDA backend is as simple as typing (in the PLSSVM root directory):
@@ -553,12 +575,14 @@ Usage:
   -i, --max_iter arg            set the maximum number of CG iterations (default: num_features)
   -l, --solver arg              choose the solver: automatic|cg_explicit|cg_implicit (default: automatic)
   -a, --classification arg      the classification strategy to use for multi-class classification: oaa|oao (default: oaa)
-  -b, --backend arg             choose the backend: automatic|openmp|hpx|cuda|hip|opencl|sycl|stdpar (default: automatic)
+  -b, --backend arg             choose the backend: automatic|openmp|hpx|cuda|hip|opencl|sycl|kokkos|stdpar (default: automatic)
   -p, --target_platform arg     choose the target platform: automatic|cpu|gpu_nvidia|gpu_amd|gpu_intel (default: automatic)
       --sycl_kernel_invocation_type arg
                                 choose the kernel invocation type when using SYCL as backend: automatic|nd_range (default: automatic)
       --sycl_implementation_type arg
                                 choose the SYCL implementation to be used in the SYCL backend: automatic|dpcpp|adaptivecpp (default: automatic)
+      --kokkos_execution_space arg
+                                choose the Kokkos execution space to be used in the Kokkos backend: automatic|Cuda|OpenMP|Serial (default: automatic)
       --performance_tracking arg
                                 the output YAML file where the performance tracking results are written to; if not provided, the results are dumped to stderr
       --use_strings_as_labels   use strings as labels instead of plane numbers
@@ -594,10 +618,10 @@ Another example targeting NVIDIA GPUs using the SYCL backend looks like:
 
 The `--backend=automatic` option works as follows:
 
-- if the `gpu_nvidia` target is available, check for existing backends in order `cuda` 🠦 `hip` 🠦 `opencl` 🠦 `sycl` 🠦 `stdpar`
-- otherwise, if the `gpu_amd` target is available, check for existing backends in order `hip` 🠦 `opencl` 🠦 `sycl` 🠦 `stdpar`
-- otherwise, if the `gpu_intel` target is available, check for existing backends in order `sycl` 🠦 `opencl` 🠦 `stdpar`
-- otherwise, if the `cpu` target is available, check for existing backends in order `sycl` 🠦 `opencl` 🠦 `openmp` 🠦 `hpx` 🠦 `stdpar`
+- if the `gpu_nvidia` target is available, check for existing backends in order `cuda` 🠦 `hip` 🠦 `opencl` 🠦 `sycl` 🠦 `kokkos` 🠦 `stdpar`
+- otherwise, if the `gpu_amd` target is available, check for existing backends in order `hip` 🠦 `opencl` 🠦 `sycl` 🠦 `kokkos` 🠦 `stdpar`
+- otherwise, if the `gpu_intel` target is available, check for existing backends in order `sycl` 🠦 `opencl` 🠦 `kokkos` 🠦 `stdpar`
+- otherwise, if the `cpu` target is available, check for existing backends in order `sycl` 🠦 `kokkos` 🠦 `opencl` 🠦 `openmp` 🠦 `hpx` 🠦 `stdpar`
 
 Note that during CMake configuration it is guaranteed that at least one of the above combinations does exist.
 
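The updated `--backend=automatic` resolution order (the added lines above) can be expressed as a small lookup. This is a minimal Python sketch for illustration, not the library's implementation; the names `BACKEND_ORDER` and `select_backend` are hypothetical. It also assumes the search falls through to the next target platform when none of a platform's backends exists, which the text does not state explicitly.

```python
# Preference order per target platform, copied from the list above.
BACKEND_ORDER = {
    "gpu_nvidia": ["cuda", "hip", "opencl", "sycl", "kokkos", "stdpar"],
    "gpu_amd": ["hip", "opencl", "sycl", "kokkos", "stdpar"],
    "gpu_intel": ["sycl", "opencl", "kokkos", "stdpar"],
    "cpu": ["sycl", "kokkos", "opencl", "openmp", "hpx", "stdpar"],
}


def select_backend(available_targets, available_backends):
    """Return the first (target, backend) pair matching the documented order.

    Returns None if no combination exists (hypothetical behavior).
    """
    # Dicts preserve insertion order, so targets are tried NVIDIA -> AMD -> Intel -> CPU.
    for target, order in BACKEND_ORDER.items():
        if target not in available_targets:
            continue
        for backend in order:
            if backend in available_backends:
                return target, backend
        # ASSUMPTION: no backend for this target, so try the next target.
    return None
```

With only Kokkos and OpenMP built, `select_backend({"gpu_nvidia", "cpu"}, {"kokkos", "openmp"})` picks Kokkos on the NVIDIA GPU, since `kokkos` appears in the `gpu_nvidia` list before the CPU targets are considered.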
@@ -609,11 +633,13 @@ The `--target_platform=automatic` option works for the different backends as fol
 - `HIP`: always selects an AMD GPU (if no AMD GPU is available, throws an exception)
 - `OpenCL`: tries to find available devices in the following order: NVIDIA GPUs 🠦 AMD GPUs 🠦 Intel GPUs 🠦 CPU
 - `SYCL`: tries to find available devices in the following order: NVIDIA GPUs 🠦 AMD GPUs 🠦 Intel GPUs 🠦 CPU
+- `Kokkos`: checks which execution spaces are available and which target platforms they support and then tries to find available devices in the following order: NVIDIA GPUs 🠦 AMD GPUs 🠦 Intel GPUs 🠦 CPU
 - `stdpar`: target device must be selected at compile time (using `PLSSVM_TARGET_PLATFORMS`) or using environment variables at runtime
 
 The `--sycl_kernel_invocation_type` and `--sycl_implementation_type` flags are only used if the `--backend` is `sycl`; otherwise, a warning is emitted on `stderr`.
 If the `--sycl_kernel_invocation_type` is `automatic`, the `nd_range` invocation type is currently always used.
 If the `--sycl_implementation_type` is `automatic`, the used SYCL implementation is determined by the `PLSSVM_SYCL_BACKEND_PREFERRED_IMPLEMENTATION` CMake flag.
+If the `--kokkos_execution_space` is `automatic`, the execution space that best fits the provided and/or available target platforms is used.
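The automatic Kokkos selection described above can be sketched as a two-level search: walk the device preference order and pick the first execution space that supports the platform. This Python snippet is only an illustration. The mapping from execution spaces to target platforms is an assumption made for the example (the text does not spell it out), and `SUPPORTED_PLATFORMS` and `select_kokkos` are hypothetical names.

```python
# Device preference order from the text: NVIDIA GPUs -> AMD GPUs -> Intel GPUs -> CPU.
PLATFORM_ORDER = ["gpu_nvidia", "gpu_amd", "gpu_intel", "cpu"]

# ASSUMPTION: illustrative mapping of Kokkos execution spaces to target platforms;
# the real backend may support more spaces and more platforms per space.
SUPPORTED_PLATFORMS = {
    "Cuda": ["gpu_nvidia"],
    "HIP": ["gpu_amd"],
    "SYCL": ["gpu_intel"],
    "OpenMP": ["cpu"],
    "Serial": ["cpu"],
}


def select_kokkos(available_spaces, available_devices):
    """Pick a (target platform, execution space) pair.

    available_spaces: execution spaces enabled in the Kokkos installation,
                      in order of preference.
    available_devices: target platforms for which a device was found.
    """
    for platform in PLATFORM_ORDER:
        if platform not in available_devices:
            continue
        for space in available_spaces:
            if platform in SUPPORTED_PLATFORMS.get(space, []):
                return platform, space
    raise RuntimeError("no usable Kokkos execution space/device combination")
```

A Kokkos build with only `OpenMP` and `Serial` enabled would, under these assumptions, end up on the CPU even if an NVIDIA GPU is present, because no enabled execution space supports that platform.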
 
 ### Predicting using `plssvm-predict`
 
@@ -628,10 +654,12 @@ LS-SVM with multiple (GPU-)backends