message(FATAL_ERROR "PLSSVM_OPENMP_BLOCK_SIZE must be an integer greater than 0 but is \"${PLSSVM_OPENMP_BLOCK_SIZE}\"!")
+endif()
+endif()
+
 ## change executable floating points from double precision to single precision
 option(PLSSVM_EXECUTABLES_USE_SINGLE_PRECISION "Build the svm-train and svm-predict executables with single precision instead of double precision." OFF)
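The `FATAL_ERROR` message in the CMake hunk above implies a validation of `PLSSVM_OPENMP_BLOCK_SIZE`, but the condition itself lies outside the visible diff. A minimal sketch of such a check follows, assuming the value is validated only when the user sets it; the `DEFINED` guard and the regular expression are assumptions, not the PR's code.

```cmake
# Sketch only: the guard and the regex are assumptions; only the error message is taken from the diff.
if (DEFINED PLSSVM_OPENMP_BLOCK_SIZE)
    # Reject anything that is not a plain decimal integer, and reject 0.
    if (NOT PLSSVM_OPENMP_BLOCK_SIZE MATCHES "^[0-9]+$" OR NOT PLSSVM_OPENMP_BLOCK_SIZE GREATER 0)
        message(FATAL_ERROR "PLSSVM_OPENMP_BLOCK_SIZE must be an integer greater than 0 but is \"${PLSSVM_OPENMP_BLOCK_SIZE}\"!")
    endif ()
endif ()
```

With this shape, values such as `-DPLSSVM_OPENMP_BLOCK_SIZE=0` or `-DPLSSVM_OPENMP_BLOCK_SIZE=abc` would stop the configuration step with the message shown in the diff.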
README.md: 41 additions & 40 deletions
@@ -3,38 +3,38 @@
 
 Implementation of a parallel [least-squares support-vector machine](https://en.wikipedia.org/wiki/Least-squares_support-vector_machine) using multiple different backends.
 The currently available backends are:
-- [OpenMP](https://www.openmp.org/)
-- [CUDA](https://developer.nvidia.com/cuda-zone)
-- [OpenCL](https://www.khronos.org/opencl/)
-- [SYCL](https://www.khronos.org/sycl/)
+- [OpenMP](https://www.openmp.org/)
+- [CUDA](https://developer.nvidia.com/cuda-zone)
+- [OpenCL](https://www.khronos.org/opencl/)
+- [SYCL](https://www.khronos.org/sycl/)
 
 ## Getting Started
 
 ### Dependencies
 
-General dependencies:
-- a C++17 capable compiler (e.g. [`gcc`](https://gcc.gnu.org/) or [`clang`](https://clang.llvm.org/))
-- [CMake](https://cmake.org/) 3.18 or newer
-- [cxxopts](https://github.com/jarro2783/cxxopts), [fast_float](https://github.com/fastfloat/fast_float) and [{fmt}](https://github.com/fmtlib/fmt) (all three are automatically build during the CMake configuration if they couldn't be found using the respective `find_package` call)
-- [GoogleTest](https://github.com/google/googletest) if testing is enabled (automatically build during the CMake configuration if `find_package(GTest)` wasn't successful)
-- [doxygen](https://www.doxygen.nl/index.html) if documentation generation is enabled
-- [OpenMP](https://www.openmp.org/) 4.0 or newer (optional) to speed-up file parsing
+General dependencies:
+- a C++17 capable compiler (e.g. [`gcc`](https://gcc.gnu.org/) or [`clang`](https://clang.llvm.org/))
+- [CMake](https://cmake.org/) 3.18 or newer
+- [cxxopts](https://github.com/jarro2783/cxxopts), [fast_float](https://github.com/fastfloat/fast_float) and [{fmt}](https://github.com/fmtlib/fmt) (all three are automatically build during the CMake configuration if they couldn't be found using the respective `find_package` call)
+- [GoogleTest](https://github.com/google/googletest) if testing is enabled (automatically build during the CMake configuration if `find_package(GTest)` wasn't successful)
+- [doxygen](https://www.doxygen.nl/index.html) if documentation generation is enabled
+- [OpenMP](https://www.openmp.org/) 4.0 or newer (optional) to speed-up file parsing
 
 Additional dependencies for the OpenMP backend:
-- compiler with OpenMP support
+- compiler with OpenMP support
 
 Additional dependencies for the CUDA backend:
-- CUDA SDK
-- either NVIDIA [`nvcc`](https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html) or [`clang` with CUDA support enabled](https://llvm.org/docs/CompileCudaWithLLVM.html)
+- CUDA SDK
+- either NVIDIA [`nvcc`](https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html) or [`clang` with CUDA support enabled](https://llvm.org/docs/CompileCudaWithLLVM.html)
 
 Additional dependencies for the OpenCL backend:
-- OpenCL runtime and header files
+- OpenCL runtime and header files
 
 Additional dependencies for the SYCL backend:
-- the code must be compiled with a SYCL capable compiler; currently tested with [DPC++](https://github.com/intel/llvm) and [hipSYCL](https://github.com/illuhad/hipSYCL)
+- the code must be compiled with a SYCL capable compiler; currently tested with [DPC++](https://github.com/intel/llvm) and [hipSYCL](https://github.com/illuhad/hipSYCL)
 
 Additional dependencies if `PLSSVM_ENABLE_TESTING` and `PLSSVM_GENERATE_TEST_FILE` are both set to `ON`:
-- [Python3](https://www.python.org/) with the [`argparse`](https://docs.python.org/3/library/argparse.html) and [`sklearn`](https://scikit-learn.org/stable/) modules
+- [Python3](https://www.python.org/) with the [`argparse`](https://docs.python.org/3/library/argparse.html) and [`sklearn`](https://scikit-learn.org/stable/) modules
 
 ### Building
 
@@ -52,10 +52,10 @@ Building the library can be done using the normal CMake approach:
 
 The **required** CMake option `PLSSVM_TARGET_PLATFORMS` is used to determine for which targets the backends should be compiled.
 Valid targets are:
-- `cpu`: compile for the CPU; **no** architectural specifications is allowed
-- `nvidia`: compile for NVIDIA GPUs; **at least one** architectural specification is necessary, e.g. `nvidia:sm_86,sm_70`
-- `amd`: compile for AMD GPUs; **at least one** architectural specification is necessary, e.g. `amd:gfx906`
-- `intel`: compile for Intel GPUs; **no** architectural specification is allowed
+- `cpu`: compile for the CPU; **no** architectural specifications is allowed
+- `nvidia`: compile for NVIDIA GPUs; **at least one** architectural specification is necessary, e.g. `nvidia:sm_86,sm_70`
+- `amd`: compile for AMD GPUs; **at least one** architectural specification is necessary, e.g. `amd:gfx906`
+- `intel`: compile for Intel GPUs; **no** architectural specification is allowed
 
 At least one of the above targets must be present.
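As an illustration of the target syntax described in this hunk, the required option could be preset in a CMake initial-cache file. This is a sketch only: the file name `plssvm-targets.cmake` is hypothetical, and the semicolon separating the individual targets is an assumption, since the separator is not shown in this excerpt.

```cmake
# plssvm-targets.cmake (hypothetical file name)
# Assumption: individual targets are separated by ';' as in a regular CMake list.
# `cpu` takes no architecture; `nvidia` needs at least one, given after ':' and separated by ','.
set(PLSSVM_TARGET_PLATFORMS "cpu;nvidia:sm_86,sm_70" CACHE STRING "Targets the backends are compiled for")
```

Such a file would be loaded with `cmake -C plssvm-targets.cmake <source-dir>`; the same value could equally be passed on the command line via `-DPLSSVM_TARGET_PLATFORMS=...`.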
-`ON`: check for the SYCL backend and fail if not available
+- `AUTO`: check for the SYCL backend but **do not** fail if not available
+- `OFF`: do not check for the SYCL backend
 
 **Attention:** at least one backend must be enabled and available!
 
-- `PLSSVM_ENABLE_ASSERTS=ON|OFF` (default: `OFF`): enables custom assertions regardless whether the `DEBUG` macro is defined or not
-- `PLSSVM_THREAD_BLOCK_SIZE` (default: `16`): set a specific thread block size used in the GPU kernels (for fine-tuning optimizations)
-- `PLSSVM_INTERNAL_BLOCK_SIZE` (default: `6`: set a specific internal block size used in the GPU kernels (for fine-tuning optimizations)
-- `PLSSVM_EXECUTABLES_USE_SINGLE_PRECISION` (default: `OFF`): enables single precision calculations instead of double precision for the `svm-train` and `svm-predict` executables
-- `PLSSVM_ENABLE_LTO=ON|OFF` (default: `ON`): enable interprocedural optimization (IPO/LTO) if supported by the compiler
-- `PLSSVM_ENABLE_DOCUMENTATION=ON|OFF` (default: `OFF`): enable the `doc` target using doxygen
-- `PLSSVM_ENABLE_TESTING=ON|OFF` (default: ON): enable testing using GoogleTest and ctest
+- `PLSSVM_ENABLE_ASSERTS=ON|OFF` (default: `OFF`): enables custom assertions regardless whether the `DEBUG` macro is defined or not
+- `PLSSVM_THREAD_BLOCK_SIZE` (default: `16`): set a specific thread block size used in the GPU kernels (for fine-tuning optimizations)
+- `PLSSVM_INTERNAL_BLOCK_SIZE` (default: `6`: set a specific internal block size used in the GPU kernels (for fine-tuning optimizations)
+- `PLSSVM_EXECUTABLES_USE_SINGLE_PRECISION` (default: `OFF`): enables single precision calculations instead of double precision for the `svm-train` and `svm-predict` executables
+- `PLSSVM_ENABLE_LTO=ON|OFF` (default: `ON`): enable interprocedural optimization (IPO/LTO) if supported by the compiler
+- `PLSSVM_ENABLE_DOCUMENTATION=ON|OFF` (default: `OFF`): enable the `doc` target using doxygen
+- `PLSSVM_ENABLE_TESTING=ON|OFF` (default: `ON`): enable testing using GoogleTest and ctest
+- `PLSSVM_GENERATE_TIMING_SCRIPT=ON|OFF` (default: `OFF`): configure a timing script usable for performance measurement
 
 If `PLSSVM_ENABLE_TESTING` is set to `ON`, the following options can also be set:
-- `PLSSVM_GENERATE_TEST_FILE=ON|OFF` (default: `ON`): automatically generate test files
+- `PLSSVM_GENERATE_TEST_FILE=ON|OFF` (default: `ON`): automatically generate test files
 - `PLSSVM_TEST_FILE_NUM_DATA_POINTS` (default: `5000`): the number of data points in the test file
 - `PLSSVM_TEST_FILE_NUM_FEATURES` (default: `2000`): the number of features per data point
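To make the interplay of the testing options in this hunk concrete, here is a sketch of a CMake initial-cache file (loaded with `cmake -C <file> <source-dir>`) that enables testing with a smaller generated test file. The option names come from the README text above; the chosen sizes and the cache types (`BOOL`, `STRING`) are illustrative assumptions.

```cmake
# Sketch: option names are from the README diff; values and cache types are illustrative assumptions.
set(PLSSVM_ENABLE_ASSERTS ON CACHE BOOL "Enable custom assertions")
set(PLSSVM_ENABLE_TESTING ON CACHE BOOL "Enable testing using GoogleTest and ctest")
# The following options only take effect when PLSSVM_ENABLE_TESTING is ON.
set(PLSSVM_GENERATE_TEST_FILE ON CACHE BOOL "Automatically generate test files")
set(PLSSVM_TEST_FILE_NUM_DATA_POINTS 1000 CACHE STRING "Number of data points in the test file (README default: 5000)")
set(PLSSVM_TEST_FILE_NUM_FEATURES 200 CACHE STRING "Number of features per data point (README default: 2000)")
```

Per the dependency list in the README, having both `PLSSVM_ENABLE_TESTING` and `PLSSVM_GENERATE_TEST_FILE` set to `ON` additionally requires Python3 with the `argparse` and `sklearn` modules.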