Libraries/oneDPL/pSTL_offload/README.md
+17 -4 (17 additions & 4 deletions)
@@ -14,12 +14,13 @@ The `pSTL_offload` sample demonstrates the offloading of C++ standard parallel a
Offloading the C++ standard parallel STL code (`par_unseq` policy) to GPU and CPU without any code changes when using the `-fsycl-pstl-offload` compiler option with the Intel® oneAPI DPC++/C++ Compiler. It is an experimental feature of oneDPL.

-This folder contains two sample examples in the following folders:
+This folder contains three samples in the following folders:

| Folder Name | Description
|:--- |:---
| `FileWordCount` | Counting Words in Files Example
| `WordCount` | Counting Words generated Example
+| `ParSTLTests` | Examples of Various STL Algorithms with Execution Policies

> **Note**: For more information refer to [Get Started with Parallel STL](https://www.intel.com/content/www/us/en/developer/articles/guide/get-started-with-parallel-stl.html).
@@ -34,8 +35,8 @@ This folder contains two sample examples in the following folders:
## Key Implementation Details

-The example includes two samples `FileWordCount` and `WordCount` which count the number of words in files and the number of words generated respectively using the standard C++17 Parallel Algorithm [transfor_reduce](https://en.cppreference.com/w/cpp/algorithm/transform_reduce). This computation can be offloaded to the GPU device with the help of `-fsycl-pstl-offload` compiler option and standard `<algorithm>` header inclusion is explicitly required for PSTL Offload to work.
-FileWordCount sample also demonstrates the use of transform, copy, copy_if, and for_each standard C++17 Parallel Algorithms.
+The example includes three samples: `FileWordCount`, `WordCount`, and `ParSTLTests`. `FileWordCount` and `WordCount` count the number of words in files and the number of generated words, respectively, using the standard C++17 parallel algorithm [transform_reduce](https://en.cppreference.com/w/cpp/algorithm/transform_reduce). `ParSTLTests` demonstrates the use of various STL algorithms with different execution policies (`seq`, `par`, `par_unseq`). It applies these algorithms to large datasets and prints the results of each execution. These computations can be offloaded to the GPU device with the `-fsycl-pstl-offload` compiler option; the standard `<algorithm>` header must be included explicitly for PSTL Offload to work.
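
As an illustration of the word-counting idea described above, here is a minimal sketch built on `std::transform_reduce`. It is not taken from the `FileWordCount` or `WordCount` sources: the names `count_words` and `is_space` are hypothetical, and the sketch keeps the text in a `std::vector<char>` on the assumption that heap-allocated data is what the offload path expects. A word is counted wherever a non-space character follows a space.

```cpp
#include <algorithm>   // included explicitly, as the README requires for PSTL Offload
#include <cstddef>
#include <execution>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical helper: treats blanks, tabs, and newlines as separators.
inline bool is_space(char c) { return c == ' ' || c == '\t' || c == '\n'; }

// Hypothetical function: counts words in `text` under the given execution policy.
template <class Policy>
std::size_t count_words(const Policy& policy, const std::vector<char>& text) {
    if (text.empty()) return 0;

    // Pair every character with its predecessor and count the positions
    // where a separator is followed by a non-separator (a word start).
    const std::size_t starts = std::transform_reduce(
        policy,
        text.begin(), text.end() - 1,  // previous characters
        text.begin() + 1,              // current characters
        std::size_t{0},
        std::plus<>{},
        [](char prev, char cur) -> std::size_t {
            return (is_space(prev) && !is_space(cur)) ? 1 : 0;
        });

    // The first character starts a word if it is not a separator.
    return starts + (is_space(text.front()) ? 0 : 1);
}
```

Calling `count_words(std::execution::par_unseq, text)` is the form that would be eligible for offload when the code is built with `-fsycl-pstl-offload`; passing `std::execution::seq` runs the same logic sequentially on the host.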
+`FileWordCount` also demonstrates the `transform`, `copy`, `copy_if`, and `for_each` standard C++17 parallel algorithms. `ParSTLTests` uses STL algorithms such as `reduce`, `accumulate`, `find`, `copy_if`, `inclusive_scan`, `min_element`, `max_element`, `minmax_element`, `is_partitioned`, `lexicographical_compare`, `binary_search`, `lower_bound`, and `upper_bound`. These algorithms perform tasks such as summing elements, finding values, copying elements that satisfy a condition, computing prefix sums, and searching within large datasets.
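
To make the list above concrete, the following sketch applies a few of the named algorithms with a caller-supplied execution policy. It is an illustration only, not the `ParSTLTests` source: the `Summary` struct and the `summarize` function are hypothetical names. Note that `std::accumulate`, `std::binary_search`, `std::lower_bound`, and `std::upper_bound` have no execution-policy overloads in standard C++, so the sketch leaves them out.

```cpp
#include <algorithm>   // included explicitly, as the README requires for PSTL Offload
#include <cstddef>
#include <execution>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical result type collecting what each algorithm computed.
struct Summary {
    long long sum = 0;          // std::reduce
    int min = 0;                // std::minmax_element
    int max = 0;                // std::minmax_element
    long long last_prefix = 0;  // last element of std::inclusive_scan
    std::size_t even_count = 0; // number of elements kept by std::copy_if
};

// Hypothetical function: runs several policy-aware algorithms over `data`.
template <class Policy>
Summary summarize(const Policy& policy, const std::vector<int>& data) {
    Summary s;
    if (data.empty()) return s;

    // Sum all elements, accumulating in long long to avoid int overflow.
    s.sum = std::reduce(policy, data.begin(), data.end(), 0LL);

    // Find the smallest and largest elements in one pass.
    const auto [lo, hi] = std::minmax_element(policy, data.begin(), data.end());
    s.min = *lo;
    s.max = *hi;

    // Running totals; the last entry equals the overall sum.
    std::vector<long long> prefix(data.size());
    std::inclusive_scan(policy, data.begin(), data.end(), prefix.begin(),
                        std::plus<>{}, 0LL);
    s.last_prefix = prefix.back();

    // Copy only the even values; the destination is pre-sized for the worst case.
    std::vector<int> evens(data.size());
    const auto evens_end = std::copy_if(policy, data.begin(), data.end(),
                                        evens.begin(),
                                        [](int v) { return v % 2 == 0; });
    s.even_count = static_cast<std::size_t>(evens_end - evens.begin());
    return s;
}
```

Passing `std::execution::seq`, `std::execution::par`, or `std::execution::par_unseq` selects the policy; per the README, only the `par_unseq` calls are candidates for offload.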

The `-fsycl-pstl-offload` option enables offloading of C++ standard parallel algorithms to a SYCL device, but only for calls made with the `std::execution::par_unseq` policy. The offloaded algorithms are implemented via the oneAPI Data Parallel C++ Library (oneDPL). This option is an experimental feature. If no argument is passed to the option, the compiler offloads to the default SYCL device.

The performance of memory allocations may be improved by using the `SYCL_PI_LEVEL_ZERO_USM_ALLOCATOR` environment variable.
@@ -106,7 +107,19 @@ When working with the command-line interface (CLI), you should configure the one
$ make run_fwc1 # for PAR Policy
$ unset ONEAPI_DEVICE_SELECTOR
```
-
+Run `pSTL_offload-ParSTLTest` on GPU.
+```
+$ export ONEAPI_DEVICE_SELECTOR=level_zero:gpu
+$ make
+$ unset ONEAPI_DEVICE_SELECTOR
+```
+Run `pSTL_offload-ParSTLTest` on CPU.
+```
+$ export ONEAPI_DEVICE_SELECTOR=*:cpu
+$ make
+$ unset ONEAPI_DEVICE_SELECTOR
+```
+
#### Troubleshooting
If an error occurs, you can get more details by running `make` with the `VERBOSE=1` argument: