Skip to content

Commit 04ed90f

Browse files
authored
[SYCL][Doc] Add proposed range_type extension (#15962)
This extension proposes a new kernel property that allows developers to declare the range requirements of individual kernels, providing more fine-grained control than existing compiler options and improved error behavior. --------- Signed-off-by: John Pennycook <[email protected]>
1 parent b37d0d4 commit 04ed90f

File tree

1 file changed

+211
-0
lines changed

1 file changed

+211
-0
lines changed
Lines changed: 211 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,211 @@
1+
= sycl_ext_oneapi_range_type
2+
3+
:source-highlighter: coderay
4+
:coderay-linenums-mode: table
5+
6+
// This section needs to be after the document title.
7+
:doctype: book
8+
:toc2:
9+
:toc: left
10+
:encoding: utf-8
11+
:lang: en
12+
:dpcpp: pass:[DPC++]
13+
:endnote: &#8212;{nbsp}end{nbsp}note
14+
15+
// Set the default source code type in this document to C++,
16+
// for syntax highlighting purposes. This is needed because
17+
// docbook uses c++ and html5 uses cpp.
18+
:language: {basebackend@docbook:c++:cpp}
19+
20+
21+
== Notice
22+
23+
[%hardbreaks]
24+
Copyright (C) 2024 Intel Corporation. All rights reserved.
25+
26+
Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks
27+
of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by
28+
permission by Khronos.
29+
30+
31+
== Contact
32+
33+
To report problems with this extension, please open a new issue at:
34+
35+
https://github.com/intel/llvm/issues
36+
37+
38+
== Dependencies
39+
40+
This extension is written against the SYCL 2020 revision 9 specification. All
41+
references below to the "core SYCL specification" or to section numbers in the
42+
SYCL specification refer to that revision.
43+
44+
This extension also depends on the following other SYCL extensions:
45+
46+
* link:../experimental/sycl_ext_oneapi_kernel_properties.asciidoc[
47+
sycl_ext_oneapi_kernel_properties]
48+
49+
50+
== Status
51+
52+
This is a proposed extension specification, intended to gather community
53+
feedback. Interfaces defined in this specification may not be implemented yet
54+
or may be in a preliminary state. The specification itself may also change in
55+
incompatible ways before it is finalized. *Shipping software products should
56+
not rely on APIs defined in this specification.*
57+
58+
59+
== Overview
60+
61+
The maximum number of work-items that can be launched in a single kernel
62+
depends on multiple factors.
63+
SYCL 2020 says that the total number of work-items must be representable as a
64+
`size_t`, but several implementations (including {dpcpp}) provide optimization
65+
options to assert that kernels will not require the full range of a `size_t`.
66+
67+
This extension proposes a new kernel property that allows developers to declare
68+
the range requirements of individual kernels, providing more fine-grained
69+
control than existing compiler options and improved error behavior.
70+
71+
The property described in this extension is an advanced feature that most
72+
applications should not need to use.
73+
In most cases, applications get the best performance without using this
74+
property.
75+
76+
77+
== Specification
78+
79+
=== Feature test macro
80+
81+
This extension provides a feature-test macro as described in the core SYCL
82+
specification. An implementation supporting this extension must predefine the
83+
macro `SYCL_EXT_ONEAPI_RANGE_TYPE` to one of the values defined in the table
84+
below. Applications can test for the existence of this macro to determine if
85+
the implementation supports this feature, or applications can test the macro's
86+
value to determine which of the extension's features the implementation
87+
supports.
88+
89+
[%header,cols="1,5"]
90+
|===
91+
|Value
92+
|Description
93+
94+
|1
95+
|The APIs of this experimental extension are not versioned, so the
96+
feature-test macro always has this value.
97+
|===
98+
99+
=== New property
100+
101+
```c++
102+
namespace sycl::ext::oneapi::experimental {
103+
104+
struct range_type_key {
105+
template <typename T>
106+
using value_t = property_value<range_type_key, T>;
107+
};
108+
109+
template <typename T>
110+
inline constexpr range_type_key::value_t<T> range_type;
111+
112+
} // namespace sycl::ext::oneapi::experimental
113+
```
114+
115+
|===
116+
|Property|Description
117+
118+
|`range_type`
119+
|The `range_type` property is an assertion by the application that the kernel
120+
will never be launched with more than `std::numeric_limits<T>::max()`
121+
work-items.
122+
If the kernel is launched with more than this many work-items, the
123+
implementation must throw a synchronous `exception` with the `errc::nd_range`
124+
error code.
125+
126+
`T` must be an integral type.
127+
128+
|===
129+
130+
This property can also be associated with a device function using the
131+
`SYCL_EXT_ONEAPI_FUNCTION_PROPERTY` macro.
132+
133+
There are special requirements whenever a device function defined in one
134+
translation unit makes a call to a device function that is defined in a second
135+
translation unit.
136+
In such a case, the second device function is always declared using
137+
`SYCL_EXTERNAL`.
138+
If the kernel calling these device functions is defined using a `range_type`
139+
property, the functions declared using `SYCL_EXTERNAL` must be similarly
140+
decorated to ensure that a compatible `range_type` is used.
141+
This decoration must exist in both the translation unit making the call and
142+
also in the translation unit that defines the function.
143+
If the `range_type` property is missing in the translation unit that makes the
144+
call, or if the `range_type` of the called function is not compatible with the
145+
`range_type` of the calling function, the program is ill-formed and the
146+
compiler must raise a diagnostic.
147+
Two `range_type` properties are considered compatible if all values that can be
148+
represented by the `range_type` of the caller function can be represented by
149+
the `range_type` of the called function.
150+
151+
== Usage example
152+
153+
```c++
154+
namespace syclex = sycl::ext::oneapi::experimental;
155+
156+
struct SmallKernel
157+
{
158+
// Declare that this kernel supports at most 2^31-1 work-items.
159+
auto get(syclex::properties_tag) const {
160+
return syclex::properties{syclex::range_type<int>};
161+
}
162+
};
163+
164+
...
165+
166+
// Throws an exception with errc::nd_range error code.
167+
// (because 2147483648 > 2147483647)
168+
q.parallel_for(2147483648, SmallKernel());
169+
```
170+
171+
== Interaction with the {dpcpp} "-fsycl-id-queries-fit-in-int" option
172+
173+
The `-fsycl-id-queries-fit-in-int` option is specific to the {dpcpp}
174+
implementation.
175+
Therefore, this section that describes the interaction between this extension
176+
and that option is non-normative and does not apply to other SYCL
177+
implementations that may support this extension.
178+
179+
If a translation unit is compiled with the `-fsycl-id-queries-fit-in-int`
180+
option, all kernels and `SYCL_EXTERNAL` functions without an explicitly
181+
specified `range_type` property are compiled as-if `range_type<int>` was
182+
specified as a property of that kernel or function.
183+
184+
185+
== Implementation notes
186+
187+
This non-normative section provides information about one possible
188+
implementation of this extension.
189+
It is not part of the specification of the extension's API.
190+
191+
There are several ways this extension could be implemented:
192+
193+
- A dedicated LLVM-IR pass could recognize functions that compute an ID (e.g.,
194+
`nd_item::get_global_id`), check whether the static call tree comes from a
195+
kernel or function decorated with `range_type`, and then conditionally insert
196+
an equivalent of `__builtin_assume`.
197+
198+
- The existing `+__builtin_assume+` intrinsic in `+__SYCL_ASSUME_INT+` could be
199+
replaced by a new (but similar) SPIR-V intrinsic, shifting range-type
200+
optimizations to the device compiler.
201+
This could take the form of a dedicated `+__spirv_AssumeInRange+` intrinsic,
202+
or a general `+__spirv_Assume+` intrinsic coupled with a `+__spirv_MaxRange+`
203+
value.
204+
205+
Ideally, the solution should not generate SPIR-V modules that require new
206+
extensions if `range_type` is not used.
207+
208+
209+
== Issues
210+
211+
None.

0 commit comments

Comments
 (0)