Skip to content

Commit 9b33ad0

Browse files
[SYCL] Clarify sub-group size calculation in invoke_simd spec. (#6587)
* [SYCL] Clarify sub-group size calculation in invoke_simd spec. Signed-off-by: Konstantin S Bobrovsky <[email protected]> Co-authored-by: rolandschulz <[email protected]>
1 parent e8e7ae8 commit 9b33ad0

File tree

1 file changed

+35
-4
lines changed

1 file changed

+35
-4
lines changed

sycl/doc/extensions/experimental/sycl_ext_oneapi_invoke_simd.asciidoc

Lines changed: 35 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -123,10 +123,15 @@ of a `simd<T, N>` type (e.g. using multiple registers rather than an array).
123123
We expect implementations of this ABI to build on Clang's `ext_vector_type`,
124124
rather than specifying the register types directly.
125125

126-
The value of `N` for each `simd` argument to a SIMD function must be the same,
127-
and represents the sub-group size of the calling kernel. If a function accepts
128-
`simd` arguments with multiple `N` values, it cannot be invoked from a SPMD
129-
kernel and the implementation must issue a diagnostic in this case.
126+
Each of the following must hold true for a SIMD function and its invoke_simd
127+
call site:
128+
- the value of `N` of each `simd` or `simd_mask` formal parameter corresponding to a non-uniform
129+
SPMD actual argument (if any) must be the same
130+
- if the return type of the SIMD function is `simd<T, M>` or `simd_mask<T, M>`,
131+
then `M` must match `N` above
132+
Otherwise, the SIMD function cannot be invoked from an SPMD kernel and the
133+
implementation must issue a diagnostic in this case.
134+
The `N` value represents the sub-group size of the calling kernel.
130135

131136
The example below shows a simple SIMD function that scales all elements of
132137
an 8-wide SIMD type `x` by a scalar value `n`.
@@ -244,6 +249,12 @@ rules in reverse:
244249
are converted to `T`, and the value in element `i` of the SIMD type is
245250
returned to the work-item with sub-group local ID `i`.
246251

252+
- Return values of type `sycl::ext::oneapi::experimental::uniform<T>` are not anyhow converted,
253+
and broadcast to each work-item; every work-item in the sub-group receives
254+
the same value.
255+
NOTE: `sycl::ext::oneapi::experimental::uniform<T>` return type is the way to return
256+
a uniform value of `simd` or `simd_mask` type.
257+
247258
- Return values of type `T` are converted to `sycl::ext::oneapi::experimental::uniform<T>`,
248259
and broadcast to each work-item; every work-item in the sub-group receives
249260
the same value.
@@ -263,6 +274,25 @@ variable should be passed via a dedicated argument (e.g. the value returned
263274
by `sub_group::get_local_id()[0]` could be passed as an integer to a
264275
`Callable` expecting a `sycl::ext::oneapi::experimental::simd<uint32_t, N>`).
265276

277+
NOTE: An implementation must be able to determine the sub-group size of the SIMD
278+
function object to perform argument type conversion.
279+
The determined subgroup size must be equal to the value of the caller's `sycl::reqd_sub_group_size`
280+
attribute (if present) and consistent with the argument conversion rules defined
281+
by this specification. Otherwise, the implementation must issue a proper diagnostic.
282+
283+
The following algorithm to determine the sub-group size may be employed:
284+
- If any of the actual SPMD arguments is non-uniform, the specification requires
285+
that the corresponding SIMD argument must have `simd<T, N>` or `simd_mask<bool, N>`
286+
type, and `N` must be equal to the the sub-group size. The implementation may
287+
iterate through the supported sub-group sizes at compile time and check whether the
288+
function object is invocable with given sub-group size. There must be
289+
exactly one such sub-group size and it must match `N` or the user program is not
290+
well-formed.
291+
- Otherwise, if the return type of the SIMD target function is `simd<T, N>` or
292+
`simd_mask<bool, N>`, then the sub-group size is `N`.
293+
- Otherwise (all arguments and the return type are uniform) the SIMD target is
294+
fully uniform and it can be invoked with any sub-group size.
295+
266296
The `invoke_simd` function has the same requirements as other group functions
267297
(as defined in Section 4.17.3 of the SYCL 2020 specification). A call to
268298
`invoke_simd` must be encountered in converged control flow by all work-items
@@ -447,4 +477,5 @@ SYCL 2020 specification states that this only applies to inter-device transfers.
447477
|2|2021-03-31|John Pennycook|*Rename extension and add feature test macro*
448478
|3|2021-04-23|John Pennycook|*Split uniform wrapper into separate extension*
449479
|4|2022-01-20|John Pennycook|*Clarify interaction with SYCL_EXTERNAL*
480+
|5|2022-08-15|Konst Bobrovskii|*Clarify sub-group size calculation*
450481
|========================================

0 commit comments

Comments
 (0)