Commit 7a9233f

reduction thing
1 parent abe0608 commit 7a9233f

1 file changed, +8 -5 lines changed

clang/include/clang/Basic/AttrDocs.td

Lines changed: 8 additions & 5 deletions
@@ -1128,15 +1128,18 @@ general-purpose code.
 
   template <typename T, uint32_t N>
   constexpr T simd_reduce(T [[clang::ext_vector_type(N)]] v) {
-    T sum{};
-    for (uint32_t i = 0; i < N; ++i) {
-      sum += v[i];
+    static_assert((N & (N - 1)) == 0, "N must be a power of two");
+    if constexpr (N == 1) {
+      return v[0];
+    } else {
+      T [[clang::ext_vector_type(N / 2)]] reduced = v.hi + v.lo;
+      return simd_reduce<T, N / 2>(reduced);
     }
-    return sum;
   }
 
 The vector type also supports swizzling up to sixteen elements. This can be done
-using the object accessors.
+using the object accessors. The OpenCL documentation lists the full list of
+accepted values.
 .. code-block:: c++
 
   using f16_x16 = _Float16 __attribute__((ext_vector_type(16)));
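
The change replaces a linear accumulation loop with recursive halving: each step adds the high half of the vector to the low half, then recurses on the half-width result until one element remains. A minimal portable sketch of that pattern follows, assuming `std::array` as a stand-in for the Clang vector type. The name `tree_reduce` and the explicit index arithmetic are hypothetical and not part of the patch, and the extra `N != 0` check is an addition of this sketch.

```cpp
#include <array>
#include <cstddef>

// Hypothetical portable analogue of the patch's simd_reduce. std::array
// stands in for the Clang ext_vector_type, and the .hi/.lo halves are
// formed by explicit indexing instead of vector accessors.
template <typename T, std::size_t N>
constexpr T tree_reduce(const std::array<T, N> &v) {
  static_assert(N != 0 && (N & (N - 1)) == 0, "N must be a power of two");
  if constexpr (N == 1) {
    // Base case: a single element is its own sum.
    return v[0];
  } else {
    std::array<T, N / 2> reduced{};
    for (std::size_t i = 0; i < N / 2; ++i) {
      // Element i of the low half plus element i of the high half.
      reduced[i] = v[i] + v[i + N / 2];
    }
    // Recurse on the half-width partial sums.
    return tree_reduce<T, N / 2>(reduced);
  }
}
```

For N elements this takes log2(N) halving steps. Because the additions are reassociated relative to the sequential loop, floating-point results may differ slightly from the loop version.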
