Commit 04691aa

[libclc] Refine id in async_work_group_copy STRIDED_COPY (#151644)
Compute the flattened work-item id so that it varies fastest along dimension 0, which gives coalesced memory access when the stride is 1.
1 parent: 875a3de

1 file changed: +2 -2 lines changed

libclc/opencl/lib/generic/async/async_work_group_strided_copy.inc

Lines changed: 2 additions & 2 deletions
@@ -8,8 +8,8 @@
 
 #define STRIDED_COPY(dst, src, num_gentypes, dst_stride, src_stride) \
   size_t size = get_local_size(0) * get_local_size(1) * get_local_size(2); \
-  size_t id = (get_local_size(1) * get_local_size(2) * get_local_id(0)) + \
-              (get_local_size(2) * get_local_id(1)) + get_local_id(2); \
+  size_t id = (get_local_size(0) * get_local_size(1) * get_local_id(2)) + \
+              (get_local_size(0) * get_local_id(1)) + get_local_id(0); \
   size_t i; \
   \
   for (i = id; i < num_gentypes; i += size) { \
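To illustrate the effect of the change, the two id formulas can be modelled on the host in plain C. This is a minimal sketch, not part of the commit: the `work_item` struct and the `flat_id_old` / `flat_id_new` helpers are hypothetical stand-ins for the values that `get_local_size(d)` and `get_local_id(d)` return in OpenCL C. It assumes, as is typical but implementation-dependent, that work-items with consecutive `get_local_id(0)` run next to each other in hardware, so consecutive flattened ids mean neighbouring work-items touch neighbouring elements when the stride is 1.

```c
#include <stddef.h>

/* Hypothetical host-side model of one work-item: size[d] stands in for
   get_local_size(d) and id[d] for get_local_id(d). */
typedef struct {
  size_t size[3];
  size_t id[3];
} work_item;

/* Flattened id before the change: dimension 2 varies fastest. */
size_t flat_id_old(const work_item *w) {
  return (w->size[1] * w->size[2] * w->id[0]) +
         (w->size[2] * w->id[1]) + w->id[2];
}

/* Flattened id after the change: dimension 0 varies fastest. */
size_t flat_id_new(const work_item *w) {
  return (w->size[0] * w->size[1] * w->id[2]) +
         (w->size[0] * w->id[1]) + w->id[0];
}
```

For a 4x2x1 work-group, the work-items with local ids (0,0,0) and (1,0,0) get flattened ids 0 and 2 under the old formula, but 0 and 1 under the new one. Since the loop starts at `i = id`, the new numbering has those two neighbours start on adjacent elements.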
