Commit 81dc8c7

Insert memory fences to keep the AMD OpenCL compiler from reordering the reads and writes
1 parent 2135633 commit 81dc8c7

1 file changed: +2 -0

include/boost/compute/algorithm/detail/radix_sort.hpp

Lines changed: 2 additions & 0 deletions
@@ -176,8 +176,10 @@ const char radix_sort_source[] =
 " uint sum = 0;\n"
 " for(uint i = 0; i < K2_BITS; i++){\n"
 " uint x = global_offsets[i] + last_block_offsets[i];\n"
+" mem_fence(CLK_GLOBAL_MEM_FENCE);\n" // work around the RX 500/Vega bug, see #811
 " global_offsets[i] = sum;\n"
 " sum += x;\n"
+" mem_fence(CLK_GLOBAL_MEM_FENCE);\n" // work around the RX Vega bug, see #811
 " }\n"
 "}\n"
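
The patched loop appears to be an in-place exclusive prefix sum: each iteration reads the old global_offsets[i], overwrites that slot with the running sum, then adds the old value to the sum. The result is only correct if the read happens before the write, which is the ordering the two fences are meant to enforce. The following is a hypothetical host-side C++ reference for that loop, useful for checking kernel output. The function name scan_offsets_reference and the use of std::vector are assumptions made for illustration and are not part of Boost.Compute.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side reference for the patched kernel loop.
// Overwrites global_offsets with the exclusive prefix sum of
// global_offsets[i] + last_block_offsets[i].
// Assumes last_block_offsets has at least as many elements as global_offsets.
inline void scan_offsets_reference(std::vector<std::uint32_t>& global_offsets,
                                   const std::vector<std::uint32_t>& last_block_offsets)
{
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < global_offsets.size(); ++i) {
        // Read the old value before the slot is overwritten.
        // The kernel places its first mem_fence right after this read.
        const std::uint32_t x = global_offsets[i] + last_block_offsets[i];

        // Store the sum of all earlier elements (exclusive scan).
        global_offsets[i] = sum;
        sum += x;
        // The kernel places its second mem_fence here, at the end of the iteration.
    }
}
```

Comparing the kernel's global_offsets buffer against this reference on the affected hardware would be one way to check whether the workaround holds.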
