Merged
11 changes: 5 additions & 6 deletions libclc/opencl/include/clc/opencl/synchronization/utils.h
@@ -13,16 +13,15 @@
 #include <clc/mem_fence/clc_mem_semantic.h>
 #include <clc/opencl/synchronization/cl_mem_fence_flags.h>

-_CLC_INLINE int __opencl_get_memory_scope(cl_mem_fence_flags flag) {
-  int memory_scope = 0;
+static _CLC_INLINE int __opencl_get_memory_scope(cl_mem_fence_flags flag) {
   if (flag & CLK_GLOBAL_MEM_FENCE)
-    memory_scope |= __MEMORY_SCOPE_DEVICE;
+    return __MEMORY_SCOPE_DEVICE;
   if (flag & CLK_LOCAL_MEM_FENCE)
-    memory_scope |= __MEMORY_SCOPE_WRKGRP;
-  return memory_scope;
+    return __MEMORY_SCOPE_WRKGRP;
+  return __MEMORY_SCOPE_SINGLE;
Contributor

I'm not sure the default value is appropriate. Should we call __builtin_unreachable() here, as __opencl_get_memory_semantics does?

Contributor Author

According to the OpenCL C spec, the synchronization functions accept 0 as a valid flags argument. When 0 is passed, a single-thread fence looks reasonable to me, because the implementation shouldn't issue any cross-thread memory synchronization in that case. __builtin_unreachable() would not work correctly here: the 0 case is reachable, and reaching that builtin is undefined behavior.
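
To make the fallthrough concrete, here is a minimal host-side sketch of the new selection logic. It is only an illustration: the typedef, the flag values, the SCOPE_* enumerators and the function name get_memory_scope are hypothetical stand-ins for the libclc and Clang definitions, chosen so the snippet compiles outside the device toolchain.

```c
/* Hypothetical stand-ins; the real definitions come from libclc and Clang. */
typedef unsigned int cl_mem_fence_flags;

enum {
  CLK_LOCAL_MEM_FENCE = 0x1, /* assumed value, for illustration only */
  CLK_GLOBAL_MEM_FENCE = 0x2 /* assumed value, for illustration only */
};

enum {
  SCOPE_DEVICE, /* stand-in for __MEMORY_SCOPE_DEVICE */
  SCOPE_WRKGRP, /* stand-in for __MEMORY_SCOPE_WRKGRP */
  SCOPE_SINGLE  /* stand-in for __MEMORY_SCOPE_SINGLE */
};

/* Mirrors the structure of the patched __opencl_get_memory_scope:
 * pick the widest scope requested, and fall back to the
 * single-thread scope when no fence flag is set. */
static inline int get_memory_scope(cl_mem_fence_flags flag) {
  if (flag & CLK_GLOBAL_MEM_FENCE)
    return SCOPE_DEVICE;
  if (flag & CLK_LOCAL_MEM_FENCE)
    return SCOPE_WRKGRP;
  return SCOPE_SINGLE; /* flag == 0 is a legal input, so this path is reachable */
}
```

With these stand-ins, get_memory_scope(0) yields SCOPE_SINGLE, and passing both flags together yields SCOPE_DEVICE because the global check runs first.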

Contributor

Thanks, @vmustya.

 }

-_CLC_INLINE __CLC_MemorySemantics
+static _CLC_INLINE __CLC_MemorySemantics
 __opencl_get_memory_semantics(cl_mem_fence_flags flag) {
   if ((flag & CLK_LOCAL_MEM_FENCE) && (flag & CLK_GLOBAL_MEM_FENCE))
     return __CLC_MEMORY_LOCAL | __CLC_MEMORY_GLOBAL;