Skip to content

Conversation

@vmustya
Copy link
Contributor

@vmustya vmustya commented Dec 3, 2025

The function __opencl_get_memory_scope incorrectly assumed that the
Clang built-in __MEMORY_SCOPE_* macros defined as bitmasks, while they
are actually defined as distinct integer values. This led to incorrect
mapping of OpenCL memory fence flags to LLVM memory scopes, causing
issues in generated code.

The fix involves updating the __opencl_get_memory_scope function to
return the correct __MEMORY_SCOPE_* values based on the provided
cl_mem_fence_flags. Additionally, the __opencl_get_memory_semantics
and the __opencl_get_memory_scope functions are marked as static
to avoid potential multiple definition issues during linking.

The function `__opencl_get_memory_scope` incorrectly assumed that the
Clang built-in `__MEMORY_SCOPE_*` macros defined as bitmasks, while they
are actually defined as distinct integer values. This led to incorrect
mapping of OpenCL memory fence flags to LLVM memory scopes, causing
issues in generated code.

The fix involves updating the `__opencl_get_memory_scope` function to
return the correct `__MEMORY_SCOPE_*` values based on the provided
`cl_mem_fence_flags`. Additionally, the `__opencl_get_memory_semantics`
and the `__opencl_get_memory_scope` functions are marked as `static`
to avoid potential multiple definition issues during linking.
@llvmbot llvmbot added the libclc libclc OpenCL library label Dec 3, 2025
memory_scope |= __MEMORY_SCOPE_WRKGRP;
return memory_scope;
return __MEMORY_SCOPE_WRKGRP;
return __MEMORY_SCOPE_SINGLE;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if the default value is appropriate. Should we call __builtin_unreachable(); like in __opencl_get_memory_semantics?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the OpenCL C spec, the synchronization functions allow the 0 to be passed as a valid argument value. When 0 is passed, the single-thread fence looks reasonable to me, because the implementation shouldn't issue any cross-thread memory synchronization in that case. The __builtin_unreachable() won't work correctly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @vmustya

@vmustya vmustya requested a review from wenju-he December 4, 2025 01:09
Copy link
Contributor

@wenju-he wenju-he left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@wenju-he wenju-he merged commit b8a5888 into llvm:main Dec 4, 2025
12 checks passed
kcloudy0717 pushed a commit to kcloudy0717/llvm-project that referenced this pull request Dec 4, 2025
The function `__opencl_get_memory_scope` incorrectly assumed that the
Clang built-in `__MEMORY_SCOPE_*` macros defined as bitmasks, while they
are actually defined as distinct integer values. This led to incorrect
mapping of OpenCL memory fence flags to LLVM memory scopes, causing
issues in generated code.

The fix involves updating the `__opencl_get_memory_scope` function to
return the correct `__MEMORY_SCOPE_*` values based on the provided
`cl_mem_fence_flags`. Additionally, the `__opencl_get_memory_semantics`
and the `__opencl_get_memory_scope` functions are marked as `static`
to avoid potential multiple definition issues during linking.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

libclc libclc OpenCL library

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants