Commit 423bdb2

[OpenCL] Add missing OpenCL 3.0 features to OpenCLExtensions.def; revert header-only macros (#168016)
Add the remaining optional feature macros from the table in section 6.2.1 of the OpenCL C 3.0 specification. Targets can now enable them through the OpenCLFeaturesMap returned by getSupportedOpenCLOpts().

Revert a84599f (header-only feature macros). Header-only macros are difficult to disable on SPIR-V targets, and the earlier undef approach (a60b8f4) does not scale. With this change they can be disabled via `-cl-ext=-<feature>`.

KhronosGroup/OpenCL-Docs#1328 also notes that the unconditional definition of the header-only macros in opencl-c-base.h should be removed.
1 parent 8439aeb commit 423bdb2

8 files changed: +371 -156 lines changed

clang/docs/OpenCLSupport.rst

Lines changed: 5 additions & 11 deletions
@@ -217,9 +217,9 @@ This section explains how to extend clang with the new functionality.

 **Parsing functionality**

-If an extension modifies the standard parsing it needs to be added to
-the clang frontend source code. This also means that the associated macro
-indicating the presence of the extension should be added to clang.
+If a new extension is added it needs to be added to the clang frontend source
+code. This also means that the associated macro indicating the presence of the
+extension should be added to clang.

 The default flow for adding a new extension into the frontend is to
 modify `OpenCLExtensions.def
@@ -242,21 +242,15 @@ with :option:`-cl-ext` command-line flags.
 **Library functionality**

 If an extension adds functionality that does not modify standard language
-parsing it should not require modifying anything other than header files and
+parsing it may not require modifying anything other than header files and
 ``OpenCLBuiltins.td`` detailed in :ref:`OpenCL builtins <opencl_builtins>`.
 Most commonly such extensions add functionality via libraries (by adding
 non-native types or functions) parsed regularly. Similar to other languages this
 is the most common way to add new functionality.

 Clang has standard headers where new types and functions are being added,
 for more details refer to
-:ref:`the section on the OpenCL Header <opencl_header>`. The macros indicating
-the presence of such extensions can be added in the standard header files
-conditioned on target specific predefined macros or/and language version
-predefined macros (see `feature/extension preprocessor macros defined in
-opencl-c-base.h
-<https://github.com/llvm/llvm-project/blob/main/clang/lib/Headers/opencl-c-base.h>`__).
-
+:ref:`the section on the OpenCL Header <opencl_header>`.
 **Pragmas**

 Some extensions alter standard parsing dynamically via pragmas.

clang/docs/ReleaseNotes.rst

Lines changed: 5 additions & 0 deletions
@@ -236,6 +236,11 @@ C23 Feature Support

 Non-comprehensive list of changes in this release
 -------------------------------------------------
+- Removed OpenCL header-only feature macros (previously unconditionally enabled
+on SPIR-V and only selectively disabled via ``-D__undef_<feature>``). All
+OpenCL extensions and features are now centralized in OpenCLExtensions.def,
+allowing consistent control via ``getSupportedOpenCLOpts`` and ``-cl-ext``.
+
 - Added ``__builtin_elementwise_ldexp``.

 - Added ``__builtin_elementwise_fshl`` and ``__builtin_elementwise_fshr``.

clang/include/clang/Basic/OpenCLExtensions.def

Lines changed: 46 additions & 9 deletions
@@ -78,10 +78,55 @@ OPENCL_EXTENSION(cl_khr_depth_images, true, 120)
 OPENCL_EXTENSION(cl_khr_gl_msaa_sharing,true, 120)

 // OpenCL 2.0.
+OPENCL_EXTENSION(cl_ext_float_atomics, false, 200)
+OPENCL_EXTENSION(cl_khr_extended_bit_ops, false, 200)
+OPENCL_EXTENSION(cl_khr_integer_dot_product, false, 200)
+OPENCL_EXTENSION(cl_khr_kernel_clock, false, 200)
 OPENCL_EXTENSION(cl_khr_mipmap_image, true, 200)
 OPENCL_EXTENSION(cl_khr_mipmap_image_writes, true, 200)
 OPENCL_EXTENSION(cl_khr_srgb_image_writes, true, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_ballot, false, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_clustered_reduce, false, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_extended_types, false, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_non_uniform_arithmetic, false, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_non_uniform_vote, false, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_rotate, false, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_shuffle_relative, false, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_shuffle, false, 200)
 OPENCL_EXTENSION(cl_khr_subgroups, true, 200)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_order_acq_rel, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_order_seq_cst, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_scope_all_devices, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_scope_device, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_device_enqueue, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp16_global_atomic_add, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp16_global_atomic_load_store, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp16_global_atomic_min_max, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp16_local_atomic_add, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp16_local_atomic_load_store, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp16_local_atomic_min_max, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp32_global_atomic_add, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp32_global_atomic_min_max, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp32_local_atomic_add, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp32_local_atomic_min_max, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp64_global_atomic_add, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp64_global_atomic_min_max, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp64_local_atomic_add, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp64_local_atomic_min_max, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_image_raw10_raw12, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_image_unorm_int_2_101010, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_image_unsigned_10x6_12x4_14x2, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_generic_address_space, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_images, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_integer_dot_product_input_4x8bit, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_integer_dot_product_input_4x8bit_packed, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_kernel_clock_scope_device, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_kernel_clock_scope_sub_group, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_kernel_clock_scope_work_group, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_pipes, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_program_scope_global_variables, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_read_write_images, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_work_group_collective_functions, false, 200, OCL_C_20)

 // Clang Extensions.
 OPENCL_EXTENSION(cl_clang_storage_class_specifiers, true, 100)
@@ -100,17 +145,9 @@ OPENCL_EXTENSION(cl_intel_subgroups_short, true, 120)
 OPENCL_EXTENSION(cl_intel_device_side_avc_motion_estimation, true, 120)

 // OpenCL C 3.0 features (6.2.1. Features)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_pipes, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_generic_address_space, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_order_acq_rel, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_order_seq_cst, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_subgroups, false, 300, OCL_C_30)
 OPENCL_OPTIONALCOREFEATURE(__opencl_c_3d_image_writes, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_device_enqueue, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_read_write_images, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_program_scope_global_variables, false, 300, OCL_C_30)
 OPENCL_OPTIONALCOREFEATURE(__opencl_c_fp64, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_images, false, 300, OCL_C_30)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_subgroups, false, 300, OCL_C_30)

 #undef OPENCL_OPTIONALCOREFEATURE
 #undef OPENCL_COREFEATURE
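
The entries above follow the X-macro pattern: one list of names is expanded several times with different macro definitions. The sketch below is a minimal, self-contained illustration of that pattern and is not Clang's code. `DEMO_OPENCL_EXTENSIONS`, `ExtInfo`, `buildDemoTable` and `isAvailableIn` are hypothetical names. The fourth column of `OPENCL_OPTIONALCOREFEATURE` is left out, and reading the second column as a pragma flag is an assumption.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a small slice of OpenCLExtensions.def. Clang keeps
// the list in a separate .def file that is #included where needed; an ordinary
// macro is used here so that the example is self-contained.
// Columns: name, flag (assumed to be the pragma flag), minimum OpenCL C version.
#define DEMO_OPENCL_EXTENSIONS(X)        \
  X(cl_khr_subgroups, true, 200)         \
  X(cl_khr_subgroup_ballot, false, 200)  \
  X(__opencl_c_fp64, false, 300)

// Per-entry data recorded for each name in the list.
struct ExtInfo {
  bool pragmaFlag;
  unsigned minVersion;
};

// Expand the list once, turning every entry into a map insertion.
inline std::map<std::string, ExtInfo> buildDemoTable() {
  std::map<std::string, ExtInfo> table;
#define DEMO_ENTRY(name, pragma, avail) table[#name] = ExtInfo{pragma, avail};
  DEMO_OPENCL_EXTENSIONS(DEMO_ENTRY)
#undef DEMO_ENTRY
  return table;
}

// True when the name is listed and the language version is new enough.
inline bool isAvailableIn(const std::map<std::string, ExtInfo> &table,
                          const std::string &name, unsigned version) {
  auto it = table.find(name);
  return it != table.end() && version >= it->second.minVersion;
}
```

With this table, `isAvailableIn(buildDemoTable(), "__opencl_c_fp64", 200)` is false because the demo entry sets its minimum version to 300. This mirrors how moving a feature from the 300 block to the 200 block in the diff changes the language versions in which it can be enabled.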

clang/lib/Basic/Targets/AMDGPU.h

Lines changed: 2 additions & 0 deletions
@@ -320,6 +320,8 @@ class LLVM_LIBRARY_VISIBILITY AMDGPUTargetInfo final : public TargetInfo {
 Opts["__opencl_c_3d_image_writes"] = true;
 Opts["cl_khr_3d_image_writes"] = true;
 Opts["__opencl_c_program_scope_global_variables"] = true;
+Opts["__opencl_c_atomic_order_seq_cst"] = true;
+Opts["__opencl_c_atomic_scope_all_devices"] = true;

 if (GPUKind >= llvm::AMDGPU::GK_GFX700) {
 Opts["__opencl_c_generic_address_space"] = true;
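
The commit message says that a target enables features through the map returned by getSupportedOpenCLOpts() and that a user can then turn one off with `-cl-ext=-<feature>`. The sketch below models that interaction in plain C++ and is not Clang's implementation. `FeatureMap`, `demoTargetDefaults`, `applyClExt` and `macroWouldBeDefined` are hypothetical names, `std::map` stands in for Clang's own map type, and the special name `all` accepted by the real option is not modelled.

```cpp
#include <map>
#include <sstream>
#include <string>

// Stand-in for the map a target fills in. std::map is an assumption made to
// keep the example self-contained.
using FeatureMap = std::map<std::string, bool>;

// Hypothetical target defaults, modelled on the two lines added for AMDGPU.
inline FeatureMap demoTargetDefaults() {
  FeatureMap opts;
  opts["__opencl_c_atomic_order_seq_cst"] = true;
  opts["__opencl_c_atomic_scope_all_devices"] = true;
  return opts;
}

// Simplified model of a -cl-ext= value: a comma-separated list in which each
// item is "+name" (enable) or "-name" (disable). Items without a sign prefix
// are ignored in this sketch.
inline void applyClExt(FeatureMap &opts, const std::string &value) {
  std::istringstream items(value);
  std::string item;
  while (std::getline(items, item, ',')) {
    if (item.size() < 2 || (item[0] != '+' && item[0] != '-'))
      continue;
    opts[item.substr(1)] = (item[0] == '+');
  }
}

// In this model a feature macro is predefined only if its final value is true.
inline bool macroWouldBeDefined(const FeatureMap &opts,
                                const std::string &name) {
  auto it = opts.find(name);
  return it != opts.end() && it->second;
}
```

Calling `applyClExt(opts, "-__opencl_c_atomic_order_seq_cst")` on the demo defaults flips only that entry and leaves `__opencl_c_atomic_scope_all_devices` enabled. A single override point like this is the kind of control that the removed per-feature `__undef_<feature>` header blocks could not offer without one block per feature.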

clang/lib/Headers/opencl-c-base.h

Lines changed: 0 additions & 99 deletions
@@ -9,105 +9,6 @@
 #ifndef _OPENCL_BASE_H_
 #define _OPENCL_BASE_H_

-// Define extension macros
-
-#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
-// For SPIR and SPIR-V all extensions are supported.
-#if defined(__SPIR__) || defined(__SPIRV__)
-#define cl_khr_subgroup_extended_types 1
-#define cl_khr_subgroup_non_uniform_vote 1
-#define cl_khr_subgroup_ballot 1
-#define cl_khr_subgroup_non_uniform_arithmetic 1
-#define cl_khr_subgroup_shuffle 1
-#define cl_khr_subgroup_shuffle_relative 1
-#define cl_khr_subgroup_clustered_reduce 1
-#define cl_khr_subgroup_rotate 1
-#define cl_khr_extended_bit_ops 1
-#define cl_khr_integer_dot_product 1
-#define __opencl_c_integer_dot_product_input_4x8bit 1
-#define __opencl_c_integer_dot_product_input_4x8bit_packed 1
-#define cl_ext_float_atomics 1
-#ifdef cl_khr_fp16
-#define __opencl_c_ext_fp16_global_atomic_load_store 1
-#define __opencl_c_ext_fp16_local_atomic_load_store 1
-#define __opencl_c_ext_fp16_global_atomic_add 1
-#define __opencl_c_ext_fp16_local_atomic_add 1
-#define __opencl_c_ext_fp16_global_atomic_min_max 1
-#define __opencl_c_ext_fp16_local_atomic_min_max 1
-#endif
-#ifdef cl_khr_fp64
-#define __opencl_c_ext_fp64_global_atomic_add 1
-#define __opencl_c_ext_fp64_local_atomic_add 1
-#define __opencl_c_ext_fp64_global_atomic_min_max 1
-#define __opencl_c_ext_fp64_local_atomic_min_max 1
-#endif
-#define __opencl_c_ext_fp32_global_atomic_add 1
-#define __opencl_c_ext_fp32_local_atomic_add 1
-#define __opencl_c_ext_fp32_global_atomic_min_max 1
-#define __opencl_c_ext_fp32_local_atomic_min_max 1
-#define __opencl_c_ext_image_raw10_raw12 1
-#define __opencl_c_ext_image_unorm_int_2_101010 1
-#define __opencl_c_ext_image_unsigned_10x6_12x4_14x2 1
-#define cl_khr_kernel_clock 1
-#define __opencl_c_kernel_clock_scope_device 1
-#define __opencl_c_kernel_clock_scope_work_group 1
-#define __opencl_c_kernel_clock_scope_sub_group 1
-
-#endif // defined(__SPIR__) || defined(__SPIRV__)
-#endif // (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
-
-// Define feature macros for OpenCL C 2.0
-#if (__OPENCL_CPP_VERSION__ == 100 || __OPENCL_C_VERSION__ == 200)
-#define __opencl_c_pipes 1
-#define __opencl_c_generic_address_space 1
-#define __opencl_c_work_group_collective_functions 1
-#define __opencl_c_atomic_order_acq_rel 1
-#define __opencl_c_atomic_order_seq_cst 1
-#define __opencl_c_atomic_scope_device 1
-#define __opencl_c_atomic_scope_all_devices 1
-#define __opencl_c_device_enqueue 1
-#define __opencl_c_read_write_images 1
-#define __opencl_c_program_scope_global_variables 1
-#define __opencl_c_images 1
-#endif
-
-// Define header-only feature macros for OpenCL C 3.0.
-#if (__OPENCL_CPP_VERSION__ == 202100 || __OPENCL_C_VERSION__ == 300)
-// For the SPIR and SPIR-V target all features are supported.
-#if defined(__SPIR__) || defined(__SPIRV__)
-#define __opencl_c_work_group_collective_functions 1
-#define __opencl_c_atomic_order_seq_cst 1
-#define __opencl_c_atomic_scope_device 1
-#define __opencl_c_atomic_scope_all_devices 1
-#define __opencl_c_read_write_images 1
-#endif // defined(__SPIR__)
-
-#endif // (__OPENCL_CPP_VERSION__ == 202100 || __OPENCL_C_VERSION__ == 300)
-
-// Undefine any feature macros that have been explicitly disabled using
-// an __undef_<feature> macro.
-#ifdef __undef___opencl_c_work_group_collective_functions
-#undef __opencl_c_work_group_collective_functions
-#endif
-#ifdef __undef___opencl_c_atomic_order_seq_cst
-#undef __opencl_c_atomic_order_seq_cst
-#endif
-#ifdef __undef___opencl_c_atomic_scope_device
-#undef __opencl_c_atomic_scope_device
-#endif
-#ifdef __undef___opencl_c_atomic_scope_all_devices
-#undef __opencl_c_atomic_scope_all_devices
-#endif
-#ifdef __undef___opencl_c_read_write_images
-#undef __opencl_c_read_write_images
-#endif
-#ifdef __undef___opencl_c_integer_dot_product_input_4x8bit
-#undef __opencl_c_integer_dot_product_input_4x8bit
-#endif
-#ifdef __undef___opencl_c_integer_dot_product_input_4x8bit_packed
-#undef __opencl_c_integer_dot_product_input_4x8bit_packed
-#endif
-
 #if !defined(__opencl_c_generic_address_space)
 // Internal feature macro to provide named (global, local, private) address
 // space overloads for builtin functions that take a pointer argument.

clang/test/Headers/opencl-c-header.cl

Lines changed: 3 additions & 3 deletions
@@ -33,7 +33,7 @@
 // ===
 // Compile for OpenCL 2.0 for the first time. The module should change.
 // RUN: %clang_cc1 -triple spir-unknown-unknown -O0 -emit-llvm -o - -cl-std=CL2.0 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -fdisable-module-hash -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK20 --check-prefix=CHECK-MOD %s
-// RUN: not diff %t/1_0.pcm %t/opencl_c.pcm
+// RUN: not diff %t/1_0.pcm %t/opencl_c.pcm > /dev/null
 // RUN: chmod u-w %t/opencl_c.pcm

 // ===
@@ -44,10 +44,10 @@
 // RUN: rm -rf %t
 // RUN: mkdir -p %t
 // RUN: %clang_cc1 -triple spir64-unknown-unknown -emit-llvm -o - -cl-std=CL1.2 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK --check-prefix=CHECK-MOD %s
-// RUN: %clang_cc1 -triple amdgcn--amdhsa -O0 -emit-llvm -o - -cl-std=CL2.0 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK20 --check-prefix=CHECK-MOD %s
+// RUN: %clang_cc1 -triple amdgcn--amdhsa -target-cpu gfx700 -O0 -emit-llvm -o - -cl-std=CL2.0 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK20 --check-prefix=CHECK-MOD %s
 // RUN: chmod u-w %t
 // RUN: %clang_cc1 -triple spir64-unknown-unknown -emit-llvm -o - -cl-std=CL1.2 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK --check-prefix=CHECK-MOD %s
-// RUN: %clang_cc1 -triple amdgcn--amdhsa -O0 -emit-llvm -o - -cl-std=CL2.0 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK20 --check-prefix=CHECK-MOD %s
+// RUN: %clang_cc1 -triple amdgcn--amdhsa -target-cpu gfx700 -O0 -emit-llvm -o - -cl-std=CL2.0 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK20 --check-prefix=CHECK-MOD %s
 // RUN: chmod u+w %t

 // Verify that called builtins occur in the generated IR.
