Skip to content

Conversation

@wenju-he
Copy link
Contributor

@wenju-he wenju-he commented Sep 16, 2025

always_inline doesn't guarantee performance improvement.
Target-specific optimizations decide whether inlining is profitable.
Changes to amdgcn--amdhsa.bc:

  • _Z9__clc_logDv16_f and Z15__clc_remainderDv16_fS are not inlined.
  • sincos vector function code size has doubled due to apparent duplication.

Also replace typo _CLC_DECL with _CLC_DEF for function definition.

Some built-ins miss alwaysinline attribute due to the typo.
@wenju-he wenju-he requested a review from Copilot September 16, 2025 06:25
@llvmbot llvmbot added the libclc libclc OpenCL library label Sep 16, 2025
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes a widespread typo where _CLC_DECL is incorrectly used instead of _CLC_DEF for function definitions in libclc, causing some built-ins to miss the alwaysinline attribute.

  • Replaces _CLC_DECL with _CLC_DEF in function definitions across multiple files
  • Ensures proper inline attributes are applied to OpenCL C library functions
  • Affects math helpers, atomic operations, and shuffle functions

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

Show a summary per file
File Description
libclc/clc/lib/generic/math/clc_sincos_helpers.inc Fixed macro for tangent helper function definition
libclc/clc/lib/generic/atomic/clc_atomic_def.inc Fixed macros for atomic operation function definitions
libclc/clc/lib/generic/atomic/clc_atomic_compare_exchange.inc Fixed macros for atomic compare exchange function definitions
libclc/clc/include/clc/misc/shuffle_def.inc Fixed macros for shuffle function definitions
libclc/clc/include/clc/misc/shuffle2_def.inc Fixed macros for shuffle2 function definitions

@wenju-he wenju-he requested a review from arsenm September 16, 2025 06:26
Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all the always_inline attributes should be removed, That's a good bug

@wenju-he wenju-he changed the title [libclc] Replace typo _CLC_DECL with _CLC_DEF for function definition [libclc] Remove __attribute__((always_inline)) Sep 16, 2025
@wenju-he
Copy link
Contributor Author

I think all the always_inline attributes should be removed, That's a good bug

done, thanks.
amdgcn--amdhsa.bc has two changes:

  1. _Z9__clc_logDv16_f and Z15__clc_remainderDv16_fS are not inlined.
  2. sincos vector function code size has doubled due to apparent duplication.

@wenju-he wenju-he merged commit 7f36611 into llvm:main Sep 17, 2025
9 checks passed
@wenju-he wenju-he deleted the _CLC_DECL-to-_CLC_DEF branch September 17, 2025 23:47
@mgorny
Copy link
Member

mgorny commented Sep 20, 2025

This is causing test failures for me:

Start testing: Sep 20 09:13 -00
----------------------------------------------------------
2/3 Testing: external-calls-amdgcn--amdhsa.bc
2/3 Test: external-calls-amdgcn--amdhsa.bc
Command: "/var/tmp/portage/llvm-core/libclc-22.0.0.9999/work/libclc/check_external_calls.sh" "/var/tmp/portage/llvm-core/libclc-22.0.0.9999/work/libclc_build/amdgcn--amdhsa.bc" "/usr/lib/llvm/22/bin"
Directory: /var/tmp/portage/llvm-core/libclc-22.0.0.9999/work/libclc
"external-calls-amdgcn--amdhsa.bc" start time: Sep 20 09:13 -00
Output:
----------------------------------------------------------
ERROR: 5 unresolved calls detected in /var/tmp/portage/llvm-core/libclc-22.0.0.9999/work/libclc_build/amdgcn--amdhsa.bc
  %23 = tail call fastcc <16 x float> @_Z9__clc_logDv16_f(<16 x float> noundef %22) #20
  %24 = tail call fastcc <16 x float> @_Z9__clc_logDv16_f(<16 x float> noundef %23) #20
  %2 = tail call fastcc <16 x float> @_Z9__clc_logDv16_f(<16 x float> noundef %0) #20
  %3 = tail call fastcc <16 x float> @_Z15__clc_remainderDv16_fS_(<16 x float> noundef %0, <16 x float> noundef %1) #20
  %5 = tail call fastcc <16 x float> @_Z15__clc_remainderDv16_fS_(<16 x float> noundef %3, <16 x float> noundef %4) #20
<end of output>
Test time =   2.89 sec
----------------------------------------------------------
Test Failed.
"external-calls-amdgcn--amdhsa.bc" end time: Sep 20 09:13 -00
"external-calls-amdgcn--amdhsa.bc" time elapsed: 00:00:02
----------------------------------------------------------

3/3 Testing: external-calls-tahiti-amdgcn-mesa-mesa3d.bc
3/3 Test: external-calls-tahiti-amdgcn-mesa-mesa3d.bc
Command: "/var/tmp/portage/llvm-core/libclc-22.0.0.9999/work/libclc/check_external_calls.sh" "/var/tmp/portage/llvm-core/libclc-22.0.0.9999/work/libclc_build/tahiti-amdgcn-mesa-mesa3d.bc" "/usr/lib/llvm/22/bin"
Directory: /var/tmp/portage/llvm-core/libclc-22.0.0.9999/work/libclc
"external-calls-tahiti-amdgcn-mesa-mesa3d.bc" start time: Sep 20 09:13 -00
Output:
----------------------------------------------------------
ERROR: 5 unresolved calls detected in /var/tmp/portage/llvm-core/libclc-22.0.0.9999/work/libclc_build/tahiti-amdgcn-mesa-mesa3d.bc
  %23 = tail call fastcc <16 x float> @_Z9__clc_logDv16_f(<16 x float> noundef %22) #22
  %24 = tail call fastcc <16 x float> @_Z9__clc_logDv16_f(<16 x float> noundef %23) #22
  %2 = tail call fastcc <16 x float> @_Z9__clc_logDv16_f(<16 x float> noundef %0) #22
  %3 = tail call fastcc <16 x float> @_Z15__clc_remainderDv16_fS_(<16 x float> noundef %0, <16 x float> noundef %1) #22
  %5 = tail call fastcc <16 x float> @_Z15__clc_remainderDv16_fS_(<16 x float> noundef %3, <16 x float> noundef %4) #22
<end of output>
Test time =   2.89 sec
----------------------------------------------------------
Test Failed.
"external-calls-tahiti-amdgcn-mesa-mesa3d.bc" end time: Sep 20 09:13 -00
"external-calls-tahiti-amdgcn-mesa-mesa3d.bc" time elapsed: 00:00:02
----------------------------------------------------------

1/3 Testing: external-calls-tahiti-amdgcn--.bc
1/3 Test: external-calls-tahiti-amdgcn--.bc
Command: "/var/tmp/portage/llvm-core/libclc-22.0.0.9999/work/libclc/check_external_calls.sh" "/var/tmp/portage/llvm-core/libclc-22.0.0.9999/work/libclc_build/tahiti-amdgcn--.bc" "/usr/lib/llvm/22/bin"
Directory: /var/tmp/portage/llvm-core/libclc-22.0.0.9999/work/libclc
"external-calls-tahiti-amdgcn--.bc" start time: Sep 20 09:13 -00
Output:
----------------------------------------------------------
ERROR: 5 unresolved calls detected in /var/tmp/portage/llvm-core/libclc-22.0.0.9999/work/libclc_build/tahiti-amdgcn--.bc
  %23 = tail call fastcc <16 x float> @_Z9__clc_logDv16_f(<16 x float> noundef %22) #22
  %24 = tail call fastcc <16 x float> @_Z9__clc_logDv16_f(<16 x float> noundef %23) #22
  %2 = tail call fastcc <16 x float> @_Z9__clc_logDv16_f(<16 x float> noundef %0) #22
  %3 = tail call fastcc <16 x float> @_Z15__clc_remainderDv16_fS_(<16 x float> noundef %0, <16 x float> noundef %1) #22
  %5 = tail call fastcc <16 x float> @_Z15__clc_remainderDv16_fS_(<16 x float> noundef %3, <16 x float> noundef %4) #22
<end of output>
Test time =   2.90 sec
----------------------------------------------------------
Test Failed.
"external-calls-tahiti-amdgcn--.bc" end time: Sep 20 09:13 -00
"external-calls-tahiti-amdgcn--.bc" time elapsed: 00:00:02
----------------------------------------------------------

End testing: Sep 20 09:13 -00

@wenju-he
Copy link
Contributor Author

@mgorny thank you for the testing. Sorry I didn't run the tests for this PR.
The test failures will be fixed by #160036

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

libclc libclc OpenCL library

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants