[HLSL] Appropriately set function attribute optnone #125937

Merged 4 commits on Feb 11, 2025
10 changes: 10 additions & 0 deletions clang/lib/CodeGen/CGHLSLRuntime.cpp
@@ -345,6 +345,9 @@ void clang::CodeGen::CGHLSLRuntime::setHLSLEntryAttributes(
WaveSizeAttr->getPreferred());
Fn->addFnAttr(WaveSizeKindStr, WaveSizeStr);
}
if (CGM.getCodeGenOpts().OptimizationLevel == 0) {
Fn->addFnAttr(llvm::Attribute::OptimizeNone);
}
Collaborator

nit: https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements

Suggested change
if (CGM.getCodeGenOpts().OptimizationLevel == 0) {
Fn->addFnAttr(llvm::Attribute::OptimizeNone);
}
if (CGM.getCodeGenOpts().OptimizationLevel == 0)
Fn->addFnAttr(llvm::Attribute::OptimizeNone);

Fn->addFnAttr(llvm::Attribute::NoInline);
Contributor

Given that these already have "noinline", I'm surprised that the logic in "SetLLVMFunctionAttributesForDefinition" doesn't already put optnone on these functions. Is something undoing this later?

Contributor Author

@bharadwajy bharadwajy Feb 10, 2025

> Given that these already have "noinline", I'm surprised that the logic in "SetLLVMFunctionAttributesForDefinition" doesn't already put optnone on these functions. Is something undoing this later?

The entry function that is created in this function, and whose attribute is set to noinline here, is different from the one that SetLLVMFunctionAttributesForDefinition() looks at.

GenerateCode(GlobalFnDecl, MangledFn, ...) calls StartFunction(GlobalFnDecl, ResTy, MangledFn, ...), which in turn calls emitEntryFunction(FnDecl, MangledFn). emitEntryFunction(FnDecl, MangledFn, ...) constructs a new entry function, EntryFn; among other things, it gives MangledFn internal linkage so that it can be inlined into EntryFn, and it calls setHLSLEntryAttributes(FnDecl, EntryFn) to set the attributes of EntryFn.

SetLLVMFunctionAttributesForDefinition(...) sets the attributes of MangledFn. So the logic in that function checks the attributes of MangledFn, not those of the newly created EntryFn.

Hence, when optimizations are disabled, setting the optnone attribute on EntryFn in setHLSLEntryAttributes(FnDecl, EntryFn), at the point where EntryFn is set up, seemed appropriate.
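
As a rough illustration of the flow described above, here is a self-contained toy model in plain C++. It does not use clang or LLVM. `ToyFunction`, `genericDefinitionAttrs`, `setEntryAttrs`, and `emitEntry` are hypothetical stand-ins for llvm::Function, SetLLVMFunctionAttributesForDefinition, setHLSLEntryAttributes, and emitEntryFunction, and the generic logic is deliberately simplified.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for an IR function; not llvm::Function.
struct ToyFunction {
  std::string Name;
  std::set<std::string> Attrs;
  bool Internal = false;
};

// Stand-in for the generic attribute logic. It is only ever handed the
// mangled function. Simplification: at -O0 it adds optnone and noinline
// unless the function is already alwaysinline.
void genericDefinitionAttrs(ToyFunction &Fn, int OptLevel) {
  if (OptLevel == 0 && Fn.Attrs.count("alwaysinline") == 0) {
    Fn.Attrs.insert("optnone");
    Fn.Attrs.insert("noinline");
  }
}

// Stand-in for the entry-attribute hook. The wrapper is injected by the
// compiler, so nothing else will add optnone to it.
void setEntryAttrs(ToyFunction &EntryFn, int OptLevel) {
  if (OptLevel == 0)
    EntryFn.Attrs.insert("optnone");
  EntryFn.Attrs.insert("noinline");
}

// Stand-in for entry emission: internalize the mangled function, build the
// wrapper, and attribute the wrapper by hand.
ToyFunction emitEntry(ToyFunction &MangledFn, const std::string &EntryName,
                      int OptLevel) {
  assert(!EntryName.empty() && "entry wrapper needs a name");
  MangledFn.Internal = true;
  ToyFunction EntryFn;
  EntryFn.Name = EntryName;
  setEntryAttrs(EntryFn, OptLevel);
  return EntryFn;
}
```

In this model, calling `genericDefinitionAttrs` on the mangled function never touches the wrapper returned by `emitEntry`, which is why the wrapper's optnone has to come from `setEntryAttrs`.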

Contributor

Ok. I think a comment to the effect of "We need to manually set attributes here instead of relying on SetLLVMFunctionAttributesForDefinition to pick them up since these functions are injected by the compiler and won't go through the normal flow" (please reword as necessary to be accurate...) would be a good idea here.

Contributor Author

> Ok. I think a comment to the effect of "We need to manually set attributes here instead of relying on SetLLVMFunctionAttributesForDefinition to pick them up since these functions are injected by the compiler and won't go through the normal flow" (please reword as necessary to be accurate...) would be a good idea here.

Comment added. Thanks!

}

@@ -446,6 +449,13 @@ void CGHLSLRuntime::setHLSLFunctionAttributes(const FunctionDecl *FD,
const StringRef ExportAttrKindStr = "hlsl.export";
Fn->addFnAttr(ExportAttrKindStr);
}
llvm::Triple T(Fn->getParent()->getTargetTriple());
if (T.getEnvironment() == llvm::Triple::EnvironmentType::Library) {
if (CGM.getCodeGenOpts().OptimizationLevel == 0) {
Fn->addFnAttr(llvm::Attribute::OptimizeNone);
Fn->addFnAttr(llvm::Attribute::NoInline);
}
}
Contributor

Do we want to do this on all functions in a library or just entry points and exported functions? In any case, it really would be preferable if "SetLLVMFunctionAttributesForDefinition" did the right thing (whatever that may be) rather than us needing to duplicate that logic here...

Contributor Author

@bharadwajy bharadwajy Feb 10, 2025

> Do we want to do this on all functions in a library or just entry points and exported functions? In any case, it really would be preferable if "SetLLVMFunctionAttributesForDefinition" did the right thing (whatever that may be) rather than us needing to duplicate that logic here...

OK. It would be sufficient to set the optnone attribute only on entry functions, for both non-library and library shaders, since a non-library shader has one entry function and a library shader has one or more. The DisableOptimizations shader flag can then be set based on the presence of this attribute on the entry function(s).

Deleted this change.

The consequences and usefulness, in later passes, of setting the optnone attribute on exported library functions when optimizations are disabled are not yet clear to me. I'd like to propose that any change, if needed, be made in a follow-on PR.
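
The idea of deriving the DisableOptimizations shader flag from the entry functions could look roughly like the sketch below. This is a toy model in plain C++, not the DXIL backend's code. `ToyFunction`, `IsEntry`, and `shouldDisableOptimizations` are hypothetical names, and treating the flag as set when any entry function carries optnone is an assumption made for illustration.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for an IR function; not llvm::Function.
struct ToyFunction {
  std::string Name;
  bool IsEntry;
  std::set<std::string> Attrs;
};

// Assumption: the flag is set as soon as one entry function carries optnone.
// Non-entry functions are ignored, so exported library functions would not
// need optnone for this check to work.
bool shouldDisableOptimizations(const std::vector<ToyFunction> &Fns) {
  for (const ToyFunction &F : Fns)
    if (F.IsEntry && F.Attrs.count("optnone") != 0)
      return true;
  return false;
}
```

Under this assumption, a module whose entry functions lack optnone would leave the flag unset even if some helper function happened to carry the attribute.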

}

static void gatherFunctions(SmallVectorImpl<Function *> &Fns, llvm::Module &M,
8 changes: 4 additions & 4 deletions clang/test/CodeGenHLSL/GlobalConstructorLib.hlsl
@@ -1,5 +1,5 @@
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -emit-llvm -disable-llvm-passes %s -o - | FileCheck %s --check-prefixes=CHECK,NOINLINE
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -emit-llvm -O0 %s -o - | FileCheck %s --check-prefixes=CHECK,INLINE
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -emit-llvm -O1 %s -o - | FileCheck %s --check-prefixes=CHECK,INLINE

// Make sure global variable for ctors exist for lib profile.
// CHECK:@llvm.global_ctors
@@ -31,12 +31,12 @@ void SecondEntry() {}
// CHECK: ret void


// Verify the constructor is alwaysinline
// NOINLINE: ; Function Attrs: {{.*}}alwaysinline
// Verify the constructor is optnone
// NOINLINE: ; Function Attrs: {{.*}} optnone
// NOINLINE-NEXT: define linkonce_odr void @_ZN4hlsl8RWBufferIfEC2Ev({{.*}} [[CtorAttr:\#[0-9]+]]

// NOINLINE: ; Function Attrs: {{.*}}alwaysinline
// NOINLINE-NEXT: define internal void @_GLOBAL__sub_I_GlobalConstructorLib.hlsl() [[InitAttr:\#[0-9]+]]

// NOINLINE-DAG: attributes [[InitAttr]] = {{.*}} alwaysinline
// NOINLINE-DAG: attributes [[CtorAttr]] = {{.*}} alwaysinline
// NOINLINE-DAG: attributes [[CtorAttr]] = {{.*}} optnone
4 changes: 2 additions & 2 deletions clang/test/CodeGenHLSL/GlobalDestructors.hlsl
@@ -1,7 +1,7 @@
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -emit-llvm -disable-llvm-passes %s -o - | FileCheck %s --check-prefixes=CS,NOINLINE,CHECK
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -emit-llvm -disable-llvm-passes %s -o - | FileCheck %s --check-prefixes=LIB,NOINLINE,CHECK
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -emit-llvm -O0 %s -o - | FileCheck %s --check-prefixes=INLINE,CHECK
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -emit-llvm -O0 %s -o - | FileCheck %s --check-prefixes=INLINE,CHECK
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -emit-llvm -O1 %s -o - | FileCheck %s --check-prefixes=INLINE,CHECK
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -emit-llvm -O1 %s -o - | FileCheck %s --check-prefixes=INLINE,CHECK

// Tests that constructors and destructors are appropriately generated for globals
// and that their calls are inlined when AlwaysInline is run
16 changes: 10 additions & 6 deletions clang/test/CodeGenHLSL/inline-constructors.hlsl
@@ -1,9 +1,9 @@
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -std=hlsl202x -emit-llvm -o - -disable-llvm-passes %s | FileCheck %s --check-prefixes=CHECK,NOINLINE
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -std=hlsl202x -emit-llvm -o - -disable-llvm-passes %s | FileCheck %s --check-prefixes=CHECK,NOINLINE
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -std=hlsl202x -emit-llvm -o - -O0 %s | FileCheck %s --check-prefixes=CHECK,INLINE
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -std=hlsl202x -emit-llvm -o - -O0 %s | FileCheck %s --check-prefixes=CHECK,INLINE
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -std=hlsl202x -emit-llvm -o - -O1 %s | FileCheck %s --check-prefixes=CHECK,INLINE
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -std=hlsl202x -emit-llvm -o - -O1 %s | FileCheck %s --check-prefixes=CHECK,INLINE
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -std=hlsl202x -Wno-hlsl-extensions -emit-llvm -o - -disable-llvm-passes %s | FileCheck %s --check-prefixes=CHECK,NOINLINE
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -std=hlsl202x -Wno-hlsl-extensions -emit-llvm -o - -disable-llvm-passes %s | FileCheck %s --check-prefixes=CHECK,NOINLINE
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -std=hlsl202x -Wno-hlsl-extensions -emit-llvm -o - -O0 %s | FileCheck %s --check-prefixes=CHECK,INLINE
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -std=hlsl202x -Wno-hlsl-extensions -emit-llvm -o - -O0 %s | FileCheck %s --check-prefixes=CHECK,NOINLINE_LIB
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -std=hlsl202x -Wno-hlsl-extensions -emit-llvm -o - -O1 %s | FileCheck %s --check-prefixes=CHECK,INLINE
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -std=hlsl202x -Wno-hlsl-extensions -emit-llvm -o - -O1 %s | FileCheck %s --check-prefixes=CHECK,INLINE

// Tests that implicit constructor calls for user classes will always be inlined.

@@ -50,6 +50,10 @@ void NionsDay(int hours) {
// NOINLINE-NEXT: call void @_GLOBAL__sub_I_inline_constructors.hlsl()
// NOINLINE-NEXT: %0 = call i32 @llvm.dx.flattened.thread.id.in.group()
// NOINLINE-NEXT: call void @_Z4mainj(i32 %0)
// NOINLINE_LIB: call void @_ZN4WeedC1Ev
// NOINLINE_LIB-NEXT: call void @_ZN5KittyC1Ev
// NOINLINE_LIB-NEXT: %0 = call i32 @llvm.dx.flattened.thread.id.in.group()
// NOINLINE_LIB-NEXT: call void @_Z4mainj(i32 %0)
// Verify inlining leaves only calls to "llvm." intrinsics
// INLINE-NOT: call {{[^@]*}} @{{[^l][^l][^v][^m][^\.]}}
// CHECK: ret void
189 changes: 144 additions & 45 deletions clang/test/CodeGenHLSL/inline-functions.hlsl
@@ -1,32 +1,60 @@
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library %s -emit-llvm -disable-llvm-passes -o - | FileCheck %s --check-prefixes=CHECK,NOINLINE
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library %s -emit-llvm -O0 -o - | FileCheck %s --check-prefixes=CHECK,INLINE
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library %s -emit-llvm -O1 -o - | FileCheck %s --check-prefixes=CHECK,INLINE
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute %s -emit-llvm -disable-llvm-passes -o - | FileCheck %s --check-prefixes=CHECK,NOINLINE
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute %s -emit-llvm -O0 -o - | FileCheck %s --check-prefixes=CHECK,INLINE
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute %s -emit-llvm -O1 -o - | FileCheck %s --check-prefixes=CHECK,INLINE

// Tests that user functions will always be inlined.
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library %s -emit-llvm -disable-llvm-passes -o - | FileCheck %s --check-prefixes=CHECK,CHECK_LIB_OPTNONE
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library %s -emit-llvm -O0 -o - | FileCheck %s --check-prefixes=CHECK,CHECK_LIB_OPTNONE
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library %s -emit-llvm -O1 -o - | FileCheck %s --check-prefixes=CHECK,CHECK_OPT
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute %s -emit-llvm -disable-llvm-passes -o - | FileCheck %s --check-prefixes=CHECK,CHECK_CS_OPTNONE_NOPASS
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute %s -emit-llvm -O0 -o - | FileCheck %s --check-prefixes=CHECK,CHECK_CS_OPTNONE_PASS
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute %s -emit-llvm -O1 -o - | FileCheck %s --check-prefixes=CHECK,CHECK_OPT

// Tests inlining of user functions based on specified optimization options.
// This includes exported functions and mangled entry point implementation functions.
// The unmangled entry functions must not be alwaysinlined.

#define MAX 100

float nums[MAX];

// Verify that all functions have the alwaysinline attribute
// NOINLINE: Function Attrs: alwaysinline
// NOINLINE: define void @_Z4swapA100_jjj(ptr noundef byval([100 x i32]) align 4 %Buf, i32 noundef %ix1, i32 noundef %ix2) [[IntAttr:\#[0-9]+]]
// NOINLINE: ret void
// Check optnone attribute for library target compilation
// CHECK_LIB_OPTNONE: Function Attrs:{{.*}}optnone
// CHECK_LIB_OPTNONE: define void @_Z4swapA100_jjj(ptr noundef byval([100 x i32]) align 4 %Buf, i32 noundef %ix1, i32 noundef %ix2) [[ExtAttr:\#[0-9]+]]

// Check alwaysinline attribute for non-entry functions of compute target compilation
// CHECK_CS_OPTNONE_NOPASS: Function Attrs: alwaysinline
// CHECK_CS_OPTNONE_NOPASS: define void @_Z4swapA100_jjj(ptr noundef byval([100 x i32]) align 4 %Buf, i32 noundef %ix1, i32 noundef %ix2) [[ExtAttr:\#[0-9]+]]

// Check alwaysinline attribute for non-entry functions of compute target compilation
// CHECK_CS_OPTNONE_PASS: Function Attrs: alwaysinline
// CHECK_CS_OPTNONE_PASS: define void @_Z4swapA100_jjj(ptr noundef byval([100 x i32]) align 4 %Buf, i32 noundef %ix1, i32 noundef %ix2) [[ExtAttr:\#[0-9]+]]

// Check alwaysinline attribute for opt compilation to library target
// CHECK_OPT: Function Attrs: alwaysinline
// CHECK_OPT: define void @_Z4swapA100_jjj(ptr noundef byval([100 x i32]) align 4 captures(none) %Buf, i32 noundef %ix1, i32 noundef %ix2) {{[a-z_ ]*}} [[SwapOptAttr:\#[0-9]+]]

// CHECK: ret void

// Swap the values of Buf at indices ix1 and ix2
void swap(unsigned Buf[MAX], unsigned ix1, unsigned ix2) {
float tmp = Buf[ix1];
Buf[ix1] = Buf[ix2];
Buf[ix2] = tmp;
}

// NOINLINE: Function Attrs: alwaysinline
// NOINLINE: define void @_Z10BubbleSortA100_jj(ptr noundef byval([100 x i32]) align 4 %Buf, i32 noundef %size) [[IntAttr]]
// NOINLINE: ret void
// Check optnone attribute for library target compilation
// CHECK_LIB_OPTNONE: Function Attrs:{{.*}}optnone
// CHECK_LIB_OPTNONE: define void @_Z10BubbleSortA100_jj(ptr noundef byval([100 x i32]) align 4 %Buf, i32 noundef %size) [[ExtAttr]]

// Check alwaysinline attribute for non-entry functions of compute target compilation
// CHECK_CS_OPTNONE_NOPASS: Function Attrs: alwaysinline
// CHECK_CS_OPTNONE_NOPASS: define void @_Z10BubbleSortA100_jj(ptr noundef byval([100 x i32]) align 4 %Buf, i32 noundef %size) [[ExtAttr]]

// Check alwaysinline attribute for non-entry functions of compute target compilation
// CHECK_CS_OPTNONE_PASS: Function Attrs: alwaysinline
// CHECK_CS_OPTNONE_PASS: define void @_Z10BubbleSortA100_jj(ptr noundef byval([100 x i32]) align 4 %Buf, i32 noundef %size) [[ExtAttr]]

// Check alwaysinline attribute for opt compilation to library target
// CHECK_OPT: Function Attrs: alwaysinline
// CHECK_OPT: define void @_Z10BubbleSortA100_jj(ptr noundef readonly byval([100 x i32]) align 4 captures(none) %Buf, i32 noundef %size) {{[a-z_ ]*}} [[BubOptAttr:\#[0-9]+]]

// CHECK: ret void

// Inefficiently sort Buf in place
void BubbleSort(unsigned Buf[MAX], unsigned size) {
bool swapped = true;
@@ -41,12 +69,26 @@ void BubbleSort(unsigned Buf[MAX], unsigned size) {
}
}

// Note ExtAttr is the inlined export set of attribs
// CHECK: Function Attrs: alwaysinline
// CHECK: define noundef i32 @_Z11RemoveDupesA100_jj(ptr {{[a-z_ ]*}}noundef byval([100 x i32]) align 4 {{.*}}%Buf, i32 noundef %size) {{[a-z_ ]*}}[[ExtAttr:\#[0-9]+]]
// CHECK: ret i32
// Check optnone attribute for library target compilation of exported function
// CHECK_LIB_OPTNONE: Function Attrs:{{.*}}optnone
// CHECK_LIB_OPTNONE: define noundef i32 @_Z11RemoveDupesA100_jj(ptr {{[a-z_ ]*}}noundef byval([100 x i32]) align 4 {{.*}}%Buf, i32 noundef %size) [[ExportAttr:\#[0-9]+]]
// Sort Buf and remove any duplicate values
// returns the number of values left

// Check alwaysinline attribute for exported function of compute target compilation
// CHECK_CS_OPTNONE_NOPASS: Function Attrs: alwaysinline
// CHECK_CS_OPTNONE_NOPASS: define noundef i32 @_Z11RemoveDupesA100_jj(ptr {{[a-z_ ]*}}noundef byval([100 x i32]) align 4 {{.*}}%Buf, i32 noundef %size) [[ExportAttr:\#[0-9]+]]

// Check alwaysinline attribute for exported function of compute target compilation
// CHECK_CS_OPTNONE_PASS: Function Attrs: alwaysinline
// CHECK_CS_OPTNONE_PASS: define noundef i32 @_Z11RemoveDupesA100_jj(ptr {{[a-z_ ]*}}noundef byval([100 x i32]) align 4 {{.*}}%Buf, i32 noundef %size) [[ExportAttr:\#[0-9]+]]

// Check alwaysinline attribute for exported function of library target compilation
// CHECK_OPT: Function Attrs: alwaysinline
// CHECK_OPT: define noundef i32 @_Z11RemoveDupesA100_jj(ptr noundef byval([100 x i32]) align 4 captures(none) %Buf, i32 noundef %size) {{[a-z_ ]*}} [[RemOptAttr:\#[0-9]+]]

// CHECK: ret i32

export
unsigned RemoveDupes(unsigned Buf[MAX], unsigned size) {
BubbleSort(Buf, size);
@@ -63,19 +105,44 @@ unsigned RemoveDupes(unsigned Buf[MAX], unsigned size) {

RWBuffer<unsigned> Indices;

// The mangled version of main only remains without inlining
// because it has internal linkage from the start
// Note main functions get the norecurse attrib, which IntAttr reflects
// NOINLINE: Function Attrs: alwaysinline
// NOINLINE: define internal void @_Z4mainj(i32 noundef %GI) [[IntAttr]]
// NOINLINE: ret void
// CHECK_LIB_OPTNONE: Function Attrs:{{.*}}optnone
// Internal function attributes are the same as those of source function's
// CHECK_LIB_OPTNONE: define internal void @_Z4mainj(i32 noundef %GI) [[ExtAttr]]
// CHECK_LIB_OPTNONE: ret void

// CHECK_CS_OPTNONE_NOPASS: Function Attrs: alwaysinline
// Internal function attributes are different from those of source function's
// CHECK_CS_OPTNONE_NOPASS: define internal void @_Z4mainj(i32 noundef %GI) [[ExtAttr]]
// CHECK_CS_OPTNONE_NOPASS: ret void

// Check internal function @_Z4mainj is not generated when LLVM passes enabled
// CHECK_CS_OPTNONE_PASS-NOT: define internal void @_Z4mainj

// Check internal function @_Z4mainj is not generated as it should be inlined
// for opt builds
// CHECK_OPT-NOT: define internal void @_Z4mainj

// The unmangled version is not inlined, EntryAttr reflects that
// CHECK: Function Attrs: {{.*}}noinline
// CHECK: define void @main() {{[a-z_ ]*}}[[EntryAttr:\#[0-9]+]]
// Make sure function calls are inlined when AlwaysInline is run
// This only leaves calls to llvm. intrinsics
// INLINE-NOT: call {{[^@]*}} @{{[^l][^l][^v][^m][^\.]}}
// CHECK_LIB_OPTNONE: Function Attrs: {{.*}}noinline
// CHECK_LIB_OPTNONE: define void @main() {{[a-z_ ]*}}[[EntryAttr:\#[0-9]+]]
// Make sure internal function is not inlined when optimization is disabled
// CHECK_LIB_OPTNONE: call void @_Z4mainj

// CHECK_CS_OPTNONE_NOPASS: Function Attrs:{{.*}}optnone
// CHECK_CS_OPTNONE_NOPASS: define void @main() {{[a-z_ ]*}}[[EntryAttr:\#[0-9]+]]
// Make sure internal function is not inlined when optimization is disabled
// CHECK_CS_OPTNONE_NOPASS: call void @_Z4mainj

// CHECK_CS_OPTNONE_PASS: Function Attrs: {{.*}}noinline
// CHECK_CS_OPTNONE_PASS: define void @main() {{[a-z_ ]*}}[[EntryAttr:\#[0-9]+]]
// Make sure internal function is inlined when LLVM passes are enabled
// CHECK_CS_OPTNONE_PASS: _Z4mainj.exit:

// CHECK_OPT: Function Attrs: {{.*}}noinline
// CHECK_OPT: define void @main() {{[a-z_ ]*}}[[EntryAttr:\#[0-9]+]]
// Make sure internal function is inlined as optimization is enabled
// CHECK_OPT: _Z4mainj.exit:

// CHECK: ret void

[numthreads(1,1,1)]
@@ -90,19 +157,41 @@ void main(unsigned int GI : SV_GroupIndex) {
tmpIndices[i] = Indices[i];
}

// The mangled version of main only remains without inlining
// because it has internal linkage from the start
// Note main functions get the norecurse attrib, which IntAttr reflects
// NOINLINE: Function Attrs: alwaysinline
// NOINLINE: define internal void @_Z6main10v() [[IntAttr]]
// NOINLINE: ret void
// CHECK_LIB_OPTNONE: Function Attrs:{{.*}}optnone
// CHECK_LIB_OPTNONE: define internal void @_Z6main10v() [[ExtAttr]]
// CHECK_LIB_OPTNONE: ret void

// CHECK_CS_OPTNONE_NOPASS: Function Attrs:{{.*}}alwaysinline
// CHECK_CS_OPTNONE_NOPASS: define internal void @_Z6main10v() [[ExtAttr]]
// CHECK_CS_OPTNONE_NOPASS: ret void

// Check internal function @_Z6main10v is not generated when LLVM passes are enabled
// CHECK_CS_OPTNONE_PASS-NOT: define internal void @_Z6main10v

// Check internal function @_Z6main10v is not generated as it should be inlined
// CHECK_OPT-NOT: define internal void @_Z6main10v

// The unmangled version is not inlined, EntryAttr reflects that
// CHECK: Function Attrs: {{.*}}noinline
// CHECK: define void @main10() {{[a-z_ ]*}}[[EntryAttr]]
// Make sure function calls are inlined when AlwaysInline is run
// This only leaves calls to llvm. intrinsics
// INLINE-NOT: call {{[^@]*}} @{{[^l][^l][^v][^m][^\.]}}
// CHECK_LIB_OPTNONE: Function Attrs: {{.*}}noinline
// CHECK_LIB_OPTNONE: define void @main10() {{[a-z_ ]*}}[[EntryAttr]]
// Make sure internal function is not inlined when optimization is disabled
// CHECK_LIB_OPTNONE: call void @_Z6main10v

// CHECK_CS_OPTNONE_NOPASS: Function Attrs: {{.*}}noinline
// CHECK_CS_OPTNONE_NOPASS: define void @main10() {{[a-z_ ]*}}[[EntryAttr]]
// Make sure internal function is not inlined when optimization is disabled
// CHECK_CS_OPTNONE_NOPASS: call void @_Z6main10v

// CHECK_CS_OPTNONE_PASS: Function Attrs: {{.*}}noinline
// CHECK_CS_OPTNONE_PASS: define void @main10() {{[a-z_ ]*}}[[EntryAttr]]
// Check internal function is inlined as optimization is enabled when LLVM passes
// are enabled
// CHECK_CS_OPTNONE_PASS: _Z6main10v.exit:

// CHECK_OPT: Function Attrs: {{.*}}noinline
// CHECK_OPT: define void @main10() {{[a-z_ ]*}}[[EntryAttr]]
// Make sure internal function is inlined as optimization is enabled
// CHECK_OPT: _Z6main10v.exit:
// CHECK: ret void

[numthreads(1,1,1)]
@@ -111,6 +200,16 @@ void main10() {
main(10);
}

// NOINLINE: attributes [[IntAttr]] = {{.*}} alwaysinline
// CHECK: attributes [[ExtAttr]] = {{.*}} alwaysinline
// CHECK: attributes [[EntryAttr]] = {{.*}} noinline
// CHECK_LIB_OPTNONE: attributes [[ExtAttr]] = {{.*}} optnone
// CHECK_LIB_OPTNONE: attributes [[ExportAttr]] = {{.*}} optnone

// CHECK_CS_OPTNONE_NOPASS: attributes [[ExtAttr]] ={{.*}} alwaysinline
// CHECK_CS_OPTNONE_NOPASS: attributes [[EntryAttr]] = {{.*}} noinline

// CHECK_CS_OPTNONE_PASS: attributes [[ExtAttr]] ={{.*}} alwaysinline
// CHECK_CS_OPTNONE_PASS: attributes [[EntryAttr]] = {{.*}} noinline

// CHECK_OPT: attributes [[SwapOptAttr]] ={{.*}} alwaysinline
// CHECK_OPT: attributes [[BubOptAttr]] ={{.*}} alwaysinline
// CHECK_OPT: attributes [[RemOptAttr]] ={{.*}} alwaysinline
// CHECK_OPT: attributes [[EntryAttr]] ={{.*}} noinline
6 changes: 4 additions & 2 deletions clang/test/CodeGenHLSL/this-assignment-overload.hlsl
@@ -25,7 +25,7 @@ void main() {
}

// This test makes a probably safe assumption that HLSL 202x includes operator overloading for assignment operators.
// CHECK: define linkonce_odr noundef i32 @_ZN4Pair8getFirstEv(ptr noundef nonnull align 4 dereferenceable(8) %this) #0 align 2 {
// CHECK: define linkonce_odr noundef i32 @_ZN4Pair8getFirstEv(ptr noundef nonnull align 4 dereferenceable(8) %this) [[Attr:\#[0-9]+]] align 2 {
// CHECK-NEXT:entry:
// CHECK-NEXT:%this.addr = alloca ptr, align 4
// CHECK-NEXT:%Another = alloca %struct.Pair, align 4
@@ -42,7 +42,7 @@ void main() {
// CHECK-NEXT:%0 = load i32, ptr %First2, align 4
// CHECK-NEXT:ret i32 %0

// CHECK: define linkonce_odr noundef i32 @_ZN4Pair9getSecondEv(ptr noundef nonnull align 4 dereferenceable(8) %this) #0 align 2 {
// CHECK: define linkonce_odr noundef i32 @_ZN4Pair9getSecondEv(ptr noundef nonnull align 4 dereferenceable(8) %this) [[Attr]] align 2 {
// CHECK-NEXT:entry:
// CHECK-NEXT:%this.addr = alloca ptr, align 4
// CHECK-NEXT:%agg.tmp = alloca %struct.Pair, align 4
@@ -53,3 +53,5 @@ void main() {
// CHECK-NEXT:%Second = getelementptr inbounds nuw %struct.Pair, ptr %this1, i32 0, i32 1
// CHECK-NEXT:%0 = load i32, ptr %Second, align 4
// CHECK-NEXT:ret i32 %0

// CHECK: attributes [[Attr]] = {{.*}}alwaysinline