Commit 8d78ac2

jhuber6 authored and tstellar committed
[OpenMP] Fix PR51349: Remove AlwaysInline for if regions.

After D94315 we add the `NoInline` attribute to the outlined function to handle data environments in the OpenMP if clause. This conflicted with the `AlwaysInline` attribute added to the outlined function for better performance in D106799. The data environments should ideally not require NoInline, but for now this fixes PR51349.

Reviewed By: mikerice

Differential Revision: https://reviews.llvm.org/D107649

(cherry picked from commit 41a6b50)
1 parent d811546 commit 8d78ac2

2 files changed: 40 additions and 1 deletion

clang/lib/CodeGen/CGOpenMPRuntime.cpp

Lines changed: 2 additions & 1 deletion
@@ -2120,11 +2120,12 @@ void CGOpenMPRuntime::emitParallelCall(CodeGenFunction &CGF, SourceLocation Loc,
     OutlinedFnArgs.append(CapturedVars.begin(), CapturedVars.end());

     // Ensure we do not inline the function. This is trivially true for the ones
-    // passed to __kmpc_fork_call but the ones calles in serialized regions
+    // passed to __kmpc_fork_call but the ones called in serialized regions
     // could be inlined. This is not a perfect but it is closer to the invariant
     // we want, namely, every data environment starts with a new function.
     // TODO: We should pass the if condition to the runtime function and do the
     // handling there. Much cleaner code.
+    OutlinedFn->removeFnAttr(llvm::Attribute::AlwaysInline);
     OutlinedFn->addFnAttr(llvm::Attribute::NoInline);
     RT.emitOutlinedFunctionCall(CGF, Loc, OutlinedFn, OutlinedFnArgs);
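
The one-line change above can be illustrated with a toy model. This is a minimal sketch and does not use the LLVM API: `ToyFunction`, `Attr`, `hasConsistentInlineAttrs`, and the two `markNoInline*` helpers are hypothetical names introduced here. It assumes only what the commit message states, namely that `AlwaysInline` and `NoInline` conflict when both sit on the same outlined function.

```cpp
#include <cassert>
#include <set>

// Toy stand-in for a function's attribute list. NOT the LLVM API; every name
// here is hypothetical and only mirrors the intent of the diff.
enum class Attr { AlwaysInline, NoInline, NoUnwind };

struct ToyFunction {
  std::set<Attr> attrs;
  void addFnAttr(Attr a) { attrs.insert(a); }
  void removeFnAttr(Attr a) { attrs.erase(a); }
  bool hasFnAttr(Attr a) const { return attrs.count(a) != 0; }
};

// Assumed consistency rule: a function must not carry both inline hints.
bool hasConsistentInlineAttrs(const ToyFunction &f) {
  return !(f.hasFnAttr(Attr::AlwaysInline) && f.hasFnAttr(Attr::NoInline));
}

// Before the fix: NoInline is added on top of an existing AlwaysInline.
ToyFunction markNoInlineBeforeFix(ToyFunction f) {
  f.addFnAttr(Attr::NoInline);
  return f;
}

// After the fix: AlwaysInline is dropped first, then NoInline is added.
ToyFunction markNoInlineAfterFix(ToyFunction f) {
  f.removeFnAttr(Attr::AlwaysInline);
  f.addFnAttr(Attr::NoInline);
  return f;
}

// Self-check covering both orderings.
void demo() {
  ToyFunction outlined;
  outlined.addFnAttr(Attr::AlwaysInline);
  assert(!hasConsistentInlineAttrs(markNoInlineBeforeFix(outlined)));
  assert(hasConsistentInlineAttrs(markNoInlineAfterFix(outlined)));
}
```

Calling `demo()` exercises both paths: the pre-fix ordering leaves the toy function with both attributes, while the post-fix ordering leaves only `NoInline`.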

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --function-signature --check-attributes --include-generated-funcs
+// RUN: %clang_cc1 -x c++ -O1 -fopenmp-version=45 -disable-llvm-optzns -verify -fopenmp -triple x86_64-unknown-linux -emit-llvm %s -o - | FileCheck %s --check-prefix=CHECK
+// expected-no-diagnostics
+
+#ifndef HEADER
+#define HEADER
+
+void foo() {
+#pragma omp parallel if(0)
+  ;
+}
+
+#endif
+// CHECK: Function Attrs: mustprogress nounwind
+// CHECK-LABEL: define {{[^@]+}}@_Z3foov
+// CHECK-SAME: () #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[DOTTHREADID_TEMP_:%.*]] = alloca i32, align 4
+// CHECK-NEXT: [[DOTBOUND_ZERO_ADDR:%.*]] = alloca i32, align 4
+// CHECK-NEXT: store i32 0, i32* [[DOTBOUND_ZERO_ADDR]], align 4
+// CHECK-NEXT: [[TMP0:%.*]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @[[GLOB1:[0-9]+]])
+// CHECK-NEXT: call void @__kmpc_serialized_parallel(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]])
+// CHECK-NEXT: store i32 [[TMP0]], i32* [[DOTTHREADID_TEMP_]], align 4, !tbaa [[TBAA3:![0-9]+]]
+// CHECK-NEXT: call void @.omp_outlined.(i32* [[DOTTHREADID_TEMP_]], i32* [[DOTBOUND_ZERO_ADDR]]) #[[ATTR2:[0-9]+]]
+// CHECK-NEXT: call void @__kmpc_end_serialized_parallel(%struct.ident_t* @[[GLOB1]], i32 [[TMP0]])
+// CHECK-NEXT: ret void
+//
+//
+// CHECK: Function Attrs: noinline norecurse nounwind
+// CHECK-LABEL: define {{[^@]+}}@.omp_outlined.
+// CHECK-SAME: (i32* noalias [[DOTGLOBAL_TID_:%.*]], i32* noalias [[DOTBOUND_TID_:%.*]]) #[[ATTR1:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[DOTGLOBAL_TID__ADDR:%.*]] = alloca i32*, align 8
+// CHECK-NEXT: [[DOTBOUND_TID__ADDR:%.*]] = alloca i32*, align 8
+// CHECK-NEXT: store i32* [[DOTGLOBAL_TID_]], i32** [[DOTGLOBAL_TID__ADDR]], align 8, !tbaa [[TBAA7:![0-9]+]]
+// CHECK-NEXT: store i32* [[DOTBOUND_TID_]], i32** [[DOTBOUND_TID__ADDR]], align 8, !tbaa [[TBAA7]]
+// CHECK-NEXT: ret void
+//
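
The call sequence that the CHECK lines expect for `#pragma omp parallel if(0)` can be read as ordinary C++. The following is a rough sketch only: the `stub_*` functions and `omp_outlined` are hypothetical stand-ins that record call order, not the real `__kmpc_*` runtime entry points, and `foo_serialized` is an illustrative name for the shape of `@_Z3foov` in the expected IR.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order of calls so the sequence can be inspected afterwards.
static std::vector<std::string> trace;

// Hypothetical stand-ins for the runtime calls named in the CHECK lines.
int stub_global_thread_num() {
  trace.push_back("global_thread_num");
  return 0;  // assumed thread id for this sketch
}
void stub_serialized_parallel(int /*tid*/) {
  trace.push_back("serialized_parallel");
}
void stub_end_serialized_parallel(int /*tid*/) {
  trace.push_back("end_serialized_parallel");
}

// Stand-in for @.omp_outlined.; the region body in the test is empty.
void omp_outlined(int *global_tid, int *bound_tid) {
  (void)global_tid;
  (void)bound_tid;
  trace.push_back("omp_outlined");
}

// Mirrors the expected IR: the outlined function is called directly,
// bracketed by the serialized-parallel begin and end calls.
void foo_serialized() {
  int thread_id_temp;
  int bound_zero_addr = 0;
  int tid = stub_global_thread_num();
  stub_serialized_parallel(tid);
  thread_id_temp = tid;
  omp_outlined(&thread_id_temp, &bound_zero_addr);
  stub_end_serialized_parallel(tid);
}
```

Because the outlined function is reached through a plain call on this path rather than through `__kmpc_fork_call`, it is a candidate for inlining, which is why the test checks that `@.omp_outlined.` carries `noinline`.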
