Skip to content

Commit e0f9bcc

Browse files
ilovepinickdesaulniersepilk
committed
[clang] Limit lifetimes of temporaries to the full expression
We have several issues describing suboptimal stack usage related to the lifetimes of temporary objects, such as #68747, #43598, and #109204. Previously, https://reviews.llvm.org/D74094 tried to address this. In that review, a few issues were brought up, particularly a concern about the lifetimes of the temporaries needing to be extended to end of the full expression. While there are arguably more optimal lifetime bounds we could enforce, for now we can conservatively make them extend to the end of the full expression, and later refine the optimization to use tighter bounds (or perhaps a better mechanism in the middle end?). Fixes #68747 Co-authored-by: Nick Desaulniers <[email protected]> Co-authored-by: Erik Pilkington <[email protected]>
1 parent 540fd18 commit e0f9bcc

File tree

9 files changed

+250
-1
lines changed

9 files changed

+250
-1
lines changed

clang/docs/ReleaseNotes.rst

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -86,6 +86,15 @@ Potentially Breaking Changes
8686
options-related code has been moved out of the Driver into a separate library.
8787
- The ``clangFrontend`` library no longer depends on ``clangDriver``, which may
8888
break downstream projects that relied on this transitive dependency.
89+
- Clang is now more precise with regards to the lifetime of temporary objects
90+
such as when aggregates are passed by value to a function, resulting in
91+
better sharing of stack slots and reduced stack usage. This change can lead
92+
to use-after-scope related issues in code that unintentionally relied on the
93+
previous behavior. If recompiling with ``-fsanitize=address`` shows a
94+
use-after-scope warning, then this is likely the case, and the report printed
95+
should be able to help users pinpoint where the use-after-scope is occurring.
96+
Users can use ``-Xclang -sloppy-temporary-lifetimes`` to retain the old
97+
behavior until they are able to find and resolve issues in their code.
8998

9099
C/C++ Language Potentially Breaking Changes
91100
-------------------------------------------

clang/include/clang/Basic/CodeGenOptions.def

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -475,6 +475,10 @@ ENUM_CODEGENOPT(ZeroCallUsedRegs, ZeroCallUsedRegsKind,
475475
/// non-deleting destructors. (No effect on Microsoft ABI.)
476476
CODEGENOPT(CtorDtorReturnThis, 1, 0, Benign)
477477

478+
/// Set via -Xclang -sloppy-temporary-lifetimes to disable emission of lifetime
479+
/// marker intrinsic calls.
480+
CODEGENOPT(NoLifetimeMarkersForTemporaries, 1, 0, Benign)
481+
478482
/// Enables emitting Import Call sections on supported targets that can be used
479483
/// by the Windows kernel to enable import call optimization.
480484
CODEGENOPT(ImportCallOptimization, 1, 0, Benign)

clang/include/clang/Options/Options.td

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -8141,6 +8141,11 @@ def import_call_optimization : Flag<["-"], "import-call-optimization">,
81418141
def replaceable_function: Joined<["-"], "loader-replaceable-function=">,
81428142
MarshallingInfoStringVector<CodeGenOpts<"LoaderReplaceableFunctionNames">>;
81438143

8144+
def sloppy_temporary_lifetimes
8145+
: Flag<["-"], "sloppy-temporary-lifetimes">,
8146+
HelpText<"Don't emit lifetime markers for temporary objects">,
8147+
MarshallingInfoFlag<CodeGenOpts<"NoLifetimeMarkersForTemporaries">>;
8148+
81448149
} // let Visibility = [CC1Option]
81458150

81468151
//===----------------------------------------------------------------------===//

clang/lib/CodeGen/CGCall.cpp

Lines changed: 18 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -43,6 +43,7 @@
4343
#include "llvm/IR/IntrinsicInst.h"
4444
#include "llvm/IR/Intrinsics.h"
4545
#include "llvm/IR/Type.h"
46+
#include "llvm/Support/TypeSize.h"
4647
#include "llvm/Transforms/Utils/Local.h"
4748
#include <optional>
4849
using namespace clang;
@@ -4947,7 +4948,23 @@ void CodeGenFunction::EmitCallArg(CallArgList &args, const Expr *E,
49474948
return;
49484949
}
49494950

4950-
args.add(EmitAnyExprToTemp(E), type);
4951+
AggValueSlot ArgSlot = AggValueSlot::ignored();
4952+
// If the callee returns a reference, skip this stack saving optimization;
4953+
// we don't want to prematurely end the lifetime of the temporary. It may be
4954+
// possible to still perform this optimization if the return type is a
4955+
// reference to a different type than the parameter.
4956+
if (hasAggregateEvaluationKind(E->getType())) {
4957+
RawAddress ArgSlotAlloca = Address::invalid();
4958+
ArgSlot = CreateAggTemp(E->getType(), "agg.tmp", &ArgSlotAlloca);
4959+
4960+
// Emit a lifetime start/end for this temporary at the end of the full
4961+
// expression.
4962+
if (!CGM.getCodeGenOpts().NoLifetimeMarkersForTemporaries &&
4963+
EmitLifetimeStart(ArgSlotAlloca.getPointer()))
4964+
pushFullExprCleanup<CallLifetimeEnd>(NormalAndEHCleanup, ArgSlotAlloca);
4965+
}
4966+
4967+
args.add(EmitAnyExpr(E, ArgSlot), type);
49514968
}
49524969

49534970
QualType CodeGenFunction::getVarArgType(const Expr *Arg) {
Lines changed: 98 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,98 @@
1+
// RUN: %clang -cc1 -triple x86_64-apple-macos -O1 -disable-llvm-passes %s \
2+
// RUN: -emit-llvm -o - | FileCheck %s --implicit-check-not=llvm.lifetime
3+
// RUN: %clang -cc1 -xc++ -std=c++17 -triple x86_64-apple-macos -O1 \
4+
// RUN: -disable-llvm-passes %s -emit-llvm -o - -Wno-return-type-c-linkage | \
5+
// RUN: FileCheck %s --implicit-check-not=llvm.lifetime --check-prefixes=CHECK,CXX
6+
// RUN: %clang -cc1 -xobjective-c -triple x86_64-apple-macos -O1 \
7+
// RUN: -disable-llvm-passes %s -emit-llvm -o - | \
8+
// RUN: FileCheck %s --implicit-check-not=llvm.lifetime --check-prefixes=CHECK,OBJC
9+
// RUN: %clang -cc1 -triple x86_64-apple-macos -O1 -disable-llvm-passes %s \
10+
// RUN: -emit-llvm -o - -sloppy-temporary-lifetimes | \
11+
// RUN: FileCheck %s --implicit-check-not=llvm.lifetime --check-prefixes=SLOPPY
12+
13+
typedef struct { int x[100]; } aggregate;
14+
15+
#ifdef __cplusplus
16+
extern "C" {
17+
#endif
18+
19+
void takes_aggregate(aggregate);
20+
aggregate gives_aggregate();
21+
22+
// CHECK-LABEL: define void @t1
23+
void t1() {
24+
takes_aggregate(gives_aggregate());
25+
26+
// CHECK: [[AGGTMP:%.*]] = alloca %struct.aggregate, align 8
27+
// CHECK: call void @llvm.lifetime.start.p0(ptr [[AGGTMP]])
28+
// CHECK: call void{{.*}} @gives_aggregate(ptr{{.*}}sret(%struct.aggregate) align 4 [[AGGTMP]])
29+
// CHECK: call void @takes_aggregate(ptr noundef byval(%struct.aggregate) align 8 [[AGGTMP]])
30+
// CHECK: call void @llvm.lifetime.end.p0(ptr [[AGGTMP]])
31+
32+
// SLOPPY: [[AGGTMP:%.*]] = alloca %struct.aggregate, align 8
33+
// SLOPPY-NEXT: call void (ptr, ...) @gives_aggregate(ptr{{.*}}sret(%struct.aggregate) align 4 [[AGGTMP]])
34+
// SLOPPY-NEXT: call void @takes_aggregate(ptr noundef byval(%struct.aggregate) align 8 [[AGGTMP]])
35+
}
36+
37+
// CHECK: declare {{.*}}llvm.lifetime.start
38+
// CHECK: declare {{.*}}llvm.lifetime.end
39+
40+
#ifdef __cplusplus
41+
// CXX: define void @t2
42+
void t2() {
43+
struct S {
44+
S(aggregate) {}
45+
};
46+
S{gives_aggregate()};
47+
48+
// CXX: [[AGG:%.*]] = alloca %struct.aggregate
49+
// CXX: call void @llvm.lifetime.start.p0(ptr [[AGG]]
50+
// CXX: call void @gives_aggregate(ptr{{.*}}sret(%struct.aggregate) align 4 [[AGG]])
51+
// CXX: call void @_ZZ2t2EN1SC1E9aggregate(ptr {{.*}}, ptr {{.*}} byval(%struct.aggregate) align 8 [[AGG]])
52+
// CXX: call void @llvm.lifetime.end.p0(ptr [[AGG]]
53+
}
54+
55+
struct Dtor {
56+
~Dtor();
57+
};
58+
59+
void takes_dtor(Dtor);
60+
Dtor gives_dtor();
61+
62+
// CXX: define void @t3
63+
void t3() {
64+
takes_dtor(gives_dtor());
65+
66+
// CXX: [[AGG:%.*]] = alloca %struct.Dtor
67+
// CXX: call void @llvm.lifetime.start.p0(ptr [[AGG]])
68+
// CXX: call void @gives_dtor(ptr{{.*}}sret(%struct.Dtor) align 1 [[AGG]])
69+
// CXX: call void @takes_dtor(ptr noundef [[AGG]])
70+
// CXX: call void @_ZN4DtorD1Ev(ptr {{.*}} [[AGG]])
71+
// CXX: call void @llvm.lifetime.end.p0(ptr [[AGG]])
72+
// CXX: ret void
73+
}
74+
75+
#endif
76+
77+
#ifdef __OBJC__
78+
79+
@interface X
80+
-m:(aggregate)x;
81+
@end
82+
83+
// OBJC: define void @t4
84+
void t4(X *x) {
85+
[x m: gives_aggregate()];
86+
87+
// OBJC: [[AGG:%.*]] = alloca %struct.aggregate
88+
// OBJC: call void @llvm.lifetime.start.p0(ptr [[AGG]]
89+
// OBJC: call void{{.*}} @gives_aggregate(ptr{{.*}}sret(%struct.aggregate) align 4 [[AGGTMP]])
90+
// OBJC: call {{.*}}@objc_msgSend
91+
// OBJC: call void @llvm.lifetime.end.p0(ptr [[AGG]]
92+
}
93+
94+
#endif
95+
96+
#ifdef __cplusplus
97+
}
98+
#endif
Lines changed: 89 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,89 @@
1+
// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -O1 -emit-codegen-only -Rpass-analysis=prologepilog %s -verify=x86-precise
2+
// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -O1 -emit-codegen-only -Rpass-analysis=prologepilog -sloppy-temporary-lifetimes %s -verify=x86-sloppy
3+
4+
// RUN: %clang_cc1 -triple aarch64-unknown-linux-gnu -O1 -emit-codegen-only -Rpass-analysis=prologepilog %s -verify=aarch64-precise
5+
// RUN: %clang_cc1 -triple aarch64-unknown-linux-gnu -O1 -emit-codegen-only -Rpass-analysis=prologepilog -sloppy-temporary-lifetimes %s -verify=aarch64-sloppy
6+
7+
// RUN: %clang_cc1 -triple riscv64-unknown-linux-gnu -O1 -emit-codegen-only -Rpass-analysis=prologepilog %s -verify=riscv-precise
8+
// RUN: %clang_cc1 -triple riscv64-unknown-linux-gnu -O1 -emit-codegen-only -Rpass-analysis=prologepilog -sloppy-temporary-lifetimes %s -verify=riscv-sloppy
9+
10+
// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -O1 -emit-codegen-only -Rpass-analysis=prologepilog %s -verify=x86-precise -xc++
11+
// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -O1 -emit-codegen-only -Rpass-analysis=prologepilog -sloppy-temporary-lifetimes %s -verify=x86-sloppy -xc++
12+
13+
// RUN: %clang_cc1 -triple aarch64-unknown-linux-gnu -O1 -emit-codegen-only -Rpass-analysis=prologepilog %s -verify=aarch64-precise -xc++
14+
// RUN: %clang_cc1 -triple aarch64-unknown-linux-gnu -O1 -emit-codegen-only -Rpass-analysis=prologepilog -sloppy-temporary-lifetimes %s -verify=aarch64-sloppy -xc++
15+
16+
// RUN: %clang_cc1 -triple riscv64-unknown-linux-gnu -O1 -emit-codegen-only -Rpass-analysis=prologepilog %s -verify=riscv-precise -xc++
17+
// RUN: %clang_cc1 -triple riscv64-unknown-linux-gnu -O1 -emit-codegen-only -Rpass-analysis=prologepilog -sloppy-temporary-lifetimes %s -verify=riscv-sloppy -xc++
18+
19+
20+
typedef struct { char x[32]; } A;
21+
typedef struct { char *w, *x, *y, *z; } B;
22+
23+
void useA(A);
24+
void useB(B);
25+
A genA(void);
26+
B genB(void);
27+
28+
void t1(int c) {
29+
// x86-precise-remark@-1 {{40 stack bytes}}
30+
// x86-sloppy-remark@-2 {{72 stack bytes}}
31+
// aarch64-precise-remark@-3 {{48 stack bytes}}
32+
// aarch64-sloppy-remark@-4 {{80 stack bytes}}
33+
// riscv-precise-remark@-5 {{48 stack bytes}}
34+
// riscv-sloppy-remark@-6 {{80 stack bytes}}
35+
36+
if (c)
37+
useA(genA());
38+
else
39+
useA(genA());
40+
}
41+
42+
void t2(void) {
43+
// x86-precise-remark@-1 {{72 stack bytes}}
44+
// x86-sloppy-remark@-2 {{72 stack bytes}}
45+
// aarch64-precise-remark@-3 {{80 stack bytes}}
46+
// aarch64-sloppy-remark@-4 {{80 stack bytes}}
47+
// riscv-precise-remark@-5 {{80 stack bytes}}
48+
// riscv-sloppy-remark@-6 {{80 stack bytes}}
49+
50+
useA(genA());
51+
useA(genA());
52+
}
53+
54+
void t3(void) {
55+
// x86-precise-remark@-1 {{72 stack bytes}}
56+
// x86-sloppy-remark@-2 {{72 stack bytes}}
57+
// aarch64-precise-remark@-3 {{80 stack bytes}}
58+
// aarch64-sloppy-remark@-4 {{80 stack bytes}}
59+
// riscv-precise-remark@-5 {{80 stack bytes}}
60+
// riscv-sloppy-remark@-6 {{80 stack bytes}}
61+
62+
useB(genB());
63+
useB(genB());
64+
}
65+
66+
#ifdef __cplusplus
67+
struct C {
68+
char x[24];
69+
char *ptr;
70+
~C() {};
71+
};
72+
73+
void useC(C);
74+
C genC(void);
75+
76+
// This case works in C++, since its AST is structured slightly differently
77+
// than it is in C (CompundStmt/ExprWithCleanup/CallExpr vs CompundStmt/CallExpr).
78+
void t4() {
79+
// x86-precise-remark@-1 {{40 stack bytes}}
80+
// x86-sloppy-remark@-2 {{72 stack bytes}}
81+
// aarch64-precise-remark@-3 {{48 stack bytes}}
82+
// aarch64-sloppy-remark@-4 {{80 stack bytes}}
83+
// riscv-precise-remark@-5 {{48 stack bytes}}
84+
// riscv-sloppy-remark@-6 {{80 stack bytes}}
85+
86+
useC(genC());
87+
useC(genC());
88+
}
89+
#endif
Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,19 @@
1+
// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -emit-llvm -O3 -disable-llvm-passes -o - %s | FileCheck %s
2+
3+
struct A {
4+
float x, y, z, w;
5+
};
6+
7+
void foo(A a);
8+
9+
// CHECK-LABEL: @_Z4testv
10+
// CHECK: [[A:%.*]] = alloca [[STRUCT_A:%.*]], align 4, addrspace(5)
11+
// CHECK-NEXT: [[AGG_TMP:%.*]] = alloca [[STRUCT_A]], align 4, addrspace(5)
12+
// CHECK-NEXT: [[A_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[A]] to ptr
13+
// CHECK-NEXT: [[AGG_TMP_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[AGG_TMP]] to ptr
14+
// CHECK-NEXT: call void @llvm.lifetime.start.p5(i64 16, ptr addrspace(5) [[A]]) #[[ATTR4:[0-9]+]]
15+
// CHECK-NEXT: call void @llvm.lifetime.start.p5(i64 16, ptr addrspace(5) [[AGG_TMP]]) #[[ATTR4]]
16+
void test() {
17+
A a;
18+
foo(a);
19+
}

clang/test/CodeGenCXX/stack-reuse-miscompile.cpp

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -26,6 +26,8 @@ const char * f(S s)
2626
// CHECK: [[T2:%.*]] = alloca %class.T, align 4
2727
// CHECK: [[T3:%.*]] = alloca %class.T, align 4
2828
//
29+
// CHECK: [[AGG:%.*]] = alloca %class.S, align 4
30+
//
2931
// FIXME: We could defer starting the lifetime of the return object of concat
3032
// until the call.
3133
// CHECK: call void @llvm.lifetime.start.p0(ptr [[T1]])
@@ -34,10 +36,12 @@ const char * f(S s)
3436
// CHECK: [[T4:%.*]] = call noundef ptr @_ZN1TC1EPKc(ptr {{[^,]*}} [[T2]], ptr noundef @.str)
3537
//
3638
// CHECK: call void @llvm.lifetime.start.p0(ptr [[T3]])
39+
// CHECK: call void @llvm.lifetime.start.p0(ptr [[AGG]])
3740
// CHECK: [[T5:%.*]] = call noundef ptr @_ZN1TC1E1S(ptr {{[^,]*}} [[T3]], [2 x i32] %{{.*}})
3841
//
3942
// CHECK: call void @_ZNK1T6concatERKS_(ptr dead_on_unwind writable sret(%class.T) align 4 [[T1]], ptr {{[^,]*}} [[T2]], ptr noundef nonnull align 4 dereferenceable(16) [[T3]])
4043
// CHECK: [[T6:%.*]] = call noundef ptr @_ZNK1T3strEv(ptr {{[^,]*}} [[T1]])
44+
// CHECK: call void @llvm.lifetime.end.p0(ptr [[AGG]])
4145
//
4246
// CHECK: call void @llvm.lifetime.end.p0(
4347
// CHECK: call void @llvm.lifetime.end.p0(

clang/test/CodeGenCoroutines/pr59181.cpp

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -49,6 +49,7 @@ void foo() {
4949
}
5050

5151
// CHECK: cleanup.cont:{{.*}}
52+
// CHECK-NEXT: call void @llvm.lifetime.start.p0(ptr [[AGG:%agg.tmp]])
5253
// CHECK-NEXT: load i8
5354
// CHECK-NEXT: trunc
5455
// CHECK-NEXT: store i1 false
@@ -57,3 +58,6 @@ void foo() {
5758
// CHECK-NOT: call void @llvm.lifetime
5859
// CHECK: call void @llvm.coro.await.suspend.void(
5960
// CHECK-NEXT: %{{[0-9]+}} = call i8 @llvm.coro.suspend(
61+
62+
// CHECK-LABEL: cond.end:
63+
// check call @llvm.lifetime.end.p0(ptr [[AGG]])

0 commit comments

Comments
 (0)