-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[RFC][clang][BPF] Make trivial uninit var value to be 0 #125601
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@llvm/pr-subscribers-clang Author: None (yonghong-song) ChangesMarc Suñé (Isovalent, part of Cisco) reported an issue where an uninitialized variable caused generated bpf prog binary code not working as expected. The reproducer is in [1] where the flags “-Wall -Werror” are enabled, but there is no warning and compiler may take advantage of uninit variable to do aggressive optimization. In the BPF context, the generated buggy code will be load into the kernel and likely verification will pass. When the bpf program actually runs, its result may not be expected and user will then need to check source or asm code to find reasons. Since compiler may take advantage of uninit variables for aggressive optimization, users may be confused with final code without realizing this is due to uninit variables. The ideal case is for compiler to issue meaning info about actual uninit variable which impacts transformation. Currently clang (and gcc) may miss such uninit variables, esp. with cross function boundaries. See discuss in [2]. The llvm/gcc undef sanitizer is another source to detect such undef issues. For bpf target, this means potential complexity for verifier and bpf runtime. Another idea is to make -ftrivial-auto-var-init=0 as the default for bpf target. This way, when bpf prog produced unexpected results, users can check/understand asm codes easier since compiler won't be able to do aggressive optimization due to uninit variables. This patch makes -ftrivial-auto-var-init=0 as the default for bpf target. For example, for code, With -ftrivial-auto-var-init=0 on by default, two bpf selftests failed and below the fix: Not sure whether we should introduce a bpf specific flag to disable -ftrivial-auto-var-init=0 or not. I guess probably not. But the kernel change still needed. [1] https://github.com/msune/clang_bpf/blob/main/Makefile#L3 Full diff: https://github.com/llvm/llvm-project/pull/125601.diff 3 Files Affected:
diff --git a/clang/lib/Basic/Targets/BPF.cpp b/clang/lib/Basic/Targets/BPF.cpp
index f4684765b7ffb32..231a1dc2457bc7d 100644
--- a/clang/lib/Basic/Targets/BPF.cpp
+++ b/clang/lib/Basic/Targets/BPF.cpp
@@ -81,6 +81,14 @@ void BPFTargetInfo::fillValidCPUList(SmallVectorImpl<StringRef> &Values) const {
Values.append(std::begin(ValidCPUNames), std::end(ValidCPUNames));
}
+void BPFTargetInfo::adjust(DiagnosticsEngine &Diags, LangOptions &Opts) {
+ TargetInfo::adjust(Diags, Opts);
+
+ if (Opts.getTrivialAutoVarInit() ==
+ LangOptions::TrivialAutoVarInitKind::Uninitialized)
+ Opts.setTrivialAutoVarInit(LangOptions::TrivialAutoVarInitKind::Zero);
+}
+
ArrayRef<Builtin::Info> BPFTargetInfo::getTargetBuiltins() const {
return llvm::ArrayRef(BuiltinInfo,
clang::BPF::LastTSBuiltin - Builtin::FirstTSBuiltin);
diff --git a/clang/lib/Basic/Targets/BPF.h b/clang/lib/Basic/Targets/BPF.h
index 27a4b5f31497024..b12a09fb2db5c14 100644
--- a/clang/lib/Basic/Targets/BPF.h
+++ b/clang/lib/Basic/Targets/BPF.h
@@ -60,6 +60,8 @@ class LLVM_LIBRARY_VISIBILITY BPFTargetInfo : public TargetInfo {
ArrayRef<Builtin::Info> getTargetBuiltins() const override;
+ void adjust(DiagnosticsEngine &Diags, LangOptions &Opts) override;
+
std::string_view getClobbers() const override { return ""; }
BuiltinVaListKind getBuiltinVaListKind() const override {
diff --git a/clang/test/CodeGen/bpf-auto-var-init.c b/clang/test/CodeGen/bpf-auto-var-init.c
new file mode 100644
index 000000000000000..09023279e1eeeb7
--- /dev/null
+++ b/clang/test/CodeGen/bpf-auto-var-init.c
@@ -0,0 +1,30 @@
+// REQUIRES: bpf-registered-target
+// RUN: %clang_cc1 -triple bpf -O2 -emit-llvm %s -o - | FileCheck %s
+
+int foo1() {
+ int val;
+ return val;
+}
+// CHECK: ret i32 0
+
+int foo2() {
+ int val[4];
+ return val[2];
+}
+// CHECK: ret i32 0
+
+struct val_t {
+ int val;
+};
+
+int foo3() {
+ struct val_t v;
+ return v.val;
+}
+// CHECK: ret i32 0
+
+int foo4() {
+ int val = 5;
+ return val;
+}
+// CHECK: ret i32 5
|
|
what veristat says before/after? re: selftests there should be a flag to disable this, so the tests can remain as-is (with only Makefile change). There is also -ftrivial-auto-var-init-max-size= flag. What is the default? |
Marc Suñé (Isovalent, part of Cisco) reported an issue where an
uninitialized variable caused generated bpf prog binary code not
working as expected. The reproducer is in [1] where the flags
“-Wall -Werror” are enabled, but there is no warning and compiler
may take advantage of uninit variable to do aggressive optimization.
In the BPF context, the generated buggy code will be load into
the kernel and likely verification will pass. When the bpf program
actually runs, its result may not be expected and user will then need
to check source or asm code to find reasons. Since compiler may
take advantage of uninit variables for aggressive optimization,
users may be confused with final code without realizing this is
due to uninit variables.
The ideal case is for compiler to issue meaning info about
*actual* uninit variable which impacts transformation. Currently
clang (and gcc) may miss such uninit variables, esp. with cross
function boundaries. See discuss in [2].
The llvm/gcc undef sanitizer is another source to detect such
undef issues. For bpf target, this means potential complexity for
verifier and bpf runtime.
Another idea is to make -ftrivial-auto-var-init=0 as the default
for bpf target. This way, when bpf prog produced unexpected results,
users can check/understand asm codes easier since compiler won't
be able to do aggressive optimization due to uninit variables.
This patch makes -ftrivial-auto-var-init=0 as the default for bpf target.
For example, for code,
int foo1() {
int val;
return val;
}
With this patch, the eventual IR bofore lowering is
ret i32 0;
Without this patch, it is
ret i32 undef
With -ftrivial-auto-var-init=0 on by default, two bpf selftests failed and
below the fix:
diff --git a/tools/testing/selftests/bpf/progs/dynptr_fail.c b/tools/testing/selftests/bpf/progs/dynptr_fail.c
index bd8f15229f5c..72a65ee365ef 100644
--- a/tools/testing/selftests/bpf/progs/dynptr_fail.c
+++ b/tools/testing/selftests/bpf/progs/dynptr_fail.c
@@ -413,7 +413,11 @@ int invalid_helper1(void *ctx)
/* A dynptr can't be passed into a helper function at a non-zero offset */
SEC("?raw_tp")
+#if __clang_major__ >= 21
+__failure __msg("Expected an initialized dynptr as arg llvm#2")
+#else
__failure __msg("cannot pass in dynptr at an offset=-8")
+#endif
int invalid_helper2(void *ctx)
{
struct bpf_dynptr ptr;
diff --git a/tools/testing/selftests/bpf/progs/verifier_mtu.c b/tools/testing/selftests/bpf/progs/verifier_mtu.c
index 256956ea1ac5..c987b20b39eb 100644
--- a/tools/testing/selftests/bpf/progs/verifier_mtu.c
+++ b/tools/testing/selftests/bpf/progs/verifier_mtu.c
@@ -8,7 +8,9 @@ SEC("tc/ingress")
__description("uninit/mtu: write rejected")
__success
__caps_unpriv(CAP_BPF|CAP_NET_ADMIN)
+#if __clang_major__ < 21
__failure_unpriv __msg_unpriv("invalid read from stack")
+#endif
int tc_uninit_mtu(struct __sk_buff *ctx)
{
__u32 mtu;
Not sure whether we should introduce a bpf specific flag to disable
-ftrivial-auto-var-init=0 or not. I guess probably not. But the
kernel change still needed.
[1] https://github.com/msune/clang_bpf/blob/main/Makefile#L3
[2] https://discourse.llvm.org/t/detect-undefined-behavior-due-to-uninitialized-variables-in-bpf-programs/84116?u=yonghong-song
I tried with latest bpf-next and latest llvm trunk. The following is the difference For cgroup_tcp_skb.c, e.g., the tcphdr is larger so it costs more with initializing to 0. For verifier_array_access.c, the 'key' is initialized.
Will do.
The default is 0 (-ftrivial-auto-var-init-max-size=0) which means there is no limitation.
|
looks like it's a show stopper. The performance impact of initializing everything might be felt for XDP. |
|
I looked at the example shared by Yonghong, the change clang does because of the uninitialized variable access is indeed quite dramatic. It computes a switch over Nevertheless, the issue is not really solved by using zero initialization, as that is not what programmer intended and might be equally as hard to debug. And this is not limited to BPF backend. If we really consider this issue to be a priority for BPF backend, I think we should invest time in better semantic warnings, or otherwise leave things as is. |
3790f38 to
0b1dde3
Compare
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
0b1dde3 to
781c48f
Compare
|
I just updated a patch to change TrivialAutoVarInitMaxSize to 8 if TrivialAutoVarInit is set to Zero by clang bpf target. This should mostly solve the issue for xdp programs. @eddyz87 suggested to improve middle-end compilation to issue proper warnings. This is what I initially wanted to do as well but feel it would be quite complicated esp. related to cross-functions. I am not sure how cross-file checking can be done for bpf programs since effectively, we do not do IR-level merge across files. One thing I tried earlier is to do -fsanitize=undefined for bpf programs (see discourse discussion link in the description). There will be a 'undef' IR asm code to specify as undefined behavior at particular place. Current bpf backend maps 'undef' to a no-op. |
This is to prevent auto init for large data structures, e.g. for xdp prog, iphdr or tcphdr they all more than 8 bytes. If auto init is not really needed, auto init might cause performance regression for those programs.
781c48f to
22a513a
Compare
|
I agree with Eduard. I think we should abandon this patch and make -fsanitize=undefined work. |
|
Okay, I will work on -fsanitize=undefined for bpf then. |
Marc Suñé (Isovalent, part of Cisco) reported an issue where an uninitialized variable caused generated bpf prog binary code not working as expected. The reproducer is in [1] where the flags “-Wall -Werror” are enabled, but there is no warning and compiler may take advantage of uninit variable to do aggressive optimization. In discussion [2], various approaches are discussed, e.g., improving compiler to detect undefined behavior due to uninitialized variables, trying to use ubsan (-fsanitize=undefined), and making -ftrivial-auto-var-init=zero as the bpf default flags. I tried [3] with -ftrivial-auto-var-init=zero and eventually we decided no-go since first it may introduce performance regression and second the prog may still be wrong if the prog expects a non-zero value. The ubsan apprach seems not working as well since it involves runtime callback func ([4]). The approach here is not to do complicate compiler analysis to detect whether where is undef behavior which may impact final codegen. Rather, we relies on compiler to do its normal transformation and at later IR passes stage, a BPF backend pass is inserted to check whether undef behavior is in IR or not. Note that if undef behavior indeed impacts codes, the compiler will discard those related codes with simple 'undef' or 'unreachable'. For example, for the case [1], before SCCPPass, the IR looks like ``` define dso_local i32 @repro(ptr noundef %0) #0 section "classifier" { %2 = alloca %struct.ipv6_opt_hdr, align 8 %3 = tail call i64 (ptr, i32, ...) inttoptr (i64 6 to ptr)(ptr noundef nonnull @repro.____fmt, i32 noundef 6) llvm#2 %4 = tail call ptr asm sideeffect "$0 = *(u32 *)($1 + $2)", "=r,r,i"(ptr %0, i64 76) llvm#2, !srcloc !3 %5 = ptrtoint ptr %4 to i64 %6 = trunc i64 %5 to i32 %7 = tail call i64 (ptr, i32, ...) inttoptr (i64 6 to ptr)(ptr noundef nonnull @icmp6_ndisc_validate.____fmt, i32 noundef 23) llvm#2 call void @llvm.lifetime.start.p0(i64 2, ptr nonnull %2) llvm#2 %8 = getelementptr inbounds nuw i8, ptr %2, i64 1 switch i8 undef, label %51 [ i8 59, label %56 i8 44, label %57 i8 0, label %9 i8 43, label %9 i8 51, label %9 i8 60, label %9 ] 9: ; preds = %1, %1, %1, %1 %10 = sub i32 40, %6 ... ``` Note that 'undef' is used for switch key due to one of early pass LoopFullUnrollPass. After SCCPPass: ``` efine dso_local i32 @repro(ptr noundef %0) #0 section "classifier" { %2 = alloca %struct.ipv6_opt_hdr, align 8 %3 = tail call i64 (ptr, i32, ...) inttoptr (i64 6 to ptr)(ptr noundef nonnull @repro.____fmt, i32 noundef 6) llvm#2 %4 = tail call ptr asm sideeffect "$0 = *(u32 *)($1 + $2)", "=r,r,i"(ptr %0, i64 76) llvm#2, !srcloc !3 %5 = ptrtoint ptr %4 to i64 %6 = trunc i64 %5 to i32 %7 = tail call i64 (ptr, i32, ...) inttoptr (i64 6 to ptr)(ptr noundef nonnull @icmp6_ndisc_validate.____fmt, i32 noundef 23) llvm#2 call void @llvm.lifetime.start.p0(i64 2, ptr nonnull %2) llvm#2 %8 = getelementptr inbounds nuw i8, ptr %2, i64 1 unreachable } ``` For another example, ``` $ cat t.c int foo() { int i[2]; return i[1]; } ``` Before SROAPass pass, ``` define dso_local i32 @foo() #0 { %1 = alloca [2 x i32], align 4 call void @llvm.lifetime.start.p0(i64 8, ptr %1) llvm#2 %2 = getelementptr inbounds [2 x i32], ptr %1, i64 0, i64 1 %3 = load i32, ptr %2, align 4, !tbaa !3 call void @llvm.lifetime.end.p0(i64 8, ptr %1) llvm#2 ret i32 %3 } ``` After SROAPass pass, ``` define dso_local i32 @foo() #0 { ret i32 undef } ``` Besides the above two test cases, the following three patterns are also covered: - It is possible llvm may generate codes where a default branch to 'unreachable' location. Ignore such 'unreachable' instances. See [5] or some comments in [2]. - Handle pattern like __bpf_unreachable (defined in bpf_helpers.h). - Functions with naked attribute will have 'unreachable' at the end of function. Ignore such functions. Tested with bpf selftests and there are no warnings issued. [1] https://github.com/msune/clang_bpf/blob/main/Makefile#L3 [2] https://discourse.llvm.org/t/detect-undefined-behavior-due-to-uninitialized-variables-in-bpf-programs/84116?u=yonghong-song [3] llvm#125601 [4] https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/ubsan/ubsan_interface.inc [5] https://lore.kernel.org/lkml/[email protected]/
Marc Suñé (Isovalent, part of Cisco) reported an issue where an uninitialized variable caused generated bpf prog binary code not working as expected. The reproducer is in [1] where the flags “-Wall -Werror” are enabled, but there is no warning and compiler may take advantage of uninit variable to do aggressive optimization. In discussion [2], various approaches are discussed, e.g., improving compiler to detect undefined behavior due to uninitialized variables, trying to use ubsan (-fsanitize=undefined), and making -ftrivial-auto-var-init=zero as the bpf default flags. I tried [3] with -ftrivial-auto-var-init=zero and eventually we decided no-go since first it may introduce performance regression and second the prog may still be wrong if the prog expects a non-zero value. The ubsan apprach seems not working as well since it involves runtime callback func ([4]). The approach here is not to do complicate compiler analysis to detect whether where is undef behavior which may impact final codegen. Rather, we relies on compiler to do its normal transformation and at later IR passes stage, a BPF backend pass is inserted to check whether undef behavior is in IR or not. Note that if undef behavior indeed impacts codes, the compiler will discard those related codes with simple 'undef' or 'unreachable'. For example, for the case [1], before SCCPPass, the IR looks like ``` define dso_local i32 @repro(ptr noundef %0) #0 section "classifier" { %2 = alloca %struct.ipv6_opt_hdr, align 8 %3 = tail call i64 (ptr, i32, ...) inttoptr (i64 6 to ptr)(ptr noundef nonnull @repro.____fmt, i32 noundef 6) llvm#2 %4 = tail call ptr asm sideeffect "$0 = *(u32 *)($1 + $2)", "=r,r,i"(ptr %0, i64 76) llvm#2, !srcloc !3 %5 = ptrtoint ptr %4 to i64 %6 = trunc i64 %5 to i32 %7 = tail call i64 (ptr, i32, ...) inttoptr (i64 6 to ptr)(ptr noundef nonnull @icmp6_ndisc_validate.____fmt, i32 noundef 23) llvm#2 call void @llvm.lifetime.start.p0(i64 2, ptr nonnull %2) llvm#2 %8 = getelementptr inbounds nuw i8, ptr %2, i64 1 switch i8 undef, label %51 [ i8 59, label %56 i8 44, label %57 i8 0, label %9 i8 43, label %9 i8 51, label %9 i8 60, label %9 ] 9: ; preds = %1, %1, %1, %1 %10 = sub i32 40, %6 ... ``` Note that 'undef' is used for switch key due to one of early pass LoopFullUnrollPass. After SCCPPass: ``` efine dso_local i32 @repro(ptr noundef %0) #0 section "classifier" { %2 = alloca %struct.ipv6_opt_hdr, align 8 %3 = tail call i64 (ptr, i32, ...) inttoptr (i64 6 to ptr)(ptr noundef nonnull @repro.____fmt, i32 noundef 6) llvm#2 %4 = tail call ptr asm sideeffect "$0 = *(u32 *)($1 + $2)", "=r,r,i"(ptr %0, i64 76) llvm#2, !srcloc !3 %5 = ptrtoint ptr %4 to i64 %6 = trunc i64 %5 to i32 %7 = tail call i64 (ptr, i32, ...) inttoptr (i64 6 to ptr)(ptr noundef nonnull @icmp6_ndisc_validate.____fmt, i32 noundef 23) llvm#2 call void @llvm.lifetime.start.p0(i64 2, ptr nonnull %2) llvm#2 %8 = getelementptr inbounds nuw i8, ptr %2, i64 1 unreachable } ``` For another example, ``` $ cat t.c int foo() { int i[2]; return i[1]; } ``` Before SROAPass pass, ``` define dso_local i32 @foo() #0 { %1 = alloca [2 x i32], align 4 call void @llvm.lifetime.start.p0(i64 8, ptr %1) llvm#2 %2 = getelementptr inbounds [2 x i32], ptr %1, i64 0, i64 1 %3 = load i32, ptr %2, align 4, !tbaa !3 call void @llvm.lifetime.end.p0(i64 8, ptr %1) llvm#2 ret i32 %3 } ``` After SROAPass pass, ``` define dso_local i32 @foo() #0 { ret i32 undef } ``` Besides the above two test cases, the following three patterns are also covered: - It is possible llvm may generate codes where a default branch to 'unreachable' location. Ignore such 'unreachable' instances. See [5] or some comments in [2]. - Handle pattern like __bpf_unreachable (defined in bpf_helpers.h). - Functions with naked attribute will have 'unreachable' at the end of function. Ignore such functions. Tested with bpf selftests and there are no warnings issued. [1] https://github.com/msune/clang_bpf/blob/main/Makefile#L3 [2] https://discourse.llvm.org/t/detect-undefined-behavior-due-to-uninitialized-variables-in-bpf-programs/84116?u=yonghong-song [3] llvm#125601 [4] https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/ubsan/ubsan_interface.inc [5] https://lore.kernel.org/lkml/[email protected]/
Marc Suñé (Isovalent, part of Cisco) reported an issue where an uninitialized variable caused generated bpf prog binary code not working as expected. The reproducer is in [1] where the flags “-Wall -Werror” are enabled, but there is no warning and compiler may take advantage of uninit variable to do aggressive optimization. In discussion [2], various approaches are discussed, e.g., improving compiler to detect undefined behavior due to uninitialized variables, trying to use ubsan (-fsanitize=undefined), and making -ftrivial-auto-var-init=zero as the bpf default flags. I tried [3] with -ftrivial-auto-var-init=zero and eventually we decided no-go since first it may introduce performance regression and second the prog may still be wrong if the prog expects a non-zero value. The ubsan apprach seems not working as well since it involves runtime callback func ([4]). The approach here is not to do complicate compiler analysis to detect whether where is undef behavior which may impact final codegen. Rather, we relies on compiler to do its normal transformation and at later IR passes stage, a BPF backend pass is inserted to check whether undef behavior is in IR or not. Note that if undef behavior indeed impacts codes, the compiler will discard those related codes with simple 'undef' or 'unreachable'. For example, for the case [1], before SCCPPass, the IR looks like ``` define dso_local i32 @repro(ptr noundef %0) #0 section "classifier" { %2 = alloca %struct.ipv6_opt_hdr, align 8 %3 = tail call i64 (ptr, i32, ...) inttoptr (i64 6 to ptr)(ptr noundef nonnull @repro.____fmt, i32 noundef 6) llvm#2 %4 = tail call ptr asm sideeffect "$0 = *(u32 *)($1 + $2)", "=r,r,i"(ptr %0, i64 76) llvm#2, !srcloc !3 %5 = ptrtoint ptr %4 to i64 %6 = trunc i64 %5 to i32 %7 = tail call i64 (ptr, i32, ...) inttoptr (i64 6 to ptr)(ptr noundef nonnull @icmp6_ndisc_validate.____fmt, i32 noundef 23) llvm#2 call void @llvm.lifetime.start.p0(i64 2, ptr nonnull %2) llvm#2 %8 = getelementptr inbounds nuw i8, ptr %2, i64 1 switch i8 undef, label %51 [ i8 59, label %56 i8 44, label %57 i8 0, label %9 i8 43, label %9 i8 51, label %9 i8 60, label %9 ] 9: ; preds = %1, %1, %1, %1 %10 = sub i32 40, %6 ... ``` Note that 'undef' is used for switch key due to one of early pass LoopFullUnrollPass. After SCCPPass: ``` efine dso_local i32 @repro(ptr noundef %0) #0 section "classifier" { %2 = alloca %struct.ipv6_opt_hdr, align 8 %3 = tail call i64 (ptr, i32, ...) inttoptr (i64 6 to ptr)(ptr noundef nonnull @repro.____fmt, i32 noundef 6) llvm#2 %4 = tail call ptr asm sideeffect "$0 = *(u32 *)($1 + $2)", "=r,r,i"(ptr %0, i64 76) llvm#2, !srcloc !3 %5 = ptrtoint ptr %4 to i64 %6 = trunc i64 %5 to i32 %7 = tail call i64 (ptr, i32, ...) inttoptr (i64 6 to ptr)(ptr noundef nonnull @icmp6_ndisc_validate.____fmt, i32 noundef 23) llvm#2 call void @llvm.lifetime.start.p0(i64 2, ptr nonnull %2) llvm#2 %8 = getelementptr inbounds nuw i8, ptr %2, i64 1 unreachable } ``` For another example, ``` $ cat t.c int foo() { int i[2]; return i[1]; } ``` Before SROAPass pass, ``` define dso_local i32 @foo() #0 { %1 = alloca [2 x i32], align 4 call void @llvm.lifetime.start.p0(i64 8, ptr %1) llvm#2 %2 = getelementptr inbounds [2 x i32], ptr %1, i64 0, i64 1 %3 = load i32, ptr %2, align 4, !tbaa !3 call void @llvm.lifetime.end.p0(i64 8, ptr %1) llvm#2 ret i32 %3 } ``` After SROAPass pass, ``` define dso_local i32 @foo() #0 { ret i32 undef } ``` Besides the above two test cases, the following three patterns are also covered: - It is possible llvm may generate codes where a default branch to 'unreachable' location. Ignore such 'unreachable' instances. See [5] or some comments in [2]. - Handle pattern like __bpf_unreachable (defined in bpf_helpers.h). - Functions with naked attribute will have 'unreachable' at the end of function. Ignore such functions. Tested with bpf selftests and there are no warnings issued. [1] https://github.com/msune/clang_bpf/blob/main/Makefile#L3 [2] https://discourse.llvm.org/t/detect-undefined-behavior-due-to-uninitialized-variables-in-bpf-programs/84116?u=yonghong-song [3] llvm#125601 [4] https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/ubsan/ubsan_interface.inc [5] https://lore.kernel.org/lkml/[email protected]/
Marc Suñé (Isovalent, part of Cisco) reported an issue where an uninitialized variable caused generated bpf prog binary code not working as expected. The reproducer is in [1] where the flags “-Wall -Werror” are enabled, but there is no warning and compiler may take advantage of uninit variable to do aggressive optimization. In discussion [2], various approaches are discussed, e.g., improving compiler to detect undefined behavior due to uninitialized variables, trying to use ubsan (-fsanitize=undefined), and making -ftrivial-auto-var-init=zero as the bpf default flags. I tried [3] with -ftrivial-auto-var-init=zero and eventually we decided no-go since first it may introduce performance regression and second the prog may still be wrong if the prog expects a non-zero value. The ubsan apprach seems not working as well since it involves runtime callback func ([4]). The approach here is not to do complicate compiler analysis to detect whether where is undef behavior which may impact final codegen. Rather, we relies on compiler to do its normal transformation and at later IR passes stage, a BPF backend pass is inserted to check whether undef behavior is in IR or not. Note that if undef behavior indeed impacts codes, the compiler will discard those related codes with simple 'undef' or 'unreachable'. For example, for the case [1], before SCCPPass, the IR looks like ``` define dso_local i32 @repro(ptr noundef %0) #0 section "classifier" { %2 = alloca %struct.ipv6_opt_hdr, align 8 %3 = tail call i64 (ptr, i32, ...) inttoptr (i64 6 to ptr)(ptr noundef nonnull @repro.____fmt, i32 noundef 6) llvm#2 %4 = tail call ptr asm sideeffect "$0 = *(u32 *)($1 + $2)", "=r,r,i"(ptr %0, i64 76) llvm#2, !srcloc !3 %5 = ptrtoint ptr %4 to i64 %6 = trunc i64 %5 to i32 %7 = tail call i64 (ptr, i32, ...) inttoptr (i64 6 to ptr)(ptr noundef nonnull @icmp6_ndisc_validate.____fmt, i32 noundef 23) llvm#2 call void @llvm.lifetime.start.p0(i64 2, ptr nonnull %2) llvm#2 %8 = getelementptr inbounds nuw i8, ptr %2, i64 1 switch i8 undef, label %51 [ i8 59, label %56 i8 44, label %57 i8 0, label %9 i8 43, label %9 i8 51, label %9 i8 60, label %9 ] 9: ; preds = %1, %1, %1, %1 %10 = sub i32 40, %6 ... ``` Note that 'undef' is used for switch key due to one of early pass LoopFullUnrollPass. After SCCPPass: ``` efine dso_local i32 @repro(ptr noundef %0) #0 section "classifier" { %2 = alloca %struct.ipv6_opt_hdr, align 8 %3 = tail call i64 (ptr, i32, ...) inttoptr (i64 6 to ptr)(ptr noundef nonnull @repro.____fmt, i32 noundef 6) llvm#2 %4 = tail call ptr asm sideeffect "$0 = *(u32 *)($1 + $2)", "=r,r,i"(ptr %0, i64 76) llvm#2, !srcloc !3 %5 = ptrtoint ptr %4 to i64 %6 = trunc i64 %5 to i32 %7 = tail call i64 (ptr, i32, ...) inttoptr (i64 6 to ptr)(ptr noundef nonnull @icmp6_ndisc_validate.____fmt, i32 noundef 23) llvm#2 call void @llvm.lifetime.start.p0(i64 2, ptr nonnull %2) llvm#2 %8 = getelementptr inbounds nuw i8, ptr %2, i64 1 unreachable } ``` For another example, ``` $ cat t.c int foo() { int i[2]; return i[1]; } ``` Before SROAPass pass, ``` define dso_local i32 @foo() #0 { %1 = alloca [2 x i32], align 4 call void @llvm.lifetime.start.p0(i64 8, ptr %1) llvm#2 %2 = getelementptr inbounds [2 x i32], ptr %1, i64 0, i64 1 %3 = load i32, ptr %2, align 4, !tbaa !3 call void @llvm.lifetime.end.p0(i64 8, ptr %1) llvm#2 ret i32 %3 } ``` After SROAPass pass, ``` define dso_local i32 @foo() #0 { ret i32 undef } ``` Besides the above two test cases, the following three patterns are also covered: - It is possible llvm may generate codes where a default branch to 'unreachable' location. Ignore such 'unreachable' instances. See [5] or some comments in [2]. - Handle pattern like __bpf_unreachable (defined in bpf_helpers.h). - Functions with naked attribute will have 'unreachable' at the end of function. Ignore such functions. Tested with bpf selftests and there are no warnings issued. [1] https://github.com/msune/clang_bpf/blob/main/Makefile#L3 [2] https://discourse.llvm.org/t/detect-undefined-behavior-due-to-uninitialized-variables-in-bpf-programs/84116?u=yonghong-song [3] llvm#125601 [4] https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/ubsan/ubsan_interface.inc [5] https://lore.kernel.org/lkml/[email protected]/
Marc Suñé (Isovalent, part of Cisco) reported an issue where an uninitialized variable caused generated bpf prog binary code not working as expected. The reproducer is in [1] where the flags “-Wall -Werror” are enabled, but there is no warning and compiler may take advantage of uninit variable to do aggressive optimization. In discussion [2], various approaches are discussed, e.g., improving compiler to detect undefined behavior due to uninitialized variables, trying to use ubsan (-fsanitize=undefined), and making -ftrivial-auto-var-init=zero as the bpf default flags. I tried [3] with -ftrivial-auto-var-init=zero and eventually we decided no-go since first it may introduce performance regression and second the prog may still be wrong if the prog expects a non-zero value. The ubsan apprach seems not working as well since it involves runtime callback func ([4]). The approach here is not to do complicate compiler analysis to detect whether where is undef behavior which may impact final codegen. Rather, we relies on compiler to do its normal transformation and at later IR passes stage, a BPF backend pass is inserted to check whether undef behavior is in IR or not. Note that if undef behavior indeed impacts codes, the compiler will discard those related codes with simple 'undef' or 'unreachable'. For example, for the case [1], before SCCPPass, the IR looks like ``` define dso_local i32 @repro(ptr noundef %0) #0 section "classifier" { %2 = alloca %struct.ipv6_opt_hdr, align 8 %3 = tail call i64 (ptr, i32, ...) inttoptr (i64 6 to ptr)(ptr noundef nonnull @repro.____fmt, i32 noundef 6) llvm#2 %4 = tail call ptr asm sideeffect "$0 = *(u32 *)($1 + $2)", "=r,r,i"(ptr %0, i64 76) llvm#2, !srcloc !3 %5 = ptrtoint ptr %4 to i64 %6 = trunc i64 %5 to i32 %7 = tail call i64 (ptr, i32, ...) inttoptr (i64 6 to ptr)(ptr noundef nonnull @icmp6_ndisc_validate.____fmt, i32 noundef 23) llvm#2 call void @llvm.lifetime.start.p0(i64 2, ptr nonnull %2) llvm#2 %8 = getelementptr inbounds nuw i8, ptr %2, i64 1 switch i8 undef, label %51 [ i8 59, label %56 i8 44, label %57 i8 0, label %9 i8 43, label %9 i8 51, label %9 i8 60, label %9 ] 9: ; preds = %1, %1, %1, %1 %10 = sub i32 40, %6 ... ``` Note that 'undef' is used for switch key due to one of early pass LoopFullUnrollPass. After SCCPPass: ``` efine dso_local i32 @repro(ptr noundef %0) #0 section "classifier" { %2 = alloca %struct.ipv6_opt_hdr, align 8 %3 = tail call i64 (ptr, i32, ...) inttoptr (i64 6 to ptr)(ptr noundef nonnull @repro.____fmt, i32 noundef 6) llvm#2 %4 = tail call ptr asm sideeffect "$0 = *(u32 *)($1 + $2)", "=r,r,i"(ptr %0, i64 76) llvm#2, !srcloc !3 %5 = ptrtoint ptr %4 to i64 %6 = trunc i64 %5 to i32 %7 = tail call i64 (ptr, i32, ...) inttoptr (i64 6 to ptr)(ptr noundef nonnull @icmp6_ndisc_validate.____fmt, i32 noundef 23) llvm#2 call void @llvm.lifetime.start.p0(i64 2, ptr nonnull %2) llvm#2 %8 = getelementptr inbounds nuw i8, ptr %2, i64 1 unreachable } ``` For another example, ``` $ cat t.c int foo() { int i[2]; return i[1]; } ``` Before SROAPass pass, ``` define dso_local i32 @foo() #0 { %1 = alloca [2 x i32], align 4 call void @llvm.lifetime.start.p0(i64 8, ptr %1) llvm#2 %2 = getelementptr inbounds [2 x i32], ptr %1, i64 0, i64 1 %3 = load i32, ptr %2, align 4, !tbaa !3 call void @llvm.lifetime.end.p0(i64 8, ptr %1) llvm#2 ret i32 %3 } ``` After SROAPass pass, ``` define dso_local i32 @foo() #0 { ret i32 undef } ``` Besides the above two test cases, the following three patterns are also covered: - It is possible llvm may generate codes where a default branch to 'unreachable' location. Ignore such 'unreachable' instances. See [5] or some comments in [2]. - Handle pattern like __bpf_unreachable (defined in bpf_helpers.h). - Functions with naked attribute will have 'unreachable' at the end of function. Ignore such functions. Tested with bpf selftests and there are no warnings issued. [1] https://github.com/msune/clang_bpf/blob/main/Makefile#L3 [2] https://discourse.llvm.org/t/detect-undefined-behavior-due-to-uninitialized-variables-in-bpf-programs/84116?u=yonghong-song [3] llvm#125601 [4] https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/ubsan/ubsan_interface.inc [5] https://lore.kernel.org/lkml/[email protected]/
Marc Suñé (Isovalent, part of Cisco) reported an issue where an uninitialized variable caused generated bpf prog binary code not working as expected. The reproducer is in [1] where the flags “-Wall -Werror” are enabled, but there is no warning and compiler may take advantage of uninit variable to do aggressive optimization. In discussion [2], various approaches are discussed, e.g., improving compiler to detect undefined behavior due to uninitialized variables, trying to use ubsan (-fsanitize=undefined), and making -ftrivial-auto-var-init=zero as the bpf default flags. I tried [3] with -ftrivial-auto-var-init=zero and eventually we decided no-go since first it may introduce performance regression and second the prog may still be wrong if the prog expects a non-zero value. The ubsan apprach seems not working as well since it involves runtime callback func ([4]). The approach here is not to do complicate compiler analysis to detect whether where is undef behavior which may impact final codegen. Rather, we relies on compiler to do its normal transformation and at later IR passes stage, a BPF backend pass is inserted to check whether undef behavior is in IR or not. Note that if undef behavior indeed impacts codes, the compiler will discard those related codes with simple 'undef' or 'unreachable'. For example, for the case [1], before SCCPPass, the IR looks like ``` define dso_local i32 @repro(ptr noundef %0) #0 section "classifier" { %2 = alloca %struct.ipv6_opt_hdr, align 8 %3 = tail call i64 (ptr, i32, ...) inttoptr (i64 6 to ptr)(ptr noundef nonnull @repro.____fmt, i32 noundef 6) llvm#2 %4 = tail call ptr asm sideeffect "$0 = *(u32 *)($1 + $2)", "=r,r,i"(ptr %0, i64 76) llvm#2, !srcloc !3 %5 = ptrtoint ptr %4 to i64 %6 = trunc i64 %5 to i32 %7 = tail call i64 (ptr, i32, ...) inttoptr (i64 6 to ptr)(ptr noundef nonnull @icmp6_ndisc_validate.____fmt, i32 noundef 23) llvm#2 call void @llvm.lifetime.start.p0(i64 2, ptr nonnull %2) llvm#2 %8 = getelementptr inbounds nuw i8, ptr %2, i64 1 switch i8 undef, label %51 [ i8 59, label %56 i8 44, label %57 i8 0, label %9 i8 43, label %9 i8 51, label %9 i8 60, label %9 ] 9: ; preds = %1, %1, %1, %1 %10 = sub i32 40, %6 ... ``` Note that 'undef' is used for switch key due to one of early pass LoopFullUnrollPass. After SCCPPass: ``` efine dso_local i32 @repro(ptr noundef %0) #0 section "classifier" { %2 = alloca %struct.ipv6_opt_hdr, align 8 %3 = tail call i64 (ptr, i32, ...) inttoptr (i64 6 to ptr)(ptr noundef nonnull @repro.____fmt, i32 noundef 6) llvm#2 %4 = tail call ptr asm sideeffect "$0 = *(u32 *)($1 + $2)", "=r,r,i"(ptr %0, i64 76) llvm#2, !srcloc !3 %5 = ptrtoint ptr %4 to i64 %6 = trunc i64 %5 to i32 %7 = tail call i64 (ptr, i32, ...) inttoptr (i64 6 to ptr)(ptr noundef nonnull @icmp6_ndisc_validate.____fmt, i32 noundef 23) llvm#2 call void @llvm.lifetime.start.p0(i64 2, ptr nonnull %2) llvm#2 %8 = getelementptr inbounds nuw i8, ptr %2, i64 1 unreachable } ``` For another example, ``` $ cat t.c int foo() { int i[2]; return i[1]; } ``` Before SROAPass pass, ``` define dso_local i32 @foo() #0 { %1 = alloca [2 x i32], align 4 call void @llvm.lifetime.start.p0(i64 8, ptr %1) llvm#2 %2 = getelementptr inbounds [2 x i32], ptr %1, i64 0, i64 1 %3 = load i32, ptr %2, align 4, !tbaa !3 call void @llvm.lifetime.end.p0(i64 8, ptr %1) llvm#2 ret i32 %3 } ``` After SROAPass pass, ``` define dso_local i32 @foo() #0 { ret i32 undef } ``` Besides the above two test cases, the following three patterns are also covered: - It is possible llvm may generate codes where a default branch to 'unreachable' location. Ignore such 'unreachable' instances. See [5] or some comments in [2]. - Handle pattern like __bpf_unreachable (defined in bpf_helpers.h). - Functions with naked attribute will have 'unreachable' at the end of function. Ignore such functions. Tested with bpf selftests and there are no warnings issued. [1] https://github.com/msune/clang_bpf/blob/main/Makefile#L3 [2] https://discourse.llvm.org/t/detect-undefined-behavior-due-to-uninitialized-variables-in-bpf-programs/84116?u=yonghong-song [3] llvm#125601 [4] https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/ubsan/ubsan_interface.inc [5] https://lore.kernel.org/lkml/[email protected]/
Marc Suñé (Isovalent, part of Cisco) reported an issue where an uninitialized variable caused generated bpf prog binary code not working as expected. The reproducer is in [1] where the flags “-Wall -Werror” are enabled, but there is no warning and compiler may take advantage of uninit variable to do aggressive optimization.
In the BPF context, the generated buggy code will be load into the kernel and likely verification will pass. When the bpf program actually runs, its result may not be expected and user will then need to check source or asm code to find reasons. Since compiler may take advantage of uninit variables for aggressive optimization, users may be confused with final code without realizing this is due to uninit variables.
The ideal case is for compiler to issue meaning info about actual uninit variable which impacts transformation. Currently clang (and gcc) may miss such uninit variables, esp. with cross function boundaries. See discuss in [2].
The llvm/gcc undef sanitizer is another source to detect such undef issues. For bpf target, this means potential complexity for verifier and bpf runtime.
Another idea is to make -ftrivial-auto-var-init=0 as the default for bpf target. This way, when bpf prog produced unexpected results, users can check/understand asm codes easier since compiler won't be able to do aggressive optimization due to uninit variables.
This patch makes -ftrivial-auto-var-init=0 as the default for bpf target. For example, for code,
With -ftrivial-auto-var-init=0 on by default, two bpf selftests failed and below the fix:
Not sure whether we should introduce a bpf specific flag to disable -ftrivial-auto-var-init=zero or not. I guess probably not. But the kernel change still needed.
[1] https://github.com/msune/clang_bpf/blob/main/Makefile#L3
[2] https://discourse.llvm.org/t/detect-undefined-behavior-due-to-uninitialized-variables-in-bpf-programs/84116?u=yonghong-song