-
Notifications
You must be signed in to change notification settings - Fork 14.9k
[BPF] Support for DW_TAG_variant_part
in BTF generation
#155783
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This behavior can be reproduced only with Rust. However, this trick cleans up the noisy IR produced by Rust from the panic handler and all unnecessary core types. I hope it's good enough for the test.
✅ With the latest revision this PR passed the undef deprecator. |
d634545
to
8fad84a
Compare
This |
8fad84a
to
f295fba
Compare
It goes away after removing the single-value variant: diff --git a/data-carrying-ebpf/src/main.rs b/data-carrying-ebpf/src/main.rs
index d472859..a4f5f33 100644
--- a/data-carrying-ebpf/src/main.rs
+++ b/data-carrying-ebpf/src/main.rs
@@ -4,15 +4,12 @@
pub enum DataCarryingEnum {
First { a: u32, b: i32 },
Second(u32, i32),
- Third(u32),
}
#[unsafe(no_mangle)]
pub static X: DataCarryingEnum = DataCarryingEnum::First { a: 54, b: -23 };
#[unsafe(no_mangle)]
pub static Y: DataCarryingEnum = DataCarryingEnum::Second(54, -23);
-#[unsafe(no_mangle)]
-pub static Z: DataCarryingEnum = DataCarryingEnum::Third(36);
#[cfg(not(test))]
#[panic_handler] Definitely something to report and fix in Rust. For now, to unblock this fix, I'm sticking to the first two variants. |
f295fba
to
1de07f4
Compare
Let's change the name to: |
1de07f4
to
7a09868
Compare
✅ With the latest revision this PR passed the C/C++ code formatter. |
7a09868
to
413a98a
Compare
DW_TAG_variant_part
in BTF generation
OK, it really looks like I need to fix the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good except for the note about HasBitField from Yonghong. Do you plan to wrap this up?
I do, but I'm blocked by Rust which is emitting |
But we know how rust represents the enum, you don't need rust to wrap up the test case. E.g. the IR below is all you need. It can even be cleaned up a bit.
|
Oh, so in your case rustc didn't emit vadorovsky@413a98a#diff-f411e8cbe94fed63916ca9eebac7652eddbb2db4770934eb94c26632216cbc58R42 The difference in our codes that I added one more constant for the other variant and used I'm AFK ATM. Tomorrow I will try to produce a similar IR without (The issue with Rust emitting |
It's a backend test. When I write such tests I start from something generated by frontend but cut away any unrelated stuff. In this particular case the only relevant thing is the structure of debug information generated for variant/variant_part. You can drop initialization from those globals and it would still be fine for the backend test.
rustc version 1.88.0 (6b00bc388 2025-06-23) (Red Hat 1.88.0-1.el9) |
413a98a
to
a22acb8
Compare
@eddyz87 After playing a bit with your and my code, I realized that Rust emits All comments should be addressed now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this change is fine, but just realized an unfortunate quirk regarding discriminator handling. Consider the following example:
pub enum Adt {
First { a: u32, b: i32 },
Second(u32, i32),
}
pub static X: Adt = Adt::First{a:0, b:0};
With corresponding IR:
!7 = !DICompositeType(tag: DW_TAG_variant_part, scope: !4, file: !5, size: 96, align: 32, elements: !8, ..., discriminator: !22)
!8 = !{!9, !17}
!9 = !DIDerivedType(tag: DW_TAG_member, name: "First", scope: !7, file: !5, baseType: !10, size: 96, align: 32, extraData: i32 0)
!10 = !DICompositeType(tag: DW_TAG_structure_type, name: "First", ..., elements: !11, ...)
!11 = !{!12, !14}
!12 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !10, file: !5, baseType: !13, size: 32, align: 32, offset: 32, flags: DIFlagPublic)
...
!22 = !DIDerivedType(tag: DW_TAG_member, scope: !4, file: !5, baseType: !13, size: 32, align: 32, flags: DIFlagArtificial)
And corresponding DWARF:
0x0000003d: DW_TAG_structure_type
DW_AT_name ("Adt")
DW_AT_byte_size (0x0c)
DW_AT_accessibility (DW_ACCESS_public)
DW_AT_alignment (4)
0x00000045: DW_TAG_variant_part
DW_AT_discr (0x0000004a)
0x0000004a: DW_TAG_member
DW_AT_type (0x000000b2 "u32")
DW_AT_alignment (4)
DW_AT_data_member_location (0x00)
DW_AT_artificial (true)
0x00000051: DW_TAG_variant
DW_AT_discr_value (0x00)
0x00000053: DW_TAG_member
DW_AT_name ("First")
DW_AT_type (0x0000006e "test_debug_info::Adt::First")
DW_AT_alignment (4)
DW_AT_data_member_location (0x00)
0x0000005e: NULL
0x0000005f: DW_TAG_variant
DW_AT_discr_value (0x01)
0x00000061: DW_TAG_member
DW_AT_name ("Second")
DW_AT_type (0x0000008f "test_debug_info::Adt::Second")
DW_AT_alignment (4)
DW_AT_data_member_location (0x00)
0x0000006c: NULL
0x0000006d: NULL
0x0000006e: DW_TAG_structure_type
DW_AT_name ("First")
DW_AT_byte_size (0x0c)
DW_AT_accessibility (DW_ACCESS_public)
DW_AT_alignment (4)
0x00000076: DW_TAG_member
DW_AT_name ("a")
DW_AT_type (0x000000b2 "u32")
DW_AT_alignment (4)
DW_AT_data_member_location (0x04)
DW_AT_accessibility (DW_ACCESS_public)
0x00000082: DW_TAG_member
DW_AT_name ("b")
DW_AT_type (0x000000b9 "i32")
DW_AT_alignment (4)
DW_AT_data_member_location (0x08)
DW_AT_accessibility (DW_ACCESS_public)
0x0000008e: NULL
Note how offsets for a
and b
are shifted by 4 bytes and variant part itself has three members, beside First and Second there is also an anonymous member 0x0000004a
, representing a descriminator at offset 0.
Meaning that in BTF union representing the variant part has to have three members, not two. Which is a bit inconvenient, as it is not a part of "elements", but instead a separate "discriminator" reference.
Wdyt?
Good catch!
Given that:
Wouldn't it be more correct to represent the variant part as a struct with two members - discriminant and union (which then contains the variants as elements)? Another option would be extending BTF to actually represent the variant part in a similar way to either LLVM DI or DWARF, showing the discriminant and variants explicitly. |
This makes sense, but will require adjusting member offsets, compared to what LLVM describes in DI. I'd avoid such complication at the moment, but you can give it a try if you want. Note that it will have to be done in a generic way, meaning that we can't hard code that discriminant is always 4 bytes and at offset 0, this info would have to be extracted from DI. Also note, that similar reconstruction will heed to happen in pahole, eventually, when BTF generated for rust kernel code would become important. Pahole generates BTF from DWARF.
Given that there are alternative options: a union with additional member for discriminator, or a struct with discriminator and a union, I don't think kernel upstream would be happy to extend BTF. |
For BPF, I think we can do @eddyz87 suggested below: For non-BTF, we probably cannot do much in llvm and pahole needs to do above conversion. Is it possible for rust frontend to generate easier debuginfo which can be easily mapped to BTF? |
Alright, I will go forward with the union and offsets then. Hopefully I can get it working by tomorrow.
Yes, and I'm willing to try implementing that myself in pahole, once this PR is merged, unless there is someone else already planning such work. |
a22acb8
to
aed0232
Compare
@eddyz87 @yonghong-song I got it working and went with the additional member solution. |
aed0232
to
be94a5f
Compare
llvm/lib/Target/BPF/BTFDebug.cpp
Outdated
struct BTF::BTFMember Discriminator; | ||
const auto *DDTy = STy->getDiscriminator(); | ||
|
||
InitialOffset += DDTy->getOffsetInBits() + DDTy->getSizeInBits(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is no chance for getOffsetInBits()
being anything else than 0 here, but I'm still including it to make the implementation generic instead of just assuming things. But let me know if you have other thoughts.
be94a5f
to
c3d8eb4
Compare
llvm/lib/Target/BPF/BTFDebug.cpp
Outdated
struct BTF::BTFMember Discriminator; | ||
const auto *DDTy = STy->getDiscriminator(); | ||
|
||
InitialOffset += DDTy->getOffsetInBits() + DDTy->getSizeInBits(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This initial offset computation is wrong. Consider the example below:
$ rustc --emit=llvm-ir -C debuginfo=full --crate-type=lib -o - test-debug-info.rs | llc --mtriple=bpfel -filetype=obj -o test.o && bpftool btf dump file test.o
[1] STRUCT 'Adt' size=12 vlen=1
'(anon)' type_id=2 bits_offset=0
[2] UNION '(anon)' size=12 vlen=3
'(anon)' type_id=4 bits_offset=0
'First' type_id=3 bits_offset=32 // <----------- offset 32 <------ cumulative offset 64
'Second' type_id=6 bits_offset=32 /
[3] STRUCT 'First' size=12 vlen=2 /
'a' type_id=4 bits_offset=32 // <-------------- offset 32 <----
'b' type_id=5 bits_offset=64
[...]
{llvm} [~/tmp]
$ rustc --emit=llvm-ir -C debuginfo=full --crate-type=lib -o - test-debug-info.rs | llc --mtriple=bpfel -filetype=obj -o - | llvm-dwarfdump -
[...]
0x0000003d: DW_TAG_structure_type
DW_AT_name ("Adt")
DW_AT_byte_size (0x0c)
DW_AT_accessibility (DW_ACCESS_public)
DW_AT_alignment (4)
DW_AT_accessibility (DW_ACCESS_public) DW_AT_alignment (4)
0x00000045: DW_TAG_variant_part
DW_AT_discr (0x0000004a)
0x0000004a: DW_TAG_member
DW_AT_type (0x000000b2 "u32")
DW_AT_alignment (4)
DW_AT_data_member_location (0x00)
DW_AT_artificial (true)
0x00000051: DW_TAG_variant
DW_AT_discr_value (0x00)
0x00000053: DW_TAG_member
DW_AT_name ("First")
DW_AT_type (0x0000006e "test_debug_info::Adt::First")
DW_AT_alignment (4)
DW_AT_data_member_location (0x00) <---------------------------- offset 0
0x0000005e: NULL
[...]
0x0000006e: DW_TAG_structure_type
DW_AT_name ("First")
DW_AT_byte_size (0x0c)
DW_AT_accessibility (DW_ACCESS_public)
DW_AT_alignment (4)
0x00000076: DW_TAG_member
DW_AT_name ("a")
DW_AT_type (0x000000b2 "u32")
DW_AT_alignment (4)
DW_AT_data_member_location (0x04) <--------------------------- offset 32
DW_AT_accessibility (DW_ACCESS_public)
[...]
Note how DWARF and BTF encode different offsets for field a
.
To correctly handle initial offset, it has to be propagated into the types, referenced in variant part.
But this is complicated, error-prone, and tightly coupled with specifics of DWARF, produced by rust frontend (not guaranteed to be stable).
Let's just drop the initial offset computation and keep discriminant as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My worry is that we don't do the offset computation correctly, the BTF relocations done on variant parts won't work. Hypocritical of me given that I didn't get it right here (I'm very sorry, I'm still new to LLVM and compilers), but I really want to do it right. I'm planning to add BTF relocations for Rust BPF programs eventually (we want to emit them it in bpf-linker, given that we traverse patch bc/IR there) and it would be bad if I have to add yet another LLVM patch to fix it post-factum.
Regarding stability - I think the variant parts annotated with #[repr(C)]
are guaranteed to be stable. #[repr(Rust)]
could change the layout any time, but I think it's fair to require the usage of #[repr(C)]
in BPF programs. We could sort of enforce that by adding a custom Clippy lint and push Aya users to adhere to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding stability - I think the variant parts annotated with #[repr(C)] are guaranteed to be stable. #[repr(Rust)] could change the layout any time, but I think it's fair to require the usage of #[repr(C)] in BPF programs. We could sort of enforce that by adding a custom Clippy lint and push Aya users to adhere to it.
I'm not worried about binary stability, but about generated DWARF stability. I don't think rustc gives any guarantees regarding DWARF representation of enums. Specifically, for the example at hand DWARF describes an equivalent of C structure below:
// repr A
union Adt {
u32 discriminant;
struct First First;
struct Second Second;
};
struct First {
u32 :32;
u32 a;
u32 b;
};
...
But an alternative representation would be correct as well:
// repr B
struct Adt {
u32 discriminant;
union {
struct First First;
struct Second Second;
}
};
struct First {
u32 a;
u32 b;
};
...
What you are trying to do with InitialOffset
is to convert repr A into repr B, which is recursive operation closely tied to the specifics of DWARF generation by rustc. Should rustc decide to use different DWARF representation the logic will need adjustment.
My worry is that we don't do the offset computation correctly, the BTF relocations done on variant parts won't work.
DWARF produced by rustc already contains correct offsets for data structure fields, what is it exactly that you suspect won't work?
Hypocritical of me given that I didn't get it right here (I'm very sorry, I'm still new to LLVM and compilers), but I really want to do it right.
Nothing to be sorry about, compilers are complicated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the late response, I was on PTO.
DWARF produced by rustc already contains correct offsets for data structure fields, what is it exactly that you suspect won't work?
I suspect BTF relocations to break if we don't match the offsets with the actual memory layout. The relocations of types done by BPF program loaders (libbpf or aya) have nothing to do with DWARF. They are done purely based on two BTF informations - the one tied to program and the one coming from the kernel.
Let's assume that the program comes with the following type:
pub enum Adt {
First { a: u32, b: i32 },
Second(u32, i32),
}
And accesses the b
field. There is no CO-RE macro in Rust yet, but let's imagine there is one:
fn prog(adt: Adt) {
match adt {
Adt::First(first) => {
// emits the @llvm.preserve.struct.access.index intrinsic call
// that emits a BTF relocation to field `b`
let b = bpf_core_read!(first, b);
// do something with `b`
}
_ => {},
}
}
We load such program to the kernel with some fresher definition of Adt
which moved the field b
:
pub enum Adt {
First { a: u32, new_field: u32, b: i32 },
Second(u32, i32),
}
What libbpf (or aya) will do when loading such program is comparing two BTFs and fixing up the offset of b
based on the kernel definition's BTF.
Therefore, my worry is that if we don't sort the offsets properly, relocation of the field b
won't work and program loaders with relocate to a wrong location.
Getting to the point how to solve it - in this thread #155783 (comment) @yonghong-song agrees with the idea of representing the variant part as a struct with a discriminator and an union. I gather that you are skeptical towards this solution, because of lack of Rust's DWARF stability guarantees. But I think it would solve the problem without having to fix up any offsets.
Given the following C code:
struct first {
unsigned int a;
int b;
};
struct second {
unsigned int a;
};
struct adt {
unsigned int discriminator;
union {
struct first first;
struct second second;
};
};
The BTF is:
#0: <VOID>
#1: <STRUCT> 'adt' sz:12 n:2
#00 'discriminator' off:0 --> [2]
#01 '<anon>' off:32 --> [3]
#2: <INT> 'unsigned int' bits:32 off:0
#3: <UNION> '<anon>' sz:8 n:2
#00 'first' off:0 --> [4]
#01 'second' off:0 --> [6]
#4: <STRUCT> 'first' sz:8 n:2
#00 'a' off:0 --> [2] // cumulative offset: 32
#01 'b' off:32 --> [5] // cumulative offset: 64
#5: <INT> 'int' bits:32 off:0 enc:signed
#6: <STRUCT> 'second' sz:4 n:1
#00 'a' off:0 --> [2] // cumulative offset: 32
#7: <VAR> 'x' kind:global-alloc --> [1]
#8: <DATASEC> '.data' sz:0 n:1
#00 off:0 sz:12 --> [7]
And that represents the memory layout perfectly, without adding any offsets to the struct with variants. Accessing the field b
works with cumulative offset.
I also think this solution is pretty universal and not specific only to Rust. I can check how other languages/frontends treat variant parts, but given that DWARF/LLVM DI has a separate reference for discriminator
, I would expect all languages to make use of it and to use the elements
only for variants that share the location.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect BTF relocations to break if we don't match the offsets with the actual memory layout. The relocations of types done by BPF program loaders (libbpf or aya) have nothing to do with DWARF. They are done purely based on two BTF informations - the one tied to program and the one coming from the kernel.
Direct translation of current DWARF representation gives same correct offsets:
union Adt {
u32 discriminant;
struct First First;
struct Second Second;
};
struct First {
u32 :32;
u32 a;
u32 b;
};
...
Given a pointer to First
a relocation accessing field First::b
will be a string 0:2
. In corresponding BTF this relocation will point to 3rd field of structure First
, namely b
, and use its offset, namely 8 bytes.
Getting to the point how to solve it - in this thread #155783 (comment) @yonghong-song agrees with the idea of representing the variant part as a struct with a discriminator and an union.
To do that you have to:
- lookup offset and size of the discriminator field;
- recursively traverse types used for variant parts, looking for a field that ends up at the discriminator offset and has discriminator size (nothing guarantees that discriminator field would be first member of the variant part type, or that it would be at nesting level 1);
- delete this field (if it exists)
- adjust offsets of all fields following field found on a previous step.
Note that types referenced from variant part might be used elsewhere, so when applying transformation you need to either guarantee consistent location for the discriminator field, or clone the type.
This procedure is both complicated and error prone.
Hence, I think that BTF backend should be dumb, and encode DWARF as-is. If you'd like ADTs to look better in DWARF, then rustc frontend should be adjusted to produce DWARF of different shape.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, that all makes sense! I didn't catch the fact that the fields of variant part structs (First
, Second
) already apply a correct offset to all fields. I removed the offset patching.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the reference, the BTF after my last push looks like:
[1] STRUCT 'MyEnum' size=12 vlen=1
'(anon)' type_id=3 bits_offset=0
[2] INT 'u32' size=4 bits_offset=0 nr_bits=32 encoding=(none)
[3] UNION '(anon)' size=12 vlen=3 <--- union
'(anon)' type_id=2 bits_offset=0 <-- discriminator
'First' type_id=4 bits_offset=0 <--- variants
'Second' type_id=6 bits_offset=0 <-/
[4] STRUCT 'First' size=12 vlen=2
'a' type_id=2 bits_offset=32 <--- offset from the discriminator
'b' type_id=5 bits_offset=64
[5] INT 'i32' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
[6] STRUCT 'Second' size=12 vlen=1
'__0' type_id=2 bits_offset=32 <--- offset from the discriminator
c3d8eb4
to
4672428
Compare
!llvm.dbg.cu = !{!27} | ||
|
||
; CHECK-BTF: [1] STRUCT 'MyEnum' size=12 vlen=1 | ||
; CHECK-BTF-NEXT: '(anon)' type_id=3 bits_offset=0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure whether this is an issue or not. Here, we have '(anon)' field name which essentially is a 'empty' name. If people goes to access 'First' struct from Struct 'MyEnum', the C access type will be like MyEnum.(anon).First
,
could this be a problem or there is rust way to handle this?
Also without 'name' for discriminator field, not sure how it will be used by BTF/kernel etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This anon is the variant union member - Rust enum
s are represented as
struct MyEnum {
union { // <--- both the member and the type are anonymous
u32, // <-- anonymous discriminator
Variant1,
...
VariantN,
}
}
In Rust you never access the discriminator nor the union directly. Instead you do something like:
fn use_myEnum(value: myEnum) {
match value {
MyEnum::First { a, b } => println!("first! a={a}, b={b}"),
MyEnum::Second(value) => println!("second! value={value}")
}
}
the compiler checks that your match
is exhaustive - meaning it covers all variants of MyEnum
; otherwise compilation fails.
Also without 'name' for discriminator field, not sure how it will be used by BTF/kernel etc.
It's probably not usable without a name, but I think we shouldn't invent a name for it here. The discriminator's (anon)
comes from Discriminator.NameOff = BDebug.addString(DDTy->getName());
on line 318 (I think) -- so the "solution" would be to change rustc to give it a name, which will automatically show up here.
Having said that, I think the name would be mostly useful for relocations, and we're a long way off from being able to apply BTF relocations to BPF generated from Rust.
The point of this PR is to translate the Rust DI to BTF as faithfully as possible, and that's how we get these anons.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even when it comes to applying BTF relocations in Rust, given that discriminators are in practically always first in the whole memory layout of the enum/union, we won't need to relocate the discriminators. I think focusing on relocating variants would be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discriminator is not really used by btf, and the anonymous 'union' struct does not need a name. Okay make sense with no name for anywhere discriminator is placed. Thanks for explanation.
!llvm.dbg.cu = !{!27} | ||
|
||
; CHECK-BTF: [1] STRUCT 'MyEnum' size=12 vlen=1 | ||
; CHECK-BTF-NEXT: '(anon)' type_id=3 bits_offset=0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This anon is the variant union member - Rust enum
s are represented as
struct MyEnum {
union { // <--- both the member and the type are anonymous
u32, // <-- anonymous discriminator
Variant1,
...
VariantN,
}
}
In Rust you never access the discriminator nor the union directly. Instead you do something like:
fn use_myEnum(value: myEnum) {
match value {
MyEnum::First { a, b } => println!("first! a={a}, b={b}"),
MyEnum::Second(value) => println!("second! value={value}")
}
}
the compiler checks that your match
is exhaustive - meaning it covers all variants of MyEnum
; otherwise compilation fails.
Also without 'name' for discriminator field, not sure how it will be used by BTF/kernel etc.
It's probably not usable without a name, but I think we shouldn't invent a name for it here. The discriminator's (anon)
comes from Discriminator.NameOff = BDebug.addString(DDTy->getName());
on line 318 (I think) -- so the "solution" would be to change rustc to give it a name, which will automatically show up here.
Having said that, I think the name would be mostly useful for relocations, and we're a long way off from being able to apply BTF relocations to BPF generated from Rust.
The point of this PR is to translate the Rust DI to BTF as faithfully as possible, and that's how we get these anons.
For the reference: I checked BTF generation for kernel BPF selftests and there are (unsurprisingly) no differences with and w/o this patch. Should be ready to land once the remaining nits are addressed. |
8c1c38c
to
457009b
Compare
Variant part, represented by `DW_TAG_variant_part` is a structure with a discriminant and different variants, from which only one can be active and valid at the same time. The discriminant is the main difference between variant parts and unions represented by `DW_TAG_union` type. Variant parts are used by Rust enums, which look like: ```rust pub enum MyEnum { First { a: u32, b: i32 }, Second(u32), } ``` This type's debug info is the following `DICompositeType` with `DW_TAG_structure_type` tag: ```llvm !4 = !DICompositeType(tag: DW_TAG_structure_type, name: "MyEnum", scope: !2, file: !5, size: 96, align: 32, flags: DIFlagPublic, elements: !6, templateParams: !16, identifier: "faba668fd9f71e9b7cf3b9ac5e8b93cb") ``` With one element being also a `DICompositeType`, but with `DW_TAG_variant_part` tag: ```llvm !6 = !{!7} !7 = !DICompositeType(tag: DW_TAG_variant_part, scope: !4, file: !5, size: 96, align: 32, elements: !8, templateParams: !16, identifier: "e4aee046fc86d111657622fdcb8c42f7", discriminator: !21) ``` Which has a discriminator: ```llvm !21 = !DIDerivedType(tag: DW_TAG_member, scope: !4, file: !5, baseType: !13, size: 32, align: 32, flags: DIFlagArtificial) ``` Which then holds different variants as `DIDerivedType` elements with `DW_TAG_member` tag: ```llvm !8 = !{!9, !17} !9 = !DIDerivedType(tag: DW_TAG_member, name: "First", scope: !7, file: !5, baseType: !10, size: 96, align: 32, extraData: i32 0) !10 = !DICompositeType(tag: DW_TAG_structure_type, name: "First", scope: !4, file: !5, size: 96, align: 32, flags: DIFlagPublic, elements: !11, templateParams: !16, identifier: "cc7748c842e275452db4205b190c8ff7") !11 = !{!12, !14} !12 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !10, file: !5, baseType: !13, size: 32, align: 32, offset: 32, flags: DIFlagPublic) !13 = !DIBasicType(name: "u32", size: 32, encoding: DW_ATE_unsigned) !14 = !DIDerivedType(tag: DW_TAG_member, name: "b", scope: !10, file: !5, baseType: !15, size: 32, align: 32, offset: 64, flags: DIFlagPublic) !15 = !DIBasicType(name: "i32", size: 32, encoding: DW_ATE_signed) !16 = !{} !17 = !DIDerivedType(tag: DW_TAG_member, name: "Second", scope: !7, file: !5, baseType: !18, size: 96, align: 32, extraData: i32 1) !18 = !DICompositeType(tag: DW_TAG_structure_type, name: "Second", scope: !4, file: !5, size: 96, align: 32, flags: DIFlagPublic, elements: !19, templateParams: !16, identifier: "a2094b1381f3082d504fbd0903aa7c06") !19 = !{!20} !20 = !DIDerivedType(tag: DW_TAG_member, name: "__0", scope: !18, file: !5, baseType: !13, size: 32, align: 32, offset: 32, flags: DIFlagPublic) ``` BPF backend was assuming that all the elements of any `DICompositeType` have tag `DW_TAG_member` and are instances of `DIDerivedType`. However, the single element of the outer composite type `!4` has tag `DW_TAG_variant_part` and is an instance of `DICompositeType`. The unconditional call of `cast<DIDerivedType>` on all elements was causing an assertion failure when any Rust code with enums was compiled to the BPF target. Fix that by: * Handling `DW_TAG_variant_part` in `visitStructType`. * Replacing unconditional call of `cast<DIDerivedType>` over `DICompositeType` elements with a `switch` statement, handling both `DW_TAG_member` and `DW_TAG_variant_part` and casting the element to an appropriate type (`DIDerivedType` or `DICompositeType`). To keep BTF simple and make BTF relocations correct, represent the discriminator as the first element and apply an offset to all elements. Fixes: llvm#155778
533d9a3
to
7f80672
Compare
Variant part, represented by
DW_TAG_variant_part
is a structure with a discriminant and different variants, from which only one can be active and valid at the same time. The discriminant is the main difference between variant parts and unions represented byDW_TAG_union
type.Variant parts are used by Rust enums, which look like:
This type's debug info is the following
DICompositeType
withDW_TAG_structure_type
tag:With one element being also a
DICompositeType
, but withDW_TAG_variant_part
tag:Which has a discriminator:
Which then holds different variants as
DIDerivedType
elements withDW_TAG_member
tag:BPF backend was assuming that all the elements of any
DICompositeType
have tagDW_TAG_member
and are instances ofDIDerivedType
. However, the single element of the outer composite type!4
has tagDW_TAG_variant_part
and is an instance ofDICompositeType
. The unconditional call ofcast<DIDerivedType>
on all elements was causing an assertion failure when any Rust code with enums was compiled to the BPF target.Fix that by:
DW_TAG_variant_part
invisitStructType
.cast<DIDerivedType>
overDICompositeType
elements with aswitch
statement, handling bothDW_TAG_member
andDW_TAG_variant_part
and casting the element to an appropriate type (DIDerivedType
orDICompositeType
).Fixes: #155778