-
Notifications
You must be signed in to change notification settings - Fork 1.6k
[RFC] Allow packed types to transitively contain aligned types #3718
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from 3 commits
8518173
3d44198
e77c6be
af3cc30
175d29f
a53aab1
5190399
a12370d
2ed92ef
ee3cd02
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,176 @@ | ||
| - Feature Name: `layout_packed_aligned` | ||
| - Start Date: 2024-10-24 | ||
| - RFC PR: [rust-lang/rfcs#3718](https://github.com/rust-lang/rfcs/pull/3718) | ||
| - Rust Issue: [rust-lang/rust#100743](https://github.com/rust-lang/rust/issues/100743) | ||
|
|
||
| # Summary | ||
| [summary]: #summary | ||
|
|
||
| This RFC makes it legal to have `#[repr(C)]` structs that are: | ||
| - Both packed and aligned. | ||
| - Packed, and transitively contains`#[repr(align)]` types. | ||
|
|
||
| It also introduces `#[repr(system)]` which is designed for interoperability with operating system APIs. | ||
| It has the same behavior as `#[repr(C)]` except on `*-pc-windows-gnu` targets where it uses the msvc layout | ||
| rules instead. | ||
|
|
||
| # Motivation | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we explicitly document that a goal of this RFC is to make sure that existing uses of
Instead, deprecating |
||
| [motivation]: #motivation | ||
|
|
||
| This RFC enables the following struct definitions: | ||
|
|
||
| ```rs | ||
| #[repr(C, packed(2), align(4))] | ||
| struct Foo { // Alignment = 4, Size = 8 | ||
| a: u8, // Offset = 0 | ||
| b: u32, // Offset = 2 | ||
| } | ||
| ``` | ||
|
|
||
| This is commonly needed when Rust is being used to interop with existing C and C++ code bases, which may contain | ||
| unaligned types. For example in `clang` it is possible to create the following type definition, and there is | ||
| currently no easy way to create a matching Rust type: | ||
|
|
||
| ```cpp | ||
| struct __attribute__((packed, aligned(4))) MyStruct { | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It is somewhat confusing that the Rust example uses |
||
| uint8_t a; | ||
| uint32_t b; | ||
| }; | ||
PixelDust22 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| ``` | ||
|
|
||
| Currently `#[repr(packed(_))]` structs cannot transitively contain #[repr(align(_))] structs due to differing behavior between msvc and gcc/clang. | ||
PixelDust22 marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
| However, in most cases, the user would expect `#[repr(C)]` to produce a struct layout matching the same type as defined by the current target. | ||
|
|
||
| # Guide-level explanation | ||
| [guide-level-explanation]: #guide-level-explanation | ||
|
|
||
| ## `#[repr(C)]` | ||
| When `align` and `packed` attributes exist on the same type, or when `packed` structs transitively contains `align` types, | ||
CAD97 marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
| the resulting layout matches the target toolchain ABI. | ||
|
||
|
|
||
| For example, given: | ||
| ```c | ||
| #[repr(C, align(4))] | ||
| struct Foo(u8); | ||
| #[repr(C, packed(1))] | ||
| struct Bar(Foo); | ||
| ``` | ||
| `align_of::<Bar>()` would be 4 for `*-pc-windows-msvc` and 1 for everything else. | ||
|
||
|
|
||
|
|
||
| ## `#[repr(system)]` | ||
| When `align` and `packed` attributes exist on the same type, or when `packed` structs transitively contains `align` types, | ||
| the resulting layout matches the target OS ABI. | ||
|
|
||
| For example, given: | ||
| ```c | ||
| #[repr(C, align(4))] | ||
| struct Foo(u8); | ||
| #[repr(C, packed(1))] | ||
| struct Bar(Foo); | ||
| ``` | ||
| `align_of::<Bar>()` would be 4 for `*-pc-windows-msvc` and `*-pc-windows-gnu`. It would be 1 for everything else. | ||
PixelDust22 marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
|
|
||
|
|
||
|
|
||
| # Reference-level explanation | ||
| [reference-level-explanation]: #reference-level-explanation | ||
|
|
||
| In the following paragraphs, "Decreasing M to N" means: | ||
| ``` | ||
| if M > N { | ||
| M = n | ||
| } | ||
| ``` | ||
|
|
||
| "Increasing M to N" means: | ||
| ``` | ||
| if M < N { | ||
| M = N | ||
| } | ||
| ``` | ||
|
|
||
|
|
||
| `#[repr(align(N))]` increases the base alignment of a type to be N. | ||
|
|
||
| `#[repr(packed(M))]` decreases the alignment of the struct fields to be M. Because the base alignment of the type | ||
| is defined as the maximum of the alignment for any fields, this also has the indirect result of decreasing the base | ||
| alignment of the type to be M. | ||
|
|
||
| When the align and packed modifiers are applied on the same type as `#[repr(align(N), packed(M))]`, | ||
| the alignment of the struct fields are decreased to be M. Then, the base alignment of the type is | ||
| increased to be N. | ||
|
|
||
| When a `#[repr(packed(M))]` struct transitively contains a field with `#[repr(align(N))]` type, | ||
PixelDust22 marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
| - The field is first `pad_to_align`. Then, the field is added to the struct with alignment decreased to M. The packing requirement overrides the alignment requirement. (GCC, `#[repr(Rust)]`, `#[repr(C)]` on gnu targets, `#[repr(system)]` on non-windows targets) | ||
|
||
| - The field is added to the struct with alignment increased to N. The alignment requirement overrides the packing requirement. (MSVC, `#[repr(C)]` on msvc targets, `#[repr(system)]` on windows targets) | ||
|
||
|
|
||
| # Drawbacks | ||
| [drawbacks]: #drawbacks | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Due to this, this RFC will actually change the layout of some types that are currently accepted on stable, on MSVC targets. That should be discussed as a drawback. |
||
|
|
||
| Historically the meaning of `#[repr(C)]` has been somewhat ambiguous. When someone puts `#[repr(C)]` on their struct, their intention could be one of three things: | ||
|
||
| 1. Having a target-independent and stable representation of the data structure for storage or transmission. | ||
| 2. FFI with C and C++ libraries compiled for the same target. | ||
| 3. Interoperability with operating system APIs. | ||
|
|
||
| Today, `#[repr(C)]` is being used for all 3 scenarios because the user cannot create a `#[repr(C)]` struct with ambiguous layout between targets. However, this also means | ||
| that there exists some C layouts that cannot be specified using `#[repr(C)]`. | ||
|
|
||
| This RFC addresses use case 2 with `#[repr(C)]` and use case 3 with `#[repr(system)]`. For use case 1, people will have to seek alternative solutions such as `crABI` or | ||
| protobuf. However, it could be a footgun if people continue to use `#[repr(C)]` for use case 1. | ||
|
|
||
|
|
||
|
|
||
| # Rationale and alternatives | ||
| [rationale-and-alternatives]: #rationale-and-alternatives | ||
|
|
||
| This RFC clarifies that: | ||
| - `repr(C)` must interoperate with the C compiler for the target. | ||
| - `repr(system)` must interoperate with the operating system APIs for the target. | ||
| - Similiar to Clang, `repr(C)` does not guarantee consistent layout between targets. | ||
|
|
||
| Alternatively, we can also create syntax that allows the user to specify exactly which semantic to use when packed structs transitively contains aligned fields. | ||
| For example, a new attribute: #[repr(align_override_packed(N))] that can be used when the behavior of the child overriding the parent alignment is desired. | ||
|
|
||
| #[repr(align(N))] #[repr(packed)] can be used together to get the opposite behavior, parent/outer alignment wins. | ||
|
|
||
| Explicitly specifying the pack/align semantic has the drawback of complicating FFI. For example, you might need two different definition files depending on the target. | ||
|
|
||
| Therefore, a stable layout across compilation target should be relegated as future work. | ||
|
|
||
|
|
||
|
|
||
|
|
||
| # Prior art | ||
| [prior-art]: #prior-art | ||
|
|
||
| Clang matches the Windows ABI for `x86_64-pc-windows-msvc` and matches the GCC ABI for `x86_64-pc-windows-gnu`. | ||
|
|
||
| MinGW always uses the GCC ABI. | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is there prior art for a compiler that can lay out types both using the Windows ABI and the GCC ABI for code within a single target? If yes, how are they distinguishing the two? If no, why does Rust need this ability?
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. gcc apparently supports that by using the
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So that would correspond tom in Rust
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For a more full parallel to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. |
||
|
|
||
| We already have both `C` and `system` [calling conventions](https://doc.rust-lang.org/beta/nomicon/ffi.html#foreign-calling-conventions) | ||
| to support differing behavior on `x86_windows` and `x86_64_windows`. | ||
|
|
||
|
|
||
| This issue was introduced in the [original implementation](https://github.com/rust-lang/rust/issues/33158) of `#[repr(packed(N))]` and have since underwent extensive community discussions: | ||
| - [#[repr(align(N))] fields not allowed in #[repr(packed(M>=N))] structs](https://github.com/rust-lang/rust/issues/100743) | ||
| - [repr(C) does not always match the current target's C toolchain (when that target is windows-msvc)](https://github.com/rust-lang/unsafe-code-guidelines/issues/521) | ||
| - [repr(C) is unsound on MSVC targets](https://github.com/rust-lang/rust/issues/81996) | ||
| - [E0587 error on packed and aligned structures from C](https://github.com/rust-lang/rust/issues/59154) | ||
| - [E0587 error on packed and aligned structures from C (bindgen)](https://github.com/rust-lang/rust-bindgen/issues/1538) | ||
| - [Support for both packed and aligned (in repr(C)](https://github.com/rust-lang/rust/issues/118018) | ||
| - [bindgen wanted features & bugfixes (Rust-for-Linux)](https://github.com/Rust-for-Linux/linux/issues/353) | ||
| - [packed type cannot transitively contain a #[repr(align)] type](https://github.com/rust-lang/rust-bindgen/issues/2179) | ||
| - [structure layout using __aligned__ attribute is incorrect](https://github.com/rust-lang/rust-bindgen/issues/867) | ||
|
|
||
|
|
||
| # Unresolved questions | ||
| [unresolved-questions]: #unresolved-questions | ||
|
|
||
| None for now. | ||
|
|
||
|
|
||
| # Future possibilities | ||
| [future-possibilities]: #future-possibilities | ||
|
|
||
| People intending for a stable struct layout consistent across targets would be directed to use `crABI`. | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it's be worth splitting this out.
I'm a big fan of splitting
repr(linear)vsrepr(C)(which I think this is spelling asrepr(C)vsrepr(system)) to have the distinction between "the layout you get with https://doc.rust-lang.org/beta/std/alloc/struct.Layout.html#method.extend" and "whatever weird layout your C compiler uses". That distinction would be really nice for making intent clearer, since today you get "you can't do that in C" warnings sometimes just because you usedrepr(C)to have a predictable layout for your Rust-only code.So I'd kinda like to consider that separately from any new packed-related stuff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the only difference between
#[repr(C)]and#[repr(system)]by this RFC is that on pc-windows-gnu#[repr(system)]is the MSVC layout while#[repr(C)]is the GCC layout. On all other targets#[repr(system)]and#[repr(C)]are identical (per this RFC).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think
repr(linear)is independently useful and we should have all 3. that said, I am concerned that currentrepr(C)code often means "I want stable linear layout" rather than "I want whatever weirdness the C compiler decides to use", so I think deprecatingrepr(C)and replacing it withrepr(linear),repr(bikeshed_other_C)andrepr(system)is worth considering.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, this RFC proposes a different distinction between
repr(C)andrepr(system)than what has been previously discussed in other threads.Since the distinction between the two layouts we have here is Windows-only, I wonder if it should be some Windows-only name, like
repr(msvc)or so? Is there a good reason to even make both of them available on all targets -- effectively exporting a Windows-only complication to other, saner platforms?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason is that this is also how the
extern "system"function ABI string works. Code usesextern "system"to link to libraries which use__stdcallon Windows platforms, and in the same way code would use#[repr(system)]for linking to a dylib which provides builds with only the MSVC toolchain for Windows targets.I don't necessarily endorse this option, but it is logically consistent with how Rust already uses
"system"as an ABI string.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, yes, that's correct.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should introduce
repr(system)here. A general mechanism with a generic name like that doesn't seem appropriate for something needed only on thewindows-gnutargets for using the MSVC ABI. I'd suggest that either this should be removed from this RFC or it should be renamed to something likerepr(MSVC)orrepr(windows).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also had similar trouble with AIX, though it may seem like that can in the end be resolved without changing the layout algorithm... it's not entirely clear yet.
But more generally, we could totally encounter a target that specifies its C structs to have some funny layout rules, e.g. requiring certain things to be over-aligned. And for
repr(C)on ZST, there's really nothing sensible we can do since the C ABI cannot express such types. AFAIK C compilers will typically ensure every struct has size at least 1; should we also adopt that rule and if not, why not?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we know for certain that we will never get another target like this? As in, a target meant to interoperate with a non-standard C compiler for the system, with a different ABI than the standard system C compiler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suppose that we did. Would it actually be useful to have a "system" repr that could be used for both cases? I would expect most cases of such a repr would be closely tied to the OS.