-
Notifications
You must be signed in to change notification settings - Fork 13.8k
FCW for repr(C) enums whose discriminant values do not fit into a c_int #147017
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
This PR modifies cc @jieyouxu |
r? @davidtwco rustbot has assigned @davidtwco. Use |
b8688d6
to
ea5e355
Compare
@future_incompatible = FutureIncompatibleInfo { | ||
reason: FutureIncompatibilityReason::FutureReleaseError, | ||
reference: "issue #124403 <https://github.com/rust-lang/rust/issues/124403>", | ||
report_in_deps: false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't mind making this show up in dependencies immediately since it should be rather rare; let's see what the lang team thinks.
This comment has been minimized.
This comment has been minimized.
ea5e355
to
8d8188e
Compare
This comment has been minimized.
This comment has been minimized.
8d8188e
to
ca6a04d
Compare
I think we should crater it (presumably dialed up to deny-by-default) before we go with any |
That's still ignored in dependencies, we'd want a hard error. And sadly converting between a lint and a hard error is a huge pain in the neck ever since we got that (by now apparently unmaintained) translatable diagnostics infrastructure. :/ |
ca6a04d
to
491bfd6
Compare
Ah, I can just not use translatable diagnostics. :) @bors try |
This comment has been minimized.
This comment has been minimized.
FCW for repr(C) enums whose discriminant values do not fit into a c_int
Yes, if new code is added using the translatable diagnostics infrastructure that's fine but also we should just bypass it the moment it is an inconvenience. |
@craterbot check |
👌 Experiment ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implementation looks good to me. I had looked at the previous PR but didn't have any suggestions as to how to resolve the issues with that approach, so this seems like the best we can do.
This did uncover an interesting case... enums like this where all discriminants fit in a But unless we have evidence that there is actually non-portability with such enums, there is probably no good reason to lint on that. |
Wonderful, thank you for making the crater run and digging that up. Yes, we probably want to remain compatible with that, as the way Standard C is written suggests strongly that the flexibility between using Is there a good way to do that? |
Yeah it's definitely possible to extend the logic this PR added, though the logic becomes more gnarly... (Layout code already kind of has that logic but I don't know if we can reuse it.) |
FWIW enums like that will already trigger the overflowing_literals lint on 32bit systems. But that doesn't show up in dependencies so it is easy to miss... |
789c382
to
470702b
Compare
@bors try |
FCW for repr(C) enums whose discriminant values do not fit into a c_int
This comment has been minimized.
This comment has been minimized.
👌 Experiment ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more |
I began this post trying to point out some subtle behavior that is actually fine albeit in a non-obvious way, and then realized there may actually be a problem here... There's something non-obvious going on with enums like this one: #[repr(C)]
enum E {
A = 0,
B = 0xffffffff,
}
I'm not entirely sure what, if anything, we should do here. This is not a new issue and so far I didn't see any reports about it. When compiling the code for the affected target, a deny-by-default lint is triggered saying that OTOH this makes me question whether it is really a good idea to exempt enums where all discriminant values fit into |
🚧 Experiment ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more |
🎉 Experiment
Footnotes
|
Those both look like legit cases where the lint should indeed fire. |
@rust-lang/lang so to summarize, the questions for you are:
|
re: 2 (FCW immediately or not?): Unless we choose the most-minimal-impact version of this, I think we should slow-roll this instead of jumping to warning in deps to avoid a repeat of the "lol this FCW hits the windows crate" situation, as it seems likely to be something that is quickly found and fixed in deps. re: 3 (enums that hypothetically fit into uints) and this:
Yes, despite saying that allowing |
470702b
to
aa78e00
Compare
aa78e00
to
929a236
Compare
Context: #146504.
The current behavior repr(C) enums is as follows:
isize
Unfortunately, this doesn't always match what C compilers do. In particular, MSVC seems to always give enums a size of 4 bytes, whereas the algorithm above will give enums a size of up to 8 bytes on 64bit targets. Here's an example enum affected by this:
If we look at the C standard, then up until C20, there was no official support enums without an explicit underlying type and with discriminants that do not fit an
int
. With C23, this has changed: now enums have to grow automatically if there is an integer type that can hold all their discriminants. MSVC does not implement this part of C23.Furthermore, Rust fundamentally cannot implement this (without major changes)! Enum discriminants work fundamentally different in Rust and C:
isize
. So from the outset we interpret 9223372036854775807 as an isize literal and never give it a chance to be stored in a bigger type. If the discriminant is given as a literal without type annotation, it gets wrapped implicitly with a warning; otherwise the user has to writeas isize
explicitly and thus trigger the wrapping. Later, we can then decide to make the tag that stores the discriminant smaller than the discriminant type if all discriminant values fit into a smaller type, but those values have allready all been made to fit anisize
so nothing bigger thanisize
could ever come out of this. That makes the behavior of 32bit GCC impossible for us to match.long long
, and then uses that to compute the size of the enum (at least that's what C23 says should happen and GCC does this correctly).Realistically I think the best we can do is to not attempt to support C23 enums, and to require repr(C) enums to satisfy the C20 requirements: all discriminants must fit into a c_int. So that's what this PR implements, by adding a FCW for enums with discriminants that do not fit into
c_int
. As a slight extension, we do not lint enums where all discriminants fit into ac_uint
(i.e.unsigned int
): while C20 does (in my reading) not allow this, and C23 does not prescribe the size of such an enum, this seems to behave consistently across compilers (giving the enum the size of anunsigned int
). IOW, the lint fires whenever our layout algorithm would make the enum larger than anint
, irrespective of whether we pick a signed or unsigned discriminant. This extension was added because crater found multiple cases of such enums across the ecosystem.Note that it is impossible to trigger this FCW on targets where isize and c_int are the same size (i.e., the typical 32bit target): since we interpret discriminant values as isize, by the time we look at them, they have already been wrapped. However, we have an existing lint (overflowing_literals) that should notify people when this kind of wrapping occurs implicitly. Also, 64bit targets are much more common. On the other hand, even on 64bit targets it is possible to fall into the same trap by writing a literal that is so big that it does not fit into isize, gets wrapped (triggering overflowing_literals), and the wrapped value fits into c_int. Furthermore, overflowing_literals is just a lint, so if it occurs in a dependency you won't notice. (Arguably there is also a more general problem here: for literals of type
usize
/isize
, it is fairly easy to write code that only triggersoverflowing_literals
on 32bit targets, and to never see that lint if one develops on a 64bit target.)Specifically, the above example triggers the FCW on 64bit targets, but on 32bit targets we get this err-by-default lint instead (which will be hidden if it occurs in a dependency):
Also see the tests added by this PR.
This isn't perfect, but so far I don't think I have seen a better option. In #146504 I tried adjusting our enum logic to make the size of the example enum above actually match what C compilers do, but that's a massive breaking change since we have to change the expected type of the discriminant expression from
isize
toi64
or eveni128
-- so that seems like a no-go. To improve the lint we could analyze things on the HIR level and specifically catch "repr(C) enums with discriminants defined as literals that are too big", but that would have to be on top of the lint in this PR I think since we'd still want to also always check the actually evaluated value (which we can't always determined on the HIR level).Cc @workingjubilee @CAD97