-
Notifications
You must be signed in to change notification settings - Fork 14k
Make volatile opportunistically relaxed #139299
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment has been minimized.
This comment has been minimized.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
|
@rust-lang/opsem This came out of a discussion about volatile not being usable for MMIO in some cases, where the (virtualized) implementation does not support MMIO through all instructions, but only some basic ones. What do you think about this, guaranteeing that volatile accesses turn into specific instructions for specific sized? (ignoring the details of the implementation) |
|
Sorry, could someone explain the context for non-embedded people here?
This example looks like the output of pwgen to my eyes. What is it supposed to tell me? :)
Why is it not possible to choose...? You seem to be doing quite well choosing it for this PR. confused |
|
FWIW there have been many hundreds of comments in various issues around "volatile" on the UCG and opsem issue tracker and Zulip. We'd be happy to work with someone who wants to put in the time of properly re-doing our entire volatile story, but unfortunately we don't have the resources to do it entirely by ourselves. "It's basically inline assembly" is IMO the most promising approach, though it is unfortunate that apparently we cannot rely on LLVM doing that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the point of these macro contraptions and why the docs are now in a separate file. Can't we have a single function with a cfg_match! inside? Why does the function itself need to be emitted by a macro?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably what resulted in this is that naive use of cfg_match!'s current implementation doesn't behave great in an expression position that wants to produce a value. That, or being first written with a cfg_if! that requires the arms to be made up of items.
You can write cfg_match! {{ … }} with two brackets to get it to consistently work as an expression block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just found this way more readable. And macros and doc comments don't interact well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's quite hard to read to have the comment out-of-line for one of many functions. And the macro shouldn't have anything to do with the doc comment in the first place. The macro should generate the body of {read,write}_volatile, not the entire function.
This function isn't so special that it needs to be so different from everything else in core::ptr.
library/core/src/ptr/volatile.rs
Outdated
| dest = in(reg) dst, | ||
| ), | ||
| 4 => asm!( | ||
| "str {val}, [{dest}]", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it meaningful that str etc are lower-case but LDR etc are upper-case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No
|
@RalfJung the way this came up in discussion again on the community discord is that someone was having problems with their non-embedded x86-64 OS in KVM (virtualization). KVM apparently only allows MMIO to happen with a specific subset of instructions, and LLVM sometimes codegened the volatile with other instructions (merging it with operations for example) such that while the write was still present, KVM no longer understood it. |
|
The PR does not change anything for x86 though?
I think ideally this would be fixed in LLVM or KVM. The two do not agree on how to do MMIO. We can work around that on the Rust side, but not without at least trying to fix the actual issue. Has this problem been reported with either of them?
|
I do not think we should guarantee anything. In the referenced discussion I encouraged a PR such as this (though I'll admit I didn't anticipate so much diff) as a pure quality-of-implementation concern. No changes to any documentation. But in addition, since people like to read the standard library and take its internal comments as normative, we will need to be very clear in the implementation here to "document" with internal comments that any particular codegen is not guaranteed. |
It uses cfg to select arm, thumb, 32-bit. Which is a wider range that thumv6m.
Many people expect volatile not to merge anything. This does.
Yes, this is a first step to see if people even want to see a change like this. Rather than touching the precious x86, this just touches an embedded platform.
Probably! But I will not be the one to do that. From my POV LLVM is great at optimizing code, but it seems that all the different passes often forget things. So saying "well fix it in LLVM," means that LLVM should not forget things throughout its pipeline. Which would be great! But history has suggested that is a hard problem. |
What is being merged? With what? Where? Again, your example is entirely unreadable to me. So please explain whatever you think the example should demonstrate.
I don't think we should merge anything here without at least filing an issue on the LLVM side. And if LLVM considers this not-a-bug, we'll have to determine if we are ready to carry our own implementation of this ~forever. Working around upstream behavior is a last resort, not the first thing we should try. |
|
Once again, history suggests that LLVM is not that reliable at carrying information through its pipeline. So while they can probably fix this specific example, it is questionable at best that they can prove that for all operations they will never merge instructions. If you take a look at this example here: While I am not an expert on x86 I believe what is happening is the following:
This is what the average embedded dev would expect from a volatile read. It just reads the location in memory, then does other stuff. No funny business. While it is now Another funny idea is to internally tell llvm it is an atomic relaxed volatile load :3c (which could also quell concerns about people trying to do atomic things with volatile, which is still very common because of cargo-culting no matter how much you tell people not to). |
Yes, I want all our volatile accesses to compile to atomic accesses in LLVM. However, sadly we allow volatile accesses on all types, so we'll have to figure something out for too-large and misaligned accesses. So if the issue you are having can be fixed by marking |
The goal is to ensure LLVM doesn't do anything silly (:3) with the loads.
This comment has been minimized.
This comment has been minimized.
CAD97
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With my opsem hat, I'm with Ralf here: making Rust's "volatile" mean "volatile atomic" for "appropriately sized" accesses better matches what developers generally expect when they reach for volatile. But also that if LLVM isn't preserving the exact read size of "appropriately sized" volatile reads, that this is an LLVM bug and we should attempt to fix it in LLVM before working around it, or at least link to and track an LLVM issue tracking a fix so that we can remove the workaround once it isn't needed anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is now redundant again and should be removed.
library/core/src/ptr/volatile.rs
Outdated
| is_zst: bool = T::IS_ZST, | ||
| ) => crate::ub_checks::maybe_is_aligned_and_not_null(addr, align, is_zst) | ||
| ); | ||
| cfg_match! { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course ideally all of this magic would happen in the intrinsic / in LLVM. If we're going to implement volatile semantics in lib code for any appreciable amount of time, though, it should probably be done in sys PAL modules rather than inline due to the quantity of different platforms which will likely want to provide asm! volatile semantics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sys is a std thing, we can't use it here. And anyway it's for abstracting OS/platform differences, not CPU architecture differences.
|
The only libs reviewer here is @Noratrieb. There is no libs reviewer assigned, and libs hasn't been contacted generally, only opsem. I very strongly prefer that questions about the feasibility of carrying a library implementation such as this (as well as guidance on how to implement this well) be answered by or with the input of libs maintainers. @rust-lang/libs |
TODO: atomic_volatile_relaxed
Copied the code from here: https://github.com/llvm/llvm-project/blob/24dfcc0c024f9ab8ba61c0994513f57e882961fc/llvm/lib/Frontend/Atomic/Atomic.cpp#L31-L32 Did not implement it for GCC or craneline.
|
Ok here is an alternative with volatile atomics. I think this is probably better. |
|
Some changes occurred in compiler/rustc_codegen_ssa Some changes occurred to the intrinsics. Make sure the CTFE / Miri interpreter |
|
The job Click to see the possible cause of the failure (guessed by this bot) |
|
I don't believe this change belong in the standard library. It would be very surprising for With that said I did review the Discord discussion and I don't think changing the I think this is a good opportunity for a MMIO crate which explicitly guarantees that it emits KVM-compatible instructions on supported architectures. |
|
I think opportunistically making volatile to be relaxed atomic when it is able to would be better. We are allowed to be better than Clang. And I doubt LLVM would accept making volatile atomic by default. It would be much clearer to the many people who ask what volatile means wrt atomics, and people do and will continue to write code assuming volatile means something in terms of synchronization, so it would mean their code is less broken. |
|
I don't think this actually makes things better, it just encourages people to rely on stronger guarantees than what we actually provide. The definition of |
|
FWIW, my atomic-maybe-uninit crate uses "volatile" inline assembly and provides at least atomic load/store on almost all CPU architectures supported by Rust. It was originally created for use with crossbeam, but I guess it can be used for this purpose as well, since relaxed atomic load stores up to at least a pointer width use normal load/store instructions. The functionality this crate provides is something that is considered to be impossible to implement without inline assembly at this time, and since there are so few instruction choices reasonably available for within-register-size integer relaxed atomic load/store, there should be no problem in providing more specific implementation guarantees, including instructions that are actually used. |
Indeed ideally this should be done by fixing our definition from "whatever C does" to something actually well-defined. |
This is mainly to test the waters to see if this is something that is even desired.
What this does
For a single instruction set, make the power of two sizes up to a register width emit single instructions, as one would expect, in asm. I chose thumb because I work on thumbv6m. But it is not possible to choose that specifically.
For other weird sizes that are basically meaningless in the context of MMIO it is the status quo because there's really nothing more sane that can be done.
I would appreciate if someone could review the asm because while I am an embedded dev, I don't write asm that much.
The murky past
What is volatile? Who knows! The C standard definitely says something. What is that something is not quite clear. It's all about the Vibes.
Well, what do compilers do? So users aren't unhappy they general do something that matches the vibes. But you can coax them to not in weird edge cases. One that was shown off recently on the Rust Community Discord is this
https://godbolt.org/z/WxP14vqPa
Well the vibes aren't great.
The controversial present
People like to debate about volatile all the time. What does it do, what do they want it to do, what does it mean for opsem? Well these are all great questions I will not answer.
Some of the largest users of volatile are embedded developers (disclosure: I am an embedded dev :ferrisclueless:). It should almost exclusively be used for MMIO if the devs are being good. In those cases it is generally desired that the assembly output is a single load.
Unfortunately libstd make the (imo) mistake that things larger than a register (someone will probably say "erm on X arch you can do MMIO for other things, but that's not the point) can be read with
read_volatile. But for sizes we can just turn it into a quality of implementation issue. Emit a load. Emit a store.The shiny future
Hopefully people can fill in their favorite arches so Rust's volatile loads and stores are the best they can be. I would also like it if volatile loads and stores were relaxed operations so that people's broken code is slightly less broken. But that's a future discussion.
r? wg-embedded-arm