-
Notifications
You must be signed in to change notification settings - Fork 344
RMT: Various cleanups, small API enhancements #4302
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Actually, hold on - the enumset usage in 964418a seems to blow up some closure sizes, which in turn won't be inlined, leading to bad codegen... |
52ee2cb
to
efd702a
Compare
Should be good now. What do you think?
What happened is that I had rewritten #[inline]
pub fn clear_tx_interrupts(&self) {
let rmt = crate::peripherals::RMT::regs();
rmt.int_clr().write(|w| {
w.ch_tx_end(self.ch_idx as u8).set_bit();
w.ch_tx_err(self.ch_idx as u8).set_bit();
w.ch_tx_loop(self.ch_idx as u8).set_bit();
w.ch_tx_thr_event(self.ch_idx as u8).set_bit()
});
} to #[inline(always)]
pub fn clear_tx_interrupts(&self, events: impl Into<EnumSet<Event>>) {
let rmt = crate::peripherals::RMT::regs();
let events = events.into();
let ch_idx = self.ch_idx as u8;
rmt.int_clr().write(|w| {
if events.contains(Event::End) {
w.ch_tx_end(ch_idx).set_bit();
}
if events.contains(Event::Error) {
w.ch_tx_err(ch_idx).set_bit();
}
if events.contains(Event::LoopCount) {
w.ch_tx_loop(ch_idx).set_bit();
}
if events.contains(Event::Threshold) {
w.ch_tx_thr_event(ch_idx).set_bit();
}
});
} which
In the end, the compiler chose not to inline the closure anymore, such that the bounds checks (one for each bit!) remained in the binary and the code is not optimized as part of the caller where the I've now moved the cast to inside the closure, and also added |
Could you please move the changelog/migration guide to the next version? |
regression from esp-rs#4174 (arguable, before that, the code was just wrong)
this seems slighly cleaner, since it more clearly indicates how the two are related
removing the redundant prefix this isn't breaking API, since both types are private
the previous name was quite unwieldy, and this also better shows the symmetry between RxTransaction and TxTransaction
Note that it's important how the closures for PAC register `modify` and `write` calls are layed out. Compare ```rust pub fn clear_tx_interrupts(&self) { let rmt = crate::peripherals::RMT::regs(); rmt.int_clr().write(|w| { let ch_idx = self.ch_idx as u8; w.ch_tx_end(ch_idx).set_bit(); w.ch_tx_err(ch_idx).set_bit(); w.ch_tx_loop(ch_idx).set_bit(); w.ch_tx_thr_event(ch_idx).set_bit() }); } ``` and ```rust pub fn clear_tx_interrupts(&self) { let rmt = crate::peripherals::RMT::regs(); let ch_idx = self.ch_idx as u8; rmt.int_clr().write(|w| { w.ch_tx_end(ch_idx).set_bit(); w.ch_tx_err(ch_idx).set_bit(); w.ch_tx_loop(ch_idx).set_bit(); w.ch_tx_thr_event(ch_idx).set_bit() }); } ``` In the first example, the cast from `self.ch_idx: ChannelIndex` to `u8` happens in the closure, whereas in the second case, the closure will be built without the information about the restricted range of `ChannelIndex` and assume `ch_idx` to be any `u8`. That leads to bounds check being present (from the PAC register writer), which greatly increases the closure size, and might push it above the compiler's threshold for inlining. However, we generally rely quite a bit on inlining with these methods to generate efficient code here and in similar methods.
In practice, this happened anyway, but better enforce this: - many of these methods are designed to allow for optimizations after inling (e.g. removing conditionals due to compile-time known values) - if they end up being not inlined, the place_rmt_driver_in_ram feature doesn't take full effect
Since it is Copy. We expect all of this to be inlined, so there shouldn't be a difference in the final code. This should however simplify the compiler's job a tiny bit and more importantly prevent pointer indirection and spilling to stack if one of these methods ends up being not inlined for whatever reason.
more consistent with the new LoopMode type, and avoids suggesting that this method might do anything like enabling an interrupt (which it doesn't)
instead of permanently consuming them. This requires quite a bit of refactoring since we must not call pin.into() before verifying all inputs in order to be able to return the original pin type on error.
instead of consuming it, which gives no way to recover/recreate the channel
By adding an EnumSet argument. This nicely mirrors the listen/unlisten functions, which take the same type of argument. It also helps to avoid proliferation of a large number of methods when using more interrupts (such as the loopcount interrupt, which I intend to do in a later PR). There should be no performance negative, since these methods are always inlined with compile-time known arguments, such that the compiler can remove all conditionals.
This make the code slightly more concise and obviously branch-free (although the compiler seems to make that optimization on it's own, as well). This code remains equivalent, because int_clr has only a write-to-clear bits and the register writer will be initialized to all zeroes before the closure is invoked.
- move configure_tx, configure_rx out of Channel again so that they don't end up being duplicated via monomorphization of Dm = Blocking and Async in spite of being virtually identical. It's probably a bit of an edge case to have both Blocking and Async channels in one binary (the HIL test do, though), so it's not really an important optimization, but it still seems cleaner this way. - both methods take the configuration by reference instead of by value (it ended up being passed on the stack anyway, so this doesn't really change the generated code, but allows the caller to re-use the configuration)
Thank you for your contribution!
We appreciate the time and effort you've put into this pull request.
To help us review it efficiently, please ensure you've gone through the following checklist:
Submission Checklist 📝
cargo xtask fmt-packages
command to ensure that all changed code is formatted correctly.CHANGELOG.md
in the proper section.Extra:
Pull Request Details 📖
Assorted small-ish refactorings (sorry for the large number of commits, but they should all be fairly self-contained mechanical changes):
most commits are purely internal, to make things more consistent, easier to reason about, and remove slight inefficiencies
ensure that all methods that take channels by value also return them under all circumstances (instead of consuming them permanently on error, which some of them did)
rename
SingleShotTxTransaction
toTxTransaction
since the previous name was quite unwieldy, and this also better shows the symmetry betweenRxTransaction
andTxTransaction
Testing
HIL tests.