Skip to content

Conversation

wisp3rwind
Copy link
Contributor

Thank you for your contribution!

We appreciate the time and effort you've put into this pull request.
To help us review it efficiently, please ensure you've gone through the following checklist:

Submission Checklist 📝

  • I have updated existing examples or added new ones (if applicable).
  • I have used cargo xtask fmt-packages command to ensure that all changed code is formatted correctly.
  • My changes were added to the CHANGELOG.md in the proper section.
  • I have added necessary changes to user code to the latest Migration Guide.
  • My changes are in accordance to the esp-rs developer guidelines

Extra:

Pull Request Details 📖

Assorted small-ish refactorings (sorry for the large number of commits, but they should all be fairly self-contained mechanical changes):

  • most commits are purely internal, to make things more consistent, easier to reason about, and remove slight inefficiencies

  • ensure that all methods that take channels by value also return them under all circumstances (instead of consuming them permanently on error, which some of them did)

  • rename SingleShotTxTransaction to TxTransaction since the previous name was quite unwieldy, and this also better shows the symmetry between RxTransaction and TxTransaction

Testing

HIL tests.

@wisp3rwind wisp3rwind mentioned this pull request Oct 9, 2025
12 tasks
@wisp3rwind wisp3rwind marked this pull request as ready for review October 9, 2025 09:50
@wisp3rwind
Copy link
Contributor Author

Actually, hold on - the enumset usage in 964418a seems to blow up some closure sizes, which in turn won't be inlined, leading to bad codegen...

@wisp3rwind wisp3rwind marked this pull request as draft October 9, 2025 10:39
@wisp3rwind wisp3rwind force-pushed the rmt-misc branch 2 times, most recently from 52ee2cb to efd702a Compare October 10, 2025 17:02
@wisp3rwind
Copy link
Contributor Author

Should be good now. What do you think?


Actually, hold on - the enumset usage in 964418a seems to blow up some closure sizes, which in turn won't be inlined, leading to bad codegen...

What happened is that I had rewritten

#[inline]
pub fn clear_tx_interrupts(&self) {
    let rmt = crate::peripherals::RMT::regs();

    rmt.int_clr().write(|w| {
        w.ch_tx_end(self.ch_idx as u8).set_bit();
        w.ch_tx_err(self.ch_idx as u8).set_bit();
        w.ch_tx_loop(self.ch_idx as u8).set_bit();
        w.ch_tx_thr_event(self.ch_idx as u8).set_bit()
    });
}

to

#[inline(always)]
pub fn clear_tx_interrupts(&self, events: impl Into<EnumSet<Event>>) {
    let rmt = crate::peripherals::RMT::regs();
    let events = events.into();
    let ch_idx = self.ch_idx as u8;

    rmt.int_clr().write(|w| {
        if events.contains(Event::End) {
            w.ch_tx_end(ch_idx).set_bit();
        }
        if events.contains(Event::Error) {
            w.ch_tx_err(ch_idx).set_bit();
        }
        if events.contains(Event::LoopCount) {
            w.ch_tx_loop(ch_idx).set_bit();
        }
        if events.contains(Event::Threshold) {
            w.ch_tx_thr_event(ch_idx).set_bit();
        }
    });
}

which

  1. obviously increased the amount of code in the closure but
  2. increased it even more then intended: self.ch_idx is a ChannelIndex enum with values such that bounds checks from the PAC are eliminated. However, moving the cast out of the closure loses that information when compiling the PAC accessors.

In the end, the compiler chose not to inline the closure anymore, such that the bounds checks (one for each bit!) remained in the binary and the code is not optimized as part of the caller where the events argument is always known at compile time.

I've now moved the cast to inside the closure, and also added #[inline] attributes on the closure, which should avoid such issues.

@wisp3rwind wisp3rwind marked this pull request as ready for review October 10, 2025 18:38
@bugadani
Copy link
Contributor

Could you please move the changelog/migration guide to the next version?

regression from esp-rs#4174 (arguable,
before that, the code was just wrong)
this seems slighly cleaner, since it more clearly indicates how the two
are related
removing the redundant prefix

this isn't breaking API, since both types are private
the previous name was quite unwieldy, and this also better shows the
symmetry between RxTransaction and TxTransaction
Note that it's important how the closures for PAC register `modify` and
`write` calls are layed out. Compare

```rust
pub fn clear_tx_interrupts(&self) {
    let rmt = crate::peripherals::RMT::regs();

    rmt.int_clr().write(|w| {
	let ch_idx = self.ch_idx as u8;

	w.ch_tx_end(ch_idx).set_bit();
	w.ch_tx_err(ch_idx).set_bit();
	w.ch_tx_loop(ch_idx).set_bit();
	w.ch_tx_thr_event(ch_idx).set_bit()
    });
}
```

and

```rust
pub fn clear_tx_interrupts(&self) {
    let rmt = crate::peripherals::RMT::regs();
    let ch_idx = self.ch_idx as u8;

    rmt.int_clr().write(|w| {
	w.ch_tx_end(ch_idx).set_bit();
	w.ch_tx_err(ch_idx).set_bit();
	w.ch_tx_loop(ch_idx).set_bit();
	w.ch_tx_thr_event(ch_idx).set_bit()
    });
}
```

In the first example, the cast from `self.ch_idx: ChannelIndex` to `u8`
happens in the closure, whereas in the second case, the closure will be
built without the information about the restricted range of
`ChannelIndex` and assume `ch_idx` to be any `u8`. That leads to bounds
check being present (from the PAC register writer), which greatly
increases the closure size, and might push it above the compiler's
threshold for inlining. However, we generally rely quite a bit on
inlining with these methods to generate efficient code here and in
similar methods.
In practice, this happened anyway, but better enforce this:
- many of these methods are designed to allow for optimizations after
  inling (e.g. removing conditionals due to compile-time known
  values)
- if they end up being not inlined, the place_rmt_driver_in_ram feature
  doesn't take full effect
Since it is Copy. We expect all of this to be inlined, so there
shouldn't be a difference in the final code. This should however
simplify the compiler's job a tiny bit and more importantly prevent
pointer indirection and spilling to stack if one of these methods ends
up being not inlined for whatever reason.
more consistent with the new LoopMode type, and avoids suggesting that
this method might do anything like enabling an interrupt (which it
doesn't)
instead of permanently consuming them.

This requires quite a bit of refactoring since we must not call
pin.into() before verifying all inputs in order to be able to return the
original pin type on error.
instead of consuming it, which gives no way to recover/recreate the
channel
By adding an EnumSet argument. This nicely mirrors the listen/unlisten
functions, which take the same type of argument. It also helps to avoid
proliferation of a large number of methods when using more interrupts
(such as the loopcount interrupt, which I intend to do in a later PR).

There should be no performance negative, since these methods are always
inlined with compile-time known arguments, such that the compiler can
remove all conditionals.
This make the code slightly more concise and obviously branch-free
(although the compiler seems to make that optimization on it's own, as
well).

This code remains equivalent, because int_clr has only a write-to-clear
bits and the register writer will be initialized to all zeroes before
the closure is invoked.
- move configure_tx, configure_rx out of Channel again so that they
  don't end up being duplicated via monomorphization of Dm = Blocking and Async
  in spite of being virtually identical.
  It's probably a bit of an edge case to have both Blocking and Async
  channels in one binary (the HIL test do, though), so it's not really
  an important optimization, but it still seems cleaner this way.

- both methods take the configuration by reference instead of by value
  (it ended up being passed on the stack anyway, so this doesn't really
  change the generated code, but allows the caller to re-use the
  configuration)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants