RMT: Use new Encoder trait for Tx data #4604

wisp3rwind · 2025-12-02T20:10:29Z

Thank you for your contribution!

We appreciate the time and effort you've put into this pull request.
To help us review it efficiently, please ensure you've gone through the following checklist:

Submission Checklist 📝

I have updated existing examples or added new ones (if applicable).
I have used cargo xtask fmt-packages command to ensure that all changed code is formatted correctly.
My changes were added to the CHANGELOG.md in the proper section.
I have added necessary changes to user code to the latest Migration Guide.
My changes are in accordance to the esp-rs developer guidelines

Extra:

I have read the CONTRIBUTING.md guide and followed its instructions.

Pull Request Details 📖

This changes the input data type for RMT Tx methods from &[PulseCode] to &mut impl Encoder where Encoder is conceptually similar to Iterator<Item = PulseCode>, but allows for more efficient code in many cases.

IDF has a similar encoder type: https://docs.espressif.com/projects/esp-idf/en/latest/esp32c3/api-reference/peripherals/rmt.html#rmt-rmt-encoder

General design

The Encoder trait differs from Iterator<Item = PulseCode> mainly in the following aspects:

Writing data is driven by the encode method, which calls RmtWriter methods to push to the hardware, rather than RmtWriter pulling data from an iterator.
The RmtWriter::write_many method helps write several codes to the hardware in a very tight loop.

Together, these two aspects allow very efficient inner loops when copying data to the hardware, without requiring unsafe code on the user side and without exposing to user code any direct hardware access or any specifics about how much data is written; see, for example, the BytesEncoder implementation. I've not been able to achieve the same performance with just iterators.
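The push model described above can be sketched roughly as follows. This is a minimal, host-runnable illustration and not the PR's actual code: the `PulseCode`, `RmtWriter`, `Encoder` and `CopyEncoder` definitions here are simplified stand-ins, and the `write_many` signature in particular is an assumption made for this example.

```rust
/// Hypothetical stand-in for the driver's pulse code type.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct PulseCode(pub u32);

/// Hypothetical stand-in for `RmtWriter`: a bounded window of channel RAM.
pub struct RmtWriter<'a> {
    ram: &'a mut [PulseCode],
    written: usize,
}

impl<'a> RmtWriter<'a> {
    pub fn new(ram: &'a mut [PulseCode]) -> Self {
        Self { ram, written: 0 }
    }

    /// Number of codes written through this writer so far.
    pub fn written(&self) -> usize {
        self.written
    }

    /// Write up to `count` codes produced by `next` in one tight loop.
    /// Returns how many codes actually fit; `next` is called exactly that often.
    pub fn write_many(&mut self, count: usize, mut next: impl FnMut() -> PulseCode) -> usize {
        let n = count.min(self.ram.len() - self.written);
        for slot in &mut self.ram[self.written..self.written + n] {
            *slot = next();
        }
        self.written += n;
        n
    }
}

/// Push model: the driver calls `encode` whenever RAM is free, and the
/// encoder decides how to fill it.
pub trait Encoder {
    fn encode(&mut self, writer: &mut RmtWriter<'_>);
}

/// Copies precomputed codes from a slice, resuming where it left off.
pub struct CopyEncoder<'d> {
    data: &'d [PulseCode],
}

impl<'d> CopyEncoder<'d> {
    pub fn new(data: &'d [PulseCode]) -> Self {
        Self { data }
    }

    /// Codes not yet handed to a writer.
    pub fn remaining(&self) -> usize {
        self.data.len()
    }
}

impl Encoder for CopyEncoder<'_> {
    fn encode(&mut self, writer: &mut RmtWriter<'_>) {
        let data = self.data;
        let mut iter = data.iter();
        // `write_many` calls the closure at most `data.len()` times, so `next()` cannot fail.
        let n = writer.write_many(data.len(), || *iter.next().unwrap());
        self.data = &data[n..];
    }
}

fn main() {
    let data: Vec<PulseCode> = (1..=6).map(PulseCode).collect();
    let mut enc = CopyEncoder::new(&data);
    let mut ram = [PulseCode(0); 4];

    // First fill: only four of the six codes fit into the mock RAM.
    let mut writer = RmtWriter::new(&mut ram);
    enc.encode(&mut writer);
    println!("wrote {}, {} remaining", writer.written(), enc.remaining());
}
```

The point of the sketch is that the encoder never learns how large the RAM window is up front: it asks for as much as it has, and the writer reports how much fit.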

Specifically, the pattern from BytesEncoder is similar to what's required to send data to WS2812-style LEDs:
Fetch R, G, B bytes from RAM, assemble in the correct order into a u32, then shift out bits and write a PulseCode for each. That maps very cleanly to write_many, but leads to overhead with iterators.
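As a rough illustration of that fetch/assemble/shift pattern, here is a host-runnable sketch. The `PulseCode` type, the `CODE_ZERO`/`CODE_ONE` placeholders and the `encode_pixel` helper are hypothetical names for this example; real pulse codes would carry the WS2812 high/low timings. The G, R, B byte order used here is the one WS2812 LEDs expect on the wire.

```rust
/// Hypothetical stand-in for the driver's pulse code type.
#[derive(Clone, Copy, Debug, PartialEq)]
struct PulseCode(u32);

// Placeholder values; real codes would encode the high/low durations of a
// WS2812 "0" bit and "1" bit.
const CODE_ZERO: PulseCode = PulseCode(0);
const CODE_ONE: PulseCode = PulseCode(1);

/// Emit the 24 pulse codes for one pixel, MSB first, through `push`.
fn encode_pixel(r: u8, g: u8, b: u8, mut push: impl FnMut(PulseCode)) {
    // Assemble G, R, B left-aligned in a u32, so the bit to send next is
    // always the top bit of the word.
    let mut word = (u32::from(g) << 24) | (u32::from(r) << 16) | (u32::from(b) << 8);
    for _ in 0..24 {
        let code = if word & 0x8000_0000 != 0 { CODE_ONE } else { CODE_ZERO };
        push(code);
        word <<= 1;
    }
}

fn main() {
    let mut codes = Vec::new();
    encode_pixel(0x00, 0x80, 0x01, |c| codes.push(c));
    println!("{} codes, first = {:?}, last = {:?}", codes.len(), codes[0], codes[23]);
}
```

In an encoder, the closure passed as `push` would be replaced by whatever the writer offers for bulk writes, so the shift-and-test loop runs without per-code bookkeeping.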

I've been benchmarking this¹ using cycle counters. Here are some results for a WS2812 LED strip encoder/pulse code iterator, which are the fastest I've been able to achieve:

// base case: not using Encoder, just write zeros to the hardware using raw pointers
Render benchmark (base): RMT BenchmarkResult:
	CPU clock: 160MHz
	Iterations: 39
	Codes written: 1441
	Encoding time: 126us
	Encoding time / code: 88ns ~ 14 cycles

// pre-compute PulseCodes and use CopyEncoder
Render benchmark (slice): RMT BenchmarkResult:
	CPU clock: 160MHz
	Iterations: 32
	Codes written: 1441
	Encoding time: 154us (9% of 1710us tx time)
	Encoding time / code: 107ns ~ 17 cycles

// custom impl of Iterator<Item = PulseCode> for an LED strip encoder type
Render benchmark (iter): RMT BenchmarkResult:
	CPU clock: 160MHz
	Iterations: 20
	Codes written: 1441
	Encoding time: 249us (14% of 1710us tx time)
	Encoding time / code: 172ns ~ 27 cycles

// custom impl of Encoder for an LED strip encoder type
Render benchmark (enc): RMT BenchmarkResult:
	CPU clock: 160MHz
	Iterations: 29
	Codes written: 1441
	Encoding time: 170us (9% of 1710us tx time)
	Encoding time / code: 118ns ~ 18 cycles

"Encoding time" is just the time to run the encoder_write function, not including any polling or interrupt/embassy dispatch overhead. Thus, the fact that it takes "only" ~10% of tx time is a bit misleading.
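For readers who want to check the derived figures, the per-code numbers follow from the totals by simple arithmetic. The helper names below are made up for this sketch, and the results may differ by about one unit from the printed values, presumably because the benchmark output is rounded or truncated.

```rust
/// Nanoseconds spent encoding per pulse code.
fn ns_per_code(encoding_time_us: f64, codes: u32) -> f64 {
    encoding_time_us * 1000.0 / f64::from(codes)
}

/// CPU cycles per pulse code at the given clock frequency.
fn cycles_per_code(ns: f64, cpu_mhz: f64) -> f64 {
    ns * cpu_mhz / 1000.0
}

/// Encoding time as a percentage of the total transmission time.
fn share_of_tx_percent(encoding_time_us: f64, tx_time_us: f64) -> f64 {
    100.0 * encoding_time_us / tx_time_us
}

fn main() {
    // (label, encoding time in us) as reported above; 1441 codes at 160 MHz.
    let runs = [("base", 126.0), ("slice", 154.0), ("iter", 249.0), ("enc", 170.0)];
    for (label, time_us) in runs {
        let ns = ns_per_code(time_us, 1441);
        println!(
            "{label}: {ns:.1} ns/code, {:.1} cycles/code, {:.1}% of 1710 us",
            cycles_per_code(ns, 160.0),
            share_of_tx_percent(time_us, 1710.0),
        );
    }
}
```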

Note that the custom encoder version has a roughly one-third lower cycle count than the iterator version (18 vs. 27 cycles per code), and is on par with the CopyEncoder (which requires precomputing PulseCodes in a large buffer). Both encoder variants are quite close to the performance ceiling of the base case, which I presume is limited by the APB speed.

In this case, the inner loop of the encoder compiles to optimal assembly, cf. the decompiled version²:

whereas the iterator version remains more convoluted.

API

The PR continues to use a single transmit() method for various data types, requiring explicitly wrapping things in an Encoder:

let mut enc = CopyEncoder::new(&data);
channel.transmit(&mut enc)?;

I also considered an IntoEncoder trait with fn transmit(&mut self, data: impl IntoEncoder) with implementations provided for

&[PulseCode],
I where I: IntoIterator<Item = PulseCode>,
E where E: Encoder.

However, that immediately runs into issues with specialization due to the blanket impls.
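The overlap can be reproduced with a small, self-contained sketch. All definitions here are simplified stand-ins invented for illustration (in particular, this `Encoder` is pull-based to keep the example short, unlike the PR's trait). With one blanket impl the code compiles; enabling the second, commented-out blanket impl is rejected by the compiler as a conflicting implementation, because a single type could implement both `Encoder` and `IntoIterator<Item = PulseCode>`.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
struct PulseCode(u32);

/// Simplified stand-in for the PR's `Encoder` trait.
trait Encoder {
    fn next_code(&mut self) -> Option<PulseCode>;
}

/// The conversion trait that was considered.
trait IntoEncoder {
    type Enc: Encoder;
    fn into_encoder(self) -> Self::Enc;
}

// Blanket impl no. 1: every encoder converts into itself.
impl<E: Encoder> IntoEncoder for E {
    type Enc = E;
    fn into_encoder(self) -> E {
        self
    }
}

// Blanket impl no. 2 overlaps with no. 1 and does not compile alongside it
// without specialization:
//
// impl<I: IntoIterator<Item = PulseCode>> IntoEncoder for I {
//     type Enc = IterEncoder<I::IntoIter>;
//     fn into_encoder(self) -> Self::Enc {
//         IterEncoder(self.into_iter())
//     }
// }

/// Explicit wrapper, as used instead of the second blanket impl.
struct IterEncoder<D>(D);

impl<D: Iterator<Item = PulseCode>> Encoder for IterEncoder<D> {
    fn next_code(&mut self) -> Option<PulseCode> {
        self.0.next()
    }
}

/// Mock "transmit" that just drains the encoder into a Vec.
fn transmit(data: impl IntoEncoder) -> Vec<PulseCode> {
    let mut enc = data.into_encoder();
    let mut out = Vec::new();
    while let Some(code) = enc.next_code() {
        out.push(code);
    }
    out
}

fn main() {
    let codes = vec![PulseCode(1), PulseCode(2)];
    println!("{:?}", transmit(IterEncoder(codes.into_iter())));
}
```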

Alternatively, one could consider separate transmit methods, e.g.

fn transmit_slice(&mut self, data: &[PulseCode]) -> ...,
fn transmit_iter(&mut self, data: impl IntoIterator<Item = PulseCode>) -> ...,
fn transmit_enc(&mut self, data: impl Encoder) -> ....

The disadvantage is that this significantly blows up the number of methods. In particular, if/when methods are split into transmit(&mut self, ...) and transmit_owned(self, ...) similar to the SHA driver, as suggested by @Dominaezzz, this would lead to a combinatorial explosion of the number of channel methods. Additionally, having such per-datatype methods isn't really much simpler than explicitly creating encoders, in my opinion.

Questions

I mentioned that I added some benchmarking code: This needs support in esp-hal for low-level hardware access. Would something like this in principle be in-scope for the project?
Should BytesEncoder be part of esp-hal directly? It might make more sense to move it to an example, showcasing how to write an efficient Encoder.

Testing

HIL tests, incl. new ones.

Closes #1768

Footnotes

¹ I intend to propose to merge the benchmarking code into esp-hal, but I'm not sure about the design yet, and it probably needs some cleanup.
² From esp32c3; the last ptr_ = ptr assignment is spurious, there's no corresponding instruction in the loop. data_word holds 24 bits of RGB data which are shifted out MSB-first.

Previously, we stored &mut dyn Encoder and dynamically dispatched the Encoder::encode method. Now, we store &mut dyn EncoderExt and dynamically dispatch EncoderExt::write with the expectation that Encoder::encode should be inlined in EncoderExt::write (which is the only caller, and its Encoder implementations in esp-hal are also marked as #[inline(always)]). This might allow for small optimizations, since the RmtWriter type will typically not need to be constructed on the stack, but can be kept in registers.
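The dispatch structure described here can be sketched as follows. The type and trait names come from the commit messages, but their fields and method signatures are assumptions for this example: the only dynamic call is `EncoderExt::write`, and the user's `encode` is a statically dispatched call inside it that the compiler is free to inline.

```rust
/// Hypothetical long-lived driver state (private in the real driver).
struct WriterContext {
    ram: Vec<u32>,
    capacity: usize,
}

/// Hypothetical short-lived, user-visible view onto the context.
struct RmtWriter<'a> {
    ctx: &'a mut WriterContext,
}

impl RmtWriter<'_> {
    /// Returns false once the mock RAM is full.
    fn write(&mut self, code: u32) -> bool {
        if self.ctx.ram.len() < self.ctx.capacity {
            self.ctx.ram.push(code);
            true
        } else {
            false
        }
    }
}

/// User-facing trait.
trait Encoder {
    fn encode(&mut self, writer: &mut RmtWriter<'_>);
}

/// Driver-internal trait; this is what gets stored as `&mut dyn EncoderExt`.
trait EncoderExt {
    fn write(&mut self, ctx: &mut WriterContext);
}

impl<E: Encoder> EncoderExt for E {
    fn write(&mut self, ctx: &mut WriterContext) {
        // The writer is built on this side of the dynamic call, so after
        // inlining `encode` it need not be materialized on the stack.
        let mut writer = RmtWriter { ctx };
        self.encode(&mut writer);
    }
}

/// Example encoder that sends n, n-1, ..., 1.
struct Countdown(u32);

impl Encoder for Countdown {
    #[inline(always)]
    fn encode(&mut self, writer: &mut RmtWriter<'_>) {
        while self.0 > 0 && writer.write(self.0) {
            self.0 -= 1;
        }
    }
}

fn main() {
    let mut enc = Countdown(5);
    let mut ctx = WriterContext { ram: Vec::new(), capacity: 3 };
    let dyn_enc: &mut dyn EncoderExt = &mut enc;
    dyn_enc.write(&mut ctx);
    println!("ram = {:?}, left to send = {}", ctx.ram, enc.0);
}
```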

wisp3rwind · 2025-12-02T20:44:56Z

Would you mind granting me HIL access here? Thanks!

bugadani · 2025-12-02T21:04:05Z

esp-hal/src/rmt/writer.rs

+#[derive(Clone, Debug)]
+pub struct IterEncoder<D>
+where
+    D: Iterator<Item = PulseCode>,
+{
+    data: D,
+}

-        // If the input data was not exhausted, update offset as
-        //
-        // | initial | offset      | max_count   | new offset  |
-        // | ------- + ----------- + ----------- + ----------- |
-        // | true    | 0           | memsize     | 0           |
-        // | false   | 0           | memsize / 2 | memsize / 2 |
-        // | false   | memsize / 2 | memsize / 2 | 0           |
-        //
-        // Otherwise, the new position is invalid but the new slice is empty and we won't use the
-        // offset again. In either case, the unsigned subtraction will not underflow.
-        self.offset = memsize as u16 - max_count as u16 - self.offset;
-
-        // The panic can never trigger since count <= data.len()!
-        data.split_off(..count).unwrap();
-        if data.is_empty() {
-            self.state = WriterState::Done;
+impl<D> IterEncoder<D>
+where
+    D: Iterator<Item = PulseCode>,
+{
+    /// Create a new instance that transmits the provided `data`.
+    pub fn new(data: impl IntoIterator<IntoIter = D>) -> Self {
+        Self {
+            data: data.into_iter(),
+        }
+    }
+}


Do you actually need anything more than this? Copying from a slice, or converting from a bitstream can both be expressed as an iterator. Wouldn't it be better to not introduce a whole subsystem for something that could be formulated in user code with common enough Rust machinery?

wisp3rwind · 2025-12-03

Conceptually that's true, but I've not been able to optimize the code using just iterators as well as using the dedicated encoder type. It seems that this would require the compiler to reorder too many conditionals and elide too many memory accesses. I've amended the top post with more details on how I ended up with this design. Let me know if you have any further questions!

bugadani · 2025-12-03

I'm not sure how I feel about designing a complicated API just to work around compiler optimization issues.

github-actions · 2025-12-02T21:12:35Z

[HIL trust list]

Trusted users for this PR: @wisp3rwind

github-actions · 2025-12-02T21:12:36Z

Author @wisp3rwind was trusted for this PR via the trusted-author label.
They can now use /hil quick or /hil full.

wisp3rwind · 2025-12-03T11:58:30Z

/hil full

github-actions · 2025-12-03T11:59:30Z

Triggered full HIL run for #4604.

Run: https://github.com/esp-rs/esp-hal/actions/runs/19893037493

Status update: ❌ HIL (full) run failed (conclusion: failure).

wisp3rwind added 12 commits December 2, 2025 20:50

RMT: rename RmtWriter -> WriterContext

9cca6b8

in anticipation of adding another, user-visible type which will be named RmtWriter. This type is private, so no changelog or migration guide entry is required.

RMT: Mark WriterContext::new as #[inline]

7d44ec7

RMT: Remove Into<PulseCode> / From<PulseCode> support

df1d667

which was probably of little value, anyway, and also in preparation for adding more sophisticated Encoder data types

RMT: add Encoder, revise error handling, track total in writer, make …

3678730

…writer.state read-only

RMT: add HIL tests for more tx error cases

b2036b9

to ensure that the Encoder-related refactoring didn't break anything

RMT: Add IterEncoder

cf377d7

RMT: add IterEncoder HIL test

a126041

RMT: add BytesEncoder

e64cb59

RMT: add BytesEncoder HIL test

9eeba92

RMT: introduce EncoderRef as a more efficient &mut dyn EncoderExt

5da220d

&mut dyn EncoderExt is a fat pointer to the data and the vtable (likely in flash), but we only need a single entry of the vtable. Thus, implement our own pointer type, which avoids the indirection via the vtable.
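One way such a pointer type could look is sketched below. This is an illustration under assumptions, not the PR's implementation: the field names, the `call` helper and the simplified `EncoderExt`/`WriterContext` definitions are invented for this example. It pairs a type-erased data pointer with the one function pointer that is needed, so calling it loads no vtable entry. `NonNull::from` is used for the conversion, which is available on older toolchains than `NonNull::from_mut`.

```rust
use core::marker::PhantomData;
use core::ptr::NonNull;

/// Hypothetical stand-in for the driver-side writer state.
struct WriterContext {
    ram: Vec<u32>,
}

/// Simplified stand-in for the `EncoderExt` trait from the commit message.
trait EncoderExt {
    fn write(&mut self, ctx: &mut WriterContext);
}

/// Thin replacement for `&mut dyn EncoderExt`: the data pointer plus the one
/// function we need, instead of a pointer to a whole vtable.
struct EncoderRef<'a> {
    data: NonNull<()>,
    write_fn: unsafe fn(NonNull<()>, &mut WriterContext),
    // Ties the erased pointer to the exclusive borrow it was created from.
    _borrow: PhantomData<&'a mut ()>,
}

impl<'a> EncoderRef<'a> {
    fn new<E: EncoderExt>(enc: &'a mut E) -> Self {
        /// Monomorphized per encoder type; restores the erased type and calls it.
        unsafe fn call<T: EncoderExt>(data: NonNull<()>, ctx: &mut WriterContext) {
            // SAFETY: the caller guarantees `data` came from a live `&mut T`.
            let enc = unsafe { &mut *data.cast::<T>().as_ptr() };
            enc.write(ctx);
        }
        Self {
            data: NonNull::from(enc).cast(),
            write_fn: call::<E>,
            _borrow: PhantomData,
        }
    }

    fn write(&mut self, ctx: &mut WriterContext) {
        // SAFETY: `data` and `write_fn` were created together in `new` for the
        // same type, and `'a` keeps the original exclusive borrow alive.
        unsafe { (self.write_fn)(self.data, ctx) }
    }
}

/// Example encoder that sends n, n-1, ..., 1.
struct Countdown(u32);

impl EncoderExt for Countdown {
    fn write(&mut self, ctx: &mut WriterContext) {
        while self.0 > 0 {
            ctx.ram.push(self.0);
            self.0 -= 1;
        }
    }
}

fn main() {
    let mut enc = Countdown(3);
    let mut ctx = WriterContext { ram: Vec::new() };
    EncoderRef::new(&mut enc).write(&mut ctx);
    println!("ram = {:?}", ctx.ram);
}
```

Note that this `EncoderRef` is the same size as a `&mut dyn` fat pointer (two words); what it saves is the extra load through the vtable on each call.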

RMT: changelog & migration guide for Encoder

a2e2ce1

fix for MSRV 1.88

36c5264

NonNull::from_mut is new in 1.89

bugadani reviewed Dec 2, 2025

wisp3rwind force-pushed the rmt-encoder-v3 branch from b04afae to 36c5264 Compare December 2, 2025 21:04

bugadani added the trusted-author Allow the author of this Pull Request to run HIL tests and the `binary-size` test. label Dec 2, 2025

wisp3rwind mentioned this pull request Dec 3, 2025

RMT driver tracking issue #3930

Open

12 tasks
