
Optimise read buf initialization performance#524

Open
alexheretic wants to merge 3 commits into snapview:master from alexheretic:init-aware-read-buf

Conversation


@alexheretic (Contributor) commented Nov 23, 2025

Add new InitAwareBuf wrapper logic that optimises repetitive zero-initialization of the read buffer when receiving messages. It particularly improves performance with a larger-than-default read_buffer_size.

Also see previous analysis.

This optimisation works by having InitAwareBuf(BytesMut) keep track of how much of the spare capacity has previously been initialised. This means that across resize + read + truncate cycles we only need to actually zero each region of uninitialized bytes once.
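The mechanism can be sketched roughly as follows. This is a simplified, hypothetical illustration over a plain Vec<u8> (the PR itself wraps bytes::BytesMut); the field and method names here are invented for the sketch, and the key trick is the unsafe set_len over memory that was already zeroed on an earlier cycle:

```rust
// Hypothetical sketch of the init-tracking idea (not the PR's actual code).
struct InitAwareBuf {
    bytes: Vec<u8>,
    // Number of bytes beyond `bytes.len()` that are known to be
    // initialised within the current allocation.
    initialized: usize,
}

impl InitAwareBuf {
    fn new() -> Self {
        Self { bytes: Vec::new(), initialized: 0 }
    }

    /// Grow the visible length to `new_len` for a read, zeroing only
    /// bytes that have never been initialised before.
    fn resize_for_read(&mut self, new_len: usize) {
        if new_len > self.bytes.capacity() {
            // A reallocation yields fresh, uninitialised spare capacity,
            // so the tracked region no longer applies.
            self.initialized = 0;
        }
        let init_end = self.bytes.len() + self.initialized;
        if new_len <= init_end {
            // Fast path: the whole region was initialised earlier.
            // SAFETY: all bytes below `init_end` (<= capacity) are
            // initialised in the current allocation.
            unsafe { self.bytes.set_len(new_len) };
            self.initialized = init_end - new_len;
        } else {
            // Skip over the already-initialised prefix, then let
            // `resize` zero only the genuinely fresh tail.
            // SAFETY: as above, bytes below `init_end` are initialised.
            unsafe { self.bytes.set_len(init_end) };
            self.bytes.resize(new_len, 0);
            self.initialized = 0;
        }
    }

    /// Shrink to `len` filled bytes; the cut-off tail stays initialised.
    fn truncate(&mut self, len: usize) {
        if len < self.bytes.len() {
            self.initialized += self.bytes.len() - len;
            self.bytes.truncate(len);
        }
    }
}

fn main() {
    let mut buf = InitAwareBuf::new();
    buf.resize_for_read(8); // zeroes 8 fresh bytes
    buf.bytes[..4].copy_from_slice(b"abcd");
    buf.truncate(4);        // the 4 cut-off bytes stay initialised
    buf.resize_for_read(8); // fast path: no re-zeroing needed
    assert_eq!(&buf.bytes[..4], b"abcd");
}
```

On the second resize_for_read the fast path exposes the already-initialised tail via set_len, which is exactly the repeated zeroing the wrapper avoids.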

Alternatives

I didn't find many better options than this custom optimisation wrapping BytesMut. I asked upstream and they don't plan on providing anything for this.

Also related if/when we get rust-lang/rust#78485 we can probably switch to using that and remove the InitAwareBuf wrapper.

Benchmarks

Benchmarks using the default 128KiB read buffer don't really change. The improvement is clear though if we set a large, e.g. 8MiB, read_buffer_size as this amplifies the amount of zeroing the current logic does.

So we could see this as a kind of performance fix for larger buffers. In theory it should optimise the default 128KiB buffer for small messages too, I just don't see it come across in our current benches.

Default read buffer (128KiB)

No noticeable difference.

group                init-aware-buf2                        master
-----                ---------------                        ------
send+recv/512 B      1.00     12.8±0.16µs    76.5 MB/sec    1.00     12.8±0.03µs    76.2 MB/sec
send+recv/4 KiB      1.00     14.7±0.21µs   529.8 MB/sec    1.02     15.1±0.38µs   518.9 MB/sec
send+recv/32 KiB     1.07     28.2±0.18µs     2.2 GB/sec    1.00     26.3±0.19µs     2.3 GB/sec
send+recv/256 KiB    1.08    115.3±0.13µs     4.2 GB/sec    1.00    106.8±0.97µs     4.6 GB/sec
send+recv/2 MiB      1.00   937.5±36.06µs     4.2 GB/sec    1.00   940.9±29.48µs     4.2 GB/sec
send+recv/16 MiB     1.09     15.4±0.42ms     2.0 GB/sec    1.00     14.2±0.41ms     2.2 GB/sec
send+recv/128 MiB    1.00    196.7±0.66ms  1301.6 MB/sec    1.00    197.2±7.01ms  1298.3 MB/sec
send+recv/1 GiB      1.07  1344.7±50.32ms  1523.1 MB/sec    1.00  1262.0±60.64ms  1622.9 MB/sec

8MiB read buffer

A significant improvement fixing the performance regression of using larger buffers.

group                init-aware-buf2-8mb                    master-8mb
-----                -------------------                    ----------
send+recv/512 B      1.00     12.5±0.25µs    77.9 MB/sec    6.12     76.6±2.42µs    12.7 MB/sec
send+recv/4 KiB      1.00     15.1±0.06µs   518.9 MB/sec    2.97     44.7±0.06µs   174.9 MB/sec
send+recv/32 KiB     1.00     26.8±0.17µs     2.3 GB/sec    1.91     51.2±1.27µs  1221.2 MB/sec
send+recv/256 KiB    1.00    120.6±1.32µs     4.0 GB/sec    1.55    187.0±2.86µs     2.6 GB/sec
send+recv/2 MiB      1.00  1125.5±33.24µs     3.5 GB/sec    1.05   1177.8±4.16µs     3.3 GB/sec
send+recv/16 MiB     1.06     14.5±0.09ms     2.2 GB/sec    1.00     13.7±0.07ms     2.3 GB/sec
send+recv/128 MiB    1.00    189.9±0.66ms  1347.8 MB/sec    1.04    197.7±2.72ms  1294.6 MB/sec
send+recv/1 GiB      1.00  1286.7±26.05ms  1591.6 MB/sec    1.00  1285.7±26.50ms  1592.9 MB/sec

Comment on lines 112 to 140
impl AsRef<[u8]> for InitAwareBuf {
    #[inline]
    fn as_ref(&self) -> &[u8] {
        &self.bytes
    }
}

impl Deref for InitAwareBuf {
    type Target = [u8];

    #[inline]
    fn deref(&self) -> &[u8] {
        &self.bytes
    }
}

impl AsMut<[u8]> for InitAwareBuf {
    #[inline]
    fn as_mut(&mut self) -> &mut [u8] {
        &mut self.bytes
    }
}

impl DerefMut for InitAwareBuf {
    #[inline]
    fn deref_mut(&mut self) -> &mut [u8] {
        &mut self.bytes
    }
}

@paolobarbolini commented Nov 23, 2025


I came from the bytes issue. It's taking me some time to review this because of the manual slicing that happens in the other files. I'm not a fan of them, because in a way they leak implementation internals and leave the other modules to do the slicing by themselves. Although it's an internal API, it doesn't feel robust.

Why not copy the read_buf API (both the initial one which you can see in tokio, and the current one in std) by having methods like:

impl InitAwareBuf {
    // the region of memory that contains user data
    pub fn filled(&self) -> &[u8] {}

    // the region of memory that does not contain user data, but has been initialized
    pub fn init_mut(&mut self) -> &mut [u8] {}

    // mark `filled_len` bytes, that were written into the slice returned by `init_mut`, as filled
    pub fn advance_mut(&mut self, filled_len: usize) {}
}
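For reference, the suggested methods could be filled in roughly like this over a plain Vec<u8> that is kept fully initialised. This is a hypothetical sketch of the proposed API shape, not code from the PR (which tracks initialisation over BytesMut spare capacity):

```rust
// Hypothetical sketch of the suggested read_buf-style API.
struct InitAwareBuf {
    bytes: Vec<u8>, // fully initialised up to `bytes.len()`
    filled: usize,  // bytes [0..filled] contain user data
}

impl InitAwareBuf {
    fn with_capacity(n: usize) -> Self {
        Self { bytes: vec![0; n], filled: 0 }
    }

    /// The region of memory that contains user data.
    fn filled(&self) -> &[u8] {
        &self.bytes[..self.filled]
    }

    /// The region that holds no user data yet but has been initialised,
    /// so callers may write into it directly.
    fn init_mut(&mut self) -> &mut [u8] {
        &mut self.bytes[self.filled..]
    }

    /// Mark `filled_len` bytes, written via `init_mut`, as filled.
    fn advance_mut(&mut self, filled_len: usize) {
        assert!(filled_len <= self.bytes.len() - self.filled);
        self.filled += filled_len;
    }
}

fn main() {
    let mut buf = InitAwareBuf::with_capacity(8);
    // A reader would write into `init_mut` and report how much it wrote.
    buf.init_mut()[..4].copy_from_slice(b"data");
    buf.advance_mut(4);
    assert_eq!(buf.filled(), b"data");
}
```

With this shape the other modules never slice the buffer themselves; they only see the filled/unfilled split the wrapper maintains.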

@alexheretic (author) replied:


I envisioned the wrapper as transparent bytes but with extra info, the initialised capacity. Slice access, even mut, is fine as it doesn't grow or shrink the buffer.

This style simplifies the overall change, as usage of the buf (previously a plain BytesMut) is largely unchanged.

We just need to ensure the new wrapper itself is sound.

@alexheretic (author)

I updated the benchmarks in the description "init-aware-buf2" to reflect the reworked implementation. The conclusions are the same.

@daniel-abramov (Member) left a comment


Thanks @alexheretic for working on these optimizations. And thanks @paolobarbolini for reviewing it!

I have not spotted any issues so far. The only thing I'm wondering about is whether the additional complexity results in noticeable performance improvements: judging by the benchmarks, there is a clear improvement when the read buffer size is set to 8 MiB, but at the same time increasing the buffer size to 8 MiB does not seem very useful even for the 1 GiB benchmark, because the improved buffer with an 8 MiB read buffer performs similarly to master with the default read buffer size.

P.S.: Btw, I updated the rust-version in master so that CI/CD does not fail. You might want to rebase :)

alexheretic and others added 3 commits January 12, 2026 15:19
Particularly improves large read_buffer_size performance
Co-authored-by: Daniel Abramov <inetcrack2@gmail.com>
@alexheretic (author)

I see that while there is a noticeable performance improvement when the read buffer size is set to 8 MiB, it looks like increasing the buffer size to 8 MiB is not really that useful even for the 1 GiB benchmark, because the improved buffer with an 8 MiB read buffer has similar performance to the master buffer with the default read buffer size.

Yes, I agree with this analysis. It is also partly why I didn't rush to make this optimisation initially. The optimisation itself does make sense, but we're missing a compelling use case for it. Considering the added complexity, I'm ok with keeping this PR unmerged until there is a better use case that benefits.

On the other hand, perhaps fixing perf for large configured buffers is desirable on its own. Or perhaps it could be later, if we figure out why large-message perf is worse than 256 KiB message perf.

I'm ok with whatever you want to do with this.
