Streaming base64 encode/decode by ThePseudo · Pull Request #8622 · uutils/coreutils

ThePseudo · 2025-09-12T08:14:06Z

On the main branch, the encode and decode operations look at the file ahead-of-time to gather information about padding. However, padding only appears at the end, and the rest of the file can be encoded and decoded disregarding the padding.

The main issue with reading the file ahead of time is that the entire input must be available from the beginning. This conflicts with streaming use cases: imagine a web socket where the sender emits base64-encoded data, but the receiver can only decode it once the stream has ended, which makes real-time communication impossible.

Moreover, reading the entire file up front means that it has to stay in RAM the whole time. For smaller files this is not a problem, but when encoding a file of a few gigabytes to base64 it can easily saturate main memory.

This patch aims to solve the issue of ahead-of-time reading. First, we do not check for padding up front, but let the decoder work for us: as said earlier, most of the encoded file does not have padding, and there is a 1/3 probability that there is no padding at the end either (when the original input length is a multiple of 3). The STANDARD_NO_PAD base64 decoder produces an error if padding is present; in that case, we fall back to the STANDARD base64 decoder. This removes the need to know about the padding ahead of time.
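To make the decoding idea concrete, here is a minimal sketch that uses only the standard library. It is not the code in this PR: instead of trying STANDARD_NO_PAD and falling back to STANDARD from base64_simd, it uses a small hand-written decoder, and the names `decode_stream`, `sextet` and `invalid` are made up for this example. What it shows is the same principle: complete 4-symbol groups are decoded as soon as they arrive, and padding is only looked at once the stream has ended.

```rust
use std::io::{self, Read, Write};

/// Map one character of the standard base64 alphabet to its 6-bit value.
fn sextet(c: u8) -> Option<u8> {
    match c {
        b'A'..=b'Z' => Some(c - b'A'),
        b'a'..=b'z' => Some(c - b'a' + 26),
        b'0'..=b'9' => Some(c - b'0' + 52),
        b'+' => Some(62),
        b'/' => Some(63),
        _ => None,
    }
}

fn invalid(msg: &str) -> io::Error {
    io::Error::new(io::ErrorKind::InvalidData, msg)
}

/// Decode base64 from `input` to `output` with a fixed-size read buffer.
/// Memory use does not depend on the input size.
pub fn decode_stream<R: Read, W: Write>(mut input: R, mut output: W) -> io::Result<()> {
    let mut buf = [0u8; 8192];
    let mut group = [0u8; 4]; // symbols of the group currently being filled
    let mut filled = 0usize; // how many symbols of `group` are valid
    let mut padding = 0usize; // number of '=' seen so far
    loop {
        let n = input.read(&mut buf)?;
        if n == 0 {
            break; // end of stream
        }
        let mut out = Vec::with_capacity(n / 4 * 3 + 3);
        for &c in &buf[..n] {
            if c == b'\n' || c == b'\r' {
                continue; // line wrapping carries no data
            }
            if c == b'=' {
                padding += 1;
                continue;
            }
            if padding > 0 {
                return Err(invalid("data after padding"));
            }
            group[filled] = sextet(c).ok_or_else(|| invalid("invalid symbol"))?;
            filled += 1;
            if filled == 4 {
                // A complete group never depends on what comes later.
                out.push((group[0] << 2) | (group[1] >> 4));
                out.push((group[1] << 4) | (group[2] >> 2));
                out.push((group[2] << 6) | group[3]);
                filled = 0;
            }
        }
        output.write_all(&out)?;
    }
    // Only now does padding matter; padded and unpadded tails are both accepted.
    let mut tail = Vec::new();
    match (filled, padding) {
        (0, 0) => {}
        (2, 0) | (2, 2) => tail.push((group[0] << 2) | (group[1] >> 4)),
        (3, 0) | (3, 1) => {
            tail.push((group[0] << 2) | (group[1] >> 4));
            tail.push((group[1] << 4) | (group[2] >> 2));
        }
        _ => return Err(invalid("truncated or wrongly padded input")),
    }
    output.write_all(&tail)
}
```

Calling `decode_stream(std::io::stdin().lock(), std::io::stdout().lock())` would decode a pipe as data arrives. The sketch ignores details that the real tool has to handle, such as `--ignore-garbage`.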

Also, please note that the encoder does not need any ahead-of-time knowledge of padding, since it is the encoder itself that generates it.
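The encoder side can be sketched in the same way. Again, this is an illustration with the standard library only, not the base64_simd-based code of the PR, and `encode_stream` and `encode_group` are hypothetical names. Every complete 3-byte group is encoded immediately; at most two bytes are carried over between reads, and padding is produced only for the last group, so the output ends without padding exactly when the total input length is a multiple of 3.

```rust
use std::io::{self, Read, Write};

const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Encode one group of 1 to 3 bytes, padding with '=' if it is short.
fn encode_group(group: &[u8], out: &mut Vec<u8>) {
    let b0 = group[0];
    let b1 = group.get(1).copied().unwrap_or(0);
    let b2 = group.get(2).copied().unwrap_or(0);
    out.push(ALPHABET[(b0 >> 2) as usize]);
    out.push(ALPHABET[(((b0 & 0x03) << 4) | (b1 >> 4)) as usize]);
    out.push(if group.len() > 1 {
        ALPHABET[(((b1 & 0x0f) << 2) | (b2 >> 6)) as usize]
    } else {
        b'='
    });
    out.push(if group.len() > 2 {
        ALPHABET[(b2 & 0x3f) as usize]
    } else {
        b'='
    });
}

/// Encode `input` to base64 on `output` (no line wrapping) with a
/// fixed-size buffer; only the final group can be padded.
pub fn encode_stream<R: Read, W: Write>(mut input: R, mut output: W) -> io::Result<()> {
    let mut buf = [0u8; 3 * 1024];
    let mut carry: Vec<u8> = Vec::with_capacity(3); // 0..=2 leftover bytes
    loop {
        let n = input.read(&mut buf)?;
        if n == 0 {
            break; // end of stream
        }
        let mut out = Vec::with_capacity(n / 3 * 4 + 8);
        let mut data = &buf[..n];
        // First complete the group left over from the previous read.
        while !carry.is_empty() && carry.len() < 3 && !data.is_empty() {
            carry.push(data[0]);
            data = &data[1..];
        }
        if carry.len() == 3 {
            encode_group(&carry, &mut out);
            carry.clear();
        }
        // Encode all complete groups of this read; keep the rest for later.
        let whole = data.len() / 3 * 3;
        for group in data[..whole].chunks_exact(3) {
            encode_group(group, &mut out);
        }
        carry.extend_from_slice(&data[whole..]);
        output.write_all(&out)?;
    }
    if !carry.is_empty() {
        let mut out = Vec::with_capacity(4);
        encode_group(&carry, &mut out); // the only place padding is emitted
        output.write_all(&out)?;
    }
    Ok(())
}
```

The carry buffer is needed because a `read` call may return any number of bytes, not necessarily a multiple of 3.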

As this is partly a performance-related patch, I will paste the hyperfine analysis. In the benchmarks:
./coreutils base64 refers to the version from this PR
./coreutils_main_branch base64 refers to the version that is on the main branch
base64 refers to GNU Coreutils base64

For encoding:

Benchmark 1: ./coreutils base64 model-00001-of-000163.safetensors
  Time (mean ± σ):      2.423 s ±  0.039 s    [User: 0.997 s, System: 1.424 s]
  Range (min … max):    2.393 s …  2.524 s    10 runs
 
Benchmark 2: ./coreutils_main_branch base64 model-00001-of-000163.safetensors
  Time (mean ± σ):      4.111 s ±  0.035 s    [User: 1.172 s, System: 2.937 s]
  Range (min … max):    4.052 s …  4.158 s    10 runs
 
Benchmark 3: base64 model-00001-of-000163.safetensors
  Time (mean ± σ):      4.000 s ±  0.016 s    [User: 3.054 s, System: 0.941 s]
  Range (min … max):    3.976 s …  4.033 s    10 runs
 
Summary
  ./coreutils base64 model-00001-of-000163.safetensors ran
    1.65 ± 0.03 times faster than base64 model-00001-of-000163.safetensors
    1.70 ± 0.03 times faster than ./coreutils_main_branch base64 model-00001-of-000163.safetensors

For decoding:

Benchmark 1: ./coreutils base64 -d base64.txt
  Time (mean ± σ):      9.442 s ±  0.060 s    [User: 7.622 s, System: 1.814 s]
  Range (min … max):    9.373 s …  9.580 s    10 runs
 
Benchmark 2: ./coreutils_main_branch base64 -d base64.txt
  Time (mean ± σ):      9.504 s ±  0.201 s    [User: 5.766 s, System: 3.727 s]
  Range (min … max):    9.309 s …  9.882 s    10 runs
 
Benchmark 3: base64 -d base64.txt
  Time (mean ± σ):      8.362 s ±  0.140 s    [User: 6.750 s, System: 1.605 s]
  Range (min … max):    8.155 s …  8.527 s    10 runs
 
Summary
  base64 -d base64.txt ran
    1.13 ± 0.02 times faster than ./coreutils base64 -d base64.txt
    1.14 ± 0.03 times faster than ./coreutils_main_branch base64 -d base64.txt

For memory consumption, I used ps and grep on the three implementations while they worked on the same file, and I put the three values next to each other for comparison. I report the entire ps line, since it contains no sensitive information.

This approach is feasible because the memory footprint remains stable during program execution: after the file is loaded or the memory is allocated, no more large allocations take place (except, maybe, inside the fast_encoder/decoder in the base64_simd crate, as shown by the flamegraph tool, which also generates an SVG to explore; image at the end of this PR).

For encoding:

andrea    167746  100  0.0  15880  6616 pts/6    R+   10:08   0:01 ./coreutils base64 model-00001-of-000163.safetensors
andrea    168813  102  6.1 5127348 1894336 pts/6 R+   10:10   0:00 ./coreutils_main_branch base64 model-00001-of-000163.safetensors
andrea    169415  100  0.0   8392  2272 pts/6    R+   10:11   0:02 base64 model-00001-of-000163.safetensors

For decoding:

andrea    164864  100  0.0  15876  6288 pts/6    R+   10:01   0:01 ./coreutils base64 -d base64.txt
andrea    165735  125  0.7 6920844 233384 pts/6  R+   10:03   0:00 ./coreutils_main_branch base64 -d base64.txt
andrea    166374  100  0.0   8388  2208 pts/6    R+   10:05   0:03 base64 -d base64.txt

The issue we still have is that memory usage is double that of the GNU Coreutils implementation, but it no longer grows with the size of the file.

Malloc inside base64_simd:

github-actions · 2025-09-12T08:35:19Z

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)

sylvestre · 2025-09-12T12:33:30Z

Could you please share your example file? I don't get the same results

ThePseudo · 2025-09-12T13:23:36Z

Uhm it is almost 5 GB large... maybe I can try with a smaller one? What do you suggest?

Never mind, I found it online again... it is one of the DeepSeek models, which are available here: https://huggingface.co/deepseek-ai/DeepSeek-V3/tree/main

Probably a good option is selecting this one: https://huggingface.co/deepseek-ai/DeepSeek-V3/resolve/main/model-00001-of-000163.safetensors?download=true

It is roughly the same size

github-actions · 2025-09-15T07:39:39Z

GNU testsuite comparison:

Skip an intermittent issue tests/misc/stdbuf (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)

ThePseudo · 2025-09-15T10:01:56Z

I re-ran the tests with the file linked above:

For encoding:

Benchmark 1: ./coreutils base64 model-00001-of-000163.safetensors
  Time (mean ± σ):      2.152 s ±  0.066 s    [User: 0.952 s, System: 1.199 s]
  Range (min … max):    2.092 s …  2.301 s    10 runs
 
Benchmark 2: ./coreutils_main_branch base64 model-00001-of-000163.safetensors
  Time (mean ± σ):      3.759 s ±  0.119 s    [User: 1.140 s, System: 2.619 s]
  Range (min … max):    3.616 s …  3.976 s    10 runs
 
Benchmark 3: base64 model-00001-of-000163.safetensors
  Time (mean ± σ):      3.723 s ±  0.032 s    [User: 3.044 s, System: 0.679 s]
  Range (min … max):    3.687 s …  3.783 s    10 runs
 
Summary
  ./coreutils base64 model-00001-of-000163.safetensors ran
    1.73 ± 0.05 times faster than base64 model-00001-of-000163.safetensors
    1.75 ± 0.08 times faster than ./coreutils_main_branch base64 model-00001-of-000163.safetensors

For decoding:

Benchmark 1: ./coreutils base64 -d base64.txt
  Time (mean ± σ):      9.167 s ±  0.101 s    [User: 7.637 s, System: 1.499 s]
  Range (min … max):    9.063 s …  9.347 s    10 runs
 
Benchmark 2: ./coreutils_main_branch base64 -d base64.txt
  Time (mean ± σ):      9.329 s ±  0.020 s    [User: 5.620 s, System: 3.669 s]
  Range (min … max):    9.301 s …  9.380 s    10 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Benchmark 3: base64 -d base64.txt
  Time (mean ± σ):      8.038 s ±  0.037 s    [User: 6.471 s, System: 1.536 s]
  Range (min … max):    7.991 s …  8.104 s    10 runs
 
Summary
  base64 -d base64.txt ran
    1.14 ± 0.01 times faster than ./coreutils base64 -d base64.txt
    1.16 ± 0.01 times faster than ./coreutils_main_branch base64 -d base64.txt

The system is also under some load, so it might be slower than usual, but the results are more or less consistent with what I reported before. Please let me know if you see any difference.

github-actions · 2025-09-16T06:28:53Z

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)

github-actions · 2025-09-17T07:33:20Z

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)

github-actions · 2025-09-17T09:44:26Z

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

github-actions · 2025-09-18T07:48:15Z

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)

github-actions · 2025-09-19T14:06:05Z

GNU testsuite comparison:

Skipping an intermittent issue tests/misc/stdbuf (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

codspeed-hq · 2025-09-20T18:29:49Z

CodSpeed Performance Report

Merging #8622 will not alter performance

Comparing ThePseudo:streamline_b64_decode (c3c77fd) with main (9225670)

Summary

✅ 123 untouched
⏩ 5 skipped¹

¹ 5 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, they can be archived to remove them from the performance reports.

github-actions · 2025-09-20T18:44:50Z

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)

src/uu/base32/src/base_common.rs

github-actions · 2025-09-22T06:57:19Z

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)

github-actions · 2025-09-22T08:23:47Z

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)

github-actions · 2025-09-22T12:55:00Z

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)

github-actions · 2025-09-23T06:28:42Z

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/misc/stdbuf (passes in this run but fails in the 'main' branch)

github-actions · 2025-09-24T07:44:22Z

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/tail/overlay-headers is no longer failing!

github-actions · 2025-10-13T12:00:06Z

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)

github-actions · 2025-10-16T06:41:36Z

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)

github-actions · 2025-10-17T10:05:40Z

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)

github-actions · 2025-10-21T08:48:25Z

GNU testsuite comparison:

Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)

src/uucore/src/lib/features/encoding.rs

github-actions · 2025-10-21T12:14:44Z

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)

ThePseudo · 2025-10-22T10:58:39Z

The issue with base58 not supporting streaming is solved by the latest commit. We need to provide some information about whether an algorithm supports streaming, so I modified the trait responsible for managing the chunk size, and this now works properly.
The other algorithms support streaming, and even base58 decoding supports it, so that part stays unchanged (except that there is now also logic to support the non-streaming variant).
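As a rough illustration of what "information about whether the algorithm supports streaming" can look like, here is a small sketch. The trait `EncodingInfo`, the enum `ReadPlan` and the function `plan_reads` are hypothetical names for this comment and are not the actual trait in uucore; the block sizes (3 input bytes for base64, 5 for base32) follow from the encodings themselves.

```rust
/// Hypothetical trait for illustration; not the actual uucore API.
pub trait EncodingInfo {
    /// Input bytes per independently encodable block (must be non-zero),
    /// or `None` when the encoder needs the whole input before it can
    /// produce any output.
    fn encode_block_size(&self) -> Option<usize>;
}

pub struct Base64;
pub struct Base32;
pub struct Base58;

impl EncodingInfo for Base64 {
    fn encode_block_size(&self) -> Option<usize> {
        Some(3) // 3 bytes -> 4 symbols
    }
}

impl EncodingInfo for Base32 {
    fn encode_block_size(&self) -> Option<usize> {
        Some(5) // 5 bytes -> 8 symbols
    }
}

impl EncodingInfo for Base58 {
    fn encode_block_size(&self) -> Option<usize> {
        None // encoding is not streamable, per the discussion above
    }
}

/// How the driver loop should read its input.
#[derive(Debug, PartialEq, Eq)]
pub enum ReadPlan {
    /// Read chunks of this many bytes (a multiple of the block size).
    Chunked(usize),
    /// Read everything first, then encode once.
    WholeInput,
}

/// Pick a chunk size close to `buffer_hint` that keeps chunks block-aligned,
/// or fall back to reading the whole input for non-streaming algorithms.
pub fn plan_reads(enc: &dyn EncodingInfo, buffer_hint: usize) -> ReadPlan {
    match enc.encode_block_size() {
        Some(block) => {
            // At least one block, otherwise the largest multiple that fits.
            let blocks = (buffer_hint / block).max(1);
            ReadPlan::Chunked(blocks * block)
        }
        None => ReadPlan::WholeInput,
    }
}
```

With a hint of 8192 bytes, both base64 and base32 would read 8190 bytes at a time in this sketch, while base58 encoding would take the non-streaming path.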

github-actions · 2025-10-23T07:07:17Z

GNU testsuite comparison:

Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

github-actions · 2025-11-05T07:36:08Z

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)

This should remove the dependency on knowing whether the final message has padding or not. This is the first step towards avoiding ahead-of-time loading of the entire message to encode/decode, and allowing streaming. Signed-off-by: Andrea Calabrese <andrea.calabrese@amarulasolutions.com>

As per title, this is the main feature of this patch set. First, by not looking for the final padding, data can be read as it streams in, before the stream has finished producing it. This also lets the tool work with much less memory, essentially a fixed amount instead of one depending on the file size. Signed-off-by: Andrea Calabrese <andrea.calabrese@amarulasolutions.com>

github-actions · 2025-11-11T08:51:36Z

GNU testsuite comparison:

GNU test failed: tests/basenc/base64. tests/basenc/base64 is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/basenc/basenc. tests/basenc/basenc is passing on 'main'. Maybe you have to rebase?

sylvestre · 2025-12-26T23:07:50Z

many jobs are still failing, could you please have a look? thanks

ThePseudo marked this pull request as ready for review September 12, 2025 08:17

ThePseudo mentioned this pull request Sep 12, 2025

Base64 does not support streaming data #8625

Open

ThePseudo force-pushed the streamline_b64_decode branch from 8f5d7b1 to c38288b Compare September 15, 2025 07:18

ThePseudo force-pushed the streamline_b64_decode branch from c38288b to 8e0969e Compare September 16, 2025 06:07

ThePseudo force-pushed the streamline_b64_decode branch from 8e0969e to 44147d1 Compare September 17, 2025 07:11

ThePseudo force-pushed the streamline_b64_decode branch from 44147d1 to 05d7d9f Compare September 17, 2025 09:24

ThePseudo force-pushed the streamline_b64_decode branch from 05d7d9f to 1854b91 Compare September 18, 2025 07:27

Nekrolm reviewed Sep 21, 2025

src/uu/base32/src/base_common.rs (outdated, resolved)

ThePseudo force-pushed the streamline_b64_decode branch 2 times, most recently from bcd2ec4 to 1854b91 Compare September 22, 2025 08:01

ThePseudo force-pushed the streamline_b64_decode branch 2 times, most recently from 23bc39f to 1bc46e6 Compare September 22, 2025 12:35

ThePseudo force-pushed the streamline_b64_decode branch from e9882d5 to f60b1b9 Compare September 24, 2025 13:36

ThePseudo force-pushed the streamline_b64_decode branch from 88e3b2b to 9bdac31 Compare October 13, 2025 11:38

ThePseudo force-pushed the streamline_b64_decode branch from 9bdac31 to 87978e0 Compare October 16, 2025 06:16

ThePseudo force-pushed the streamline_b64_decode branch from 87978e0 to 76cb7e6 Compare October 17, 2025 09:45

ThePseudo force-pushed the streamline_b64_decode branch from 76cb7e6 to 8fd5dc2 Compare October 21, 2025 08:26

sylvestre reviewed Oct 21, 2025

src/uucore/src/lib/features/encoding.rs (outdated, resolved)

ThePseudo force-pushed the streamline_b64_decode branch 4 times, most recently from fe89a32 to 76cb42b Compare October 21, 2025 11:51

ThePseudo force-pushed the streamline_b64_decode branch from 76cb42b to bbc2458 Compare October 23, 2025 06:45

ThePseudo force-pushed the streamline_b64_decode branch from bbc2458 to 22f8c10 Compare November 5, 2025 07:15

ThePseudo force-pushed the streamline_b64_decode branch from 22f8c10 to 024c137 Compare November 6, 2025 09:52

Andrea Calabrese added 4 commits November 11, 2025 08:46

Remove Seek from required traits

d38a1f4

We read linearly, so we do not need to seek within a file Signed-off-by: Andrea Calabrese <andrea.calabrese@amarulasolutions.com>

Add support for algorithms that do not support streaming

c7cf018

base58 does not support streaming when encoding. This patch allows base58 and other not-streaming algorithms to work with the new streaming mechanism. Signed-off-by: Andrea Calabrese <andrea.calabrese@amarulasolutions.com>

ThePseudo force-pushed the streamline_b64_decode branch 2 times, most recently from a74141f to 1ed7176 Compare November 11, 2025 08:12

basenc: fix merge issues and removed code

c3c77fd

Signed-off-by: Andrea Calabrese <andrea.calabrese@amarulasolutions.com>

ThePseudo force-pushed the streamline_b64_decode branch from 1ed7176 to c3c77fd Compare November 11, 2025 08:30
