README.md: 11 additions & 6 deletions
@@ -5,18 +5,20 @@ Base64 is a standard approach to represent any binary data as ASCII. It is part
 standard (MIME) and is commonly used to embed data in XML, HTML or JSON files. For example,
 images can be encoded as text using base64. Base64 is also used to represent cryptographic keys.
 
-Our processors have SIMD instructions that are ideally suited to encode and decode base64.
+Our processors have fast instructions (SIMD) that can process blocks of data at once. They are ideally
+suited to encode and decode base64.
+The C# .NET runtime library has fast (SIMD-based) base64 functions[^1] when the input is UTF-8.
+
 Encoding is somewhat easier than decoding. Decoding is a more challenging problem than base64 encoding because
 of the presence of allowable white space characters and the need to validate the input. Indeed, all
 inputs are valid for encoding, but only some inputs are valid for decoding. Having to skip white space
 characters makes accelerated decoding somewhat difficult. We refer to this decoding as WHATWG forgiving-base64 decoding.
 
-The C# standard library has fast (SIMD-based) base64 encoding functions. It also has fast decoding
-functions. Yet these accelerated base64 decoding functions for UTF-8 inputs in the .NET runtime are not optimal:
-we beat them by 1.7 x to 2.3 x on inputs of a few kilobytes by using a novel different algorithm.
-This fast WHATWG forgiving-base64 algorithm is already used in major JavaScript runtimes (Node.js and Bun).
+To handle spaces and validation, we recently designed a faster base64 decoding algorithm. It has been deployed
+in the [simdutf](https://github.com/simdutf/simdutf) C++ library and used in production systems (e.g., the JavaScript runtime systems Node.js and Bun).
+With this new algorithm, we beat the C# .NET runtime functions by 1.7 x to 2.3 x on realistic inputs of a few kilobytes.
 
-A full description of the new algorithm will be published soon. The algorithm is unpatented (free) and we make our
+The algorithm is unpatented (free) and we make our
 C# code available under a liberal open-source licence (MIT).
 
 
@@ -169,3 +171,6 @@ You can convert an integer to a hex string like so: `$"0x{MyVariable:X}"`.
+[^1]: The .NET runtime appears to have received some of its fast SIMD base64 functions from [gfoidl.Base64](https://github.com/gfoidl/Base64), which builds on earlier work by Klomp, Muła and others. See [Faster Base64 Encoding and Decoding using AVX2 Instructions](https://arxiv.org/abs/1704.00605) for a review.