Commit acea3c2

Merge branch 'v.next' of github.com:rusticstuff/simdutf8 into v.next
2 parents af9c4e5 + 945b602 commit acea3c2

1 file changed, +20 -12 lines changed

README.md

Lines changed: 20 additions & 12 deletions
@@ -14,8 +14,8 @@ fuzzing and there are no known bugs.
 ## Features
 * `basic` API for the fastest validation, optimized for valid UTF-8
 * `compat` API as a fully compatible replacement for `std::str::from_utf8()`
-* Up to twenty times faster than the std library on non-ASCII, up to twice as fast on ASCII <-TBD!
-* Up to 28% faster on non-ASCII input compared to the original simdjson implementation on some CPUs
+* Up to 22 times faster than the std library on non-ASCII, up to three times faster on ASCII
+* As fast as or faster than the original simdjson implementation
 * Supports AVX 2 and SSE 4.2 implementations on x86 and x86-64. ARMv7 and ARMv8 neon support is planned
 * Selects the fastest implementation at runtime based on CPU support
 * Written in pure Rust
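
The Features list in this hunk describes `compat` as a fully compatible replacement for `std::str::from_utf8()`. As a rough illustration of what that compatibility has to cover, the sketch below uses only the standard library function and inspects the error details (`valid_up_to()` and `error_len()`) that a replacement would need to reproduce. The helper name `describe` is hypothetical and not part of simdutf8; the assumption, based on the README wording, is that the `compat` validation function could be swapped in for the `str::from_utf8` call.

```rust
use std::str;

/// Hypothetical helper (not part of simdutf8): summarises the result of
/// `std::str::from_utf8()`, the function the `compat` API is meant to replace.
pub fn describe(input: &[u8]) -> String {
    match str::from_utf8(input) {
        // Valid input: report the number of Unicode scalar values.
        Ok(s) => format!("valid, {} chars", s.chars().count()),
        Err(e) => match e.error_len() {
            // An invalid sequence of `len` bytes starts at `valid_up_to()`.
            Some(len) => format!(
                "invalid after {} bytes, bad sequence of {} byte(s)",
                e.valid_up_to(),
                len
            ),
            // The input ended in the middle of a multi-byte sequence.
            None => format!("incomplete after {} byte(s)", e.valid_up_to()),
        },
    }
}
```

The `basic` API, by contrast, is described as optimized for valid input, so a caller that needs the error position would presumably reach for `compat`.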
@@ -75,7 +75,7 @@ For no-std support (compiled with `--no-default-features`) the implementation is
 the targeted CPU. Use `RUSTFLAGS="-C target-feature=+avx2"` for the AVX 2 implementation or `RUSTFLAGS="-C target-feature=+sse4.2"`
 for the SSE 4.2 implementation.
 
-If you want to be able to call A SIMD implementation directly, use the `public_imp` feature flag. The validation
+If you want to be able to call a SIMD implementation directly, use the `public_imp` feature flag. The validation
 implementations are then accessible via `simdutf8::(basic|compat)::imp::x86::(avx2|sse42)::validate_utf8()`.
 
 ## When not to use
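
The context lines of this hunk contrast two ways of choosing an implementation: at runtime based on CPU support (the default) and at compile time via `RUSTFLAGS="-C target-feature=..."` for no-std builds. The sketch below shows how the two kinds of check look in plain Rust. It is not simdutf8's dispatch code; the function names and the returned labels are hypothetical, and only the standard `is_x86_feature_detected!` and `cfg!` macros are used.

```rust
/// Hypothetical helper: names the implementation a runtime dispatcher
/// could pick on the machine the program is running on.
pub fn pick_at_runtime() -> &'static str {
    // The detection macro only exists on x86 and x86-64 targets.
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx2") {
            return "avx2";
        }
        if is_x86_feature_detected!("sse4.2") {
            return "sse4.2";
        }
    }
    "fallback"
}

/// Hypothetical helper: names the implementation fixed at compile time,
/// which depends on the target features passed to the compiler
/// (for example through `RUSTFLAGS="-C target-feature=+avx2"`).
pub fn pick_at_compile_time() -> &'static str {
    if cfg!(target_feature = "avx2") {
        "avx2"
    } else if cfg!(target_feature = "sse4.2") {
        "sse4.2"
    } else {
        "fallback"
    }
}
```

Runtime detection needs the standard library, which is presumably why the no-std build falls back to the compile-time choice.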
@@ -85,26 +85,34 @@ This library uses unsafe code which has not been battle-tested and should not (y
 This crate's minimum supported Rust version is 1.38.0.
 
 ## Benchmarks
-TBD!
 The benchmarks have been done with [criterion](https://bheisler.github.io/criterion.rs/book/index.html), the tables
 are created with [critcmp](https://github.com/BurntSushi/critcmp). Source code and data are in the
 [bench directory](https://github.com/rusticstuff/simdutf8/tree/main/bench).
 
 The name schema is id-charset/size. _0-empty_ is the empty byte slice, _x-error/66536_ is a 64KiB slice where the very
 first character is invalid UTF-8. All benchmarks were run on a laptop with an Intel Core i7-10750H CPU (Comet Lake) on
-Windows with Rust 1.51.0. Library versions are simdutf8 v0.1.0 and simdjson v0.9.2.
+Windows with Rust 1.51.0 if not otherwise stated. Library versions are simdutf8 v0.1.1 and simdjson v0.9.2. When comparing
+with simdjson simdutf8 is compiled with `#inline(never)`.
 
 ### simdutf8 basic vs std library UTF-8 validation
-![critcmp stimdutf8 basic vs std lib](https://raw.githubusercontent.com/rusticstuff/simdutf8/main/img/basic-vs-std.png)
-simdutf8 performs better except for inputs ≤ 64 bytes.
+![critcmp stimdutf8 v0.1.1 basic vs std lib](https://user-images.githubusercontent.com/3736990/116121179-a8271f80-a6c0-11eb-9b2b-6233c3c824f2.png)
+simdutf8 performs better or as well as the std library.
 
-### simdutf8 basic vs simdjson UTF-8 validation
-![critcmp st lib vs stimdutf8 basic](https://raw.githubusercontent.com/rusticstuff/simdutf8/main/img/basic-vs-simdjson.png)
-simdutf8 is faster than simdjson except for some crazy optimization by clang for the pure ASCII
-loop (to be investigated). simdjson is compiled using clang and gcc from MSYS.
+### simdutf8 basic vs simdjson UTF-8 validation on Intel Comet Lake
+![critcmp stimdutf8 v0.1.1 basic vs simdjson WSL](https://user-images.githubusercontent.com/3736990/116121748-38656480-a6c1-11eb-8cb4-385c7516a46a.png)
+simdutf8 beats simdjson on almost all inputs on this CPU. This benchmark is run on
+[WSL](https://docs.microsoft.com/en-us/windows/wsl/install-win10)
+since I could not get simdjson to reach maximum performance on Windows with any C++ toolchain (see also simdjson issues
+[847](https://github.com/simdjson/simdjson/issues/847) and [848](https://github.com/simdjson/simdjson/issues/848)).
+
+### simdutf8 basic vs simdjson UTF-8 validation on AMD Zen 2
+![critcmp stimdutf8 v0.1.1 basic vs simdjson AMD Zen 2](https://user-images.githubusercontent.com/3736990/116122729-731bcc80-a6c2-11eb-82a5-6e297778a1c4.png)
+
+On AMD Zen 2 aligning reads apparently does not matter at all. The extra step for aligning even hurts performance a bit around
+an input size of 4096.
 
 ### simdutf8 basic vs simdutf8 compat UTF-8 validation
-![critcmp st lib vs stimdutf8 basic](https://raw.githubusercontent.com/rusticstuff/simdutf8/main/img/basic-vs-compat.png)
+![image](https://user-images.githubusercontent.com/3736990/116122427-0dc7db80-a6c2-11eb-8434-f9879742d90d.png)
 There is a small performance penalty to continuously checking the error status while processing data, but detecting
 errors early provides a huge benefit for the _x-error/66536_ benchmark.
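
The closing paragraph of this hunk describes a trade-off between checking the error status continuously and checking it once at the end. The sketch below illustrates that trade-off with a deliberately simplified stand-in: a scalar "is this all ASCII?" check over 64-byte chunks instead of real UTF-8 validation. It is not simdutf8's code; the function names and the chunk size are assumptions made for the example.

```rust
/// Chunk size chosen for illustration only.
const CHUNK: usize = 64;

/// Deferred style: fold every byte into one error accumulator and
/// look at it only once, after the whole input has been processed.
pub fn is_ascii_deferred(input: &[u8]) -> bool {
    let mut error = 0u8;
    for chunk in input.chunks(CHUNK) {
        for &b in chunk {
            error |= b & 0x80; // a set high bit marks a non-ASCII byte
        }
    }
    error == 0
}

/// Eager style: look at the accumulator after every chunk and stop at
/// the first failing one, returning the offset at which that chunk starts.
pub fn is_ascii_eager(input: &[u8]) -> Result<(), usize> {
    for (i, chunk) in input.chunks(CHUNK).enumerate() {
        let mut error = 0u8;
        for &b in chunk {
            error |= b & 0x80;
        }
        if error != 0 {
            return Err(i * CHUNK);
        }
    }
    Ok(())
}
```

With an error in the very first chunk, the eager variant returns after 64 bytes while the deferred variant still walks the whole slice, which matches the direction of the _x-error/66536_ result described above. On valid input the eager variant pays for one extra branch per chunk.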