Add LSX accelerated implementation for source file analysis
This patch introduces an LSX-optimized version of `analyze_source_file`
for the `loongarch64` target. Similar to the existing SSE2 implementation
for x86, this version:
- Processes 16-byte chunks at a time using LSX vector intrinsics.
- Quickly identifies newlines in ASCII-only chunks.
- Falls back to the generic implementation when a chunk contains
multi-byte UTF-8 characters, and for the tail portion.
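The chunking strategy listed above can be sketched in portable Rust. This is an illustration only, not the code in this patch: it emulates the 16-lane compare-and-mask steps with a scalar loop instead of LSX intrinsics, the name `newline_line_starts` is hypothetical, and it only records line starts. As a simplification, it hands everything from the first non-ASCII chunk onward to the byte-wise path.

```rust
/// Number of bytes examined per step, matching the 16-byte LSX vector width.
const CHUNK_SIZE: usize = 16;

/// Hypothetical helper (not part of the patch): returns the byte offset of
/// the start of every line that follows a '\n' in `src`.
fn newline_line_starts(src: &str) -> Vec<usize> {
    let bytes = src.as_bytes();
    let mut lines = Vec::new();
    let mut offset = 0;

    // Fast path: whole 16-byte chunks that contain only ASCII.
    while offset + CHUNK_SIZE <= bytes.len() {
        let chunk = &bytes[offset..offset + CHUNK_SIZE];

        // Scalar stand-in for the vector compare + movemask steps:
        // one bit per byte, bit i corresponds to chunk[i].
        let mut non_ascii_mask: u16 = 0;
        let mut newline_mask: u16 = 0;
        for (i, &b) in chunk.iter().enumerate() {
            non_ascii_mask |= ((b >= 0x80) as u16) << i;
            newline_mask |= ((b == b'\n') as u16) << i;
        }

        if non_ascii_mask != 0 {
            // Multi-byte UTF-8 detected: leave the rest to the generic path.
            // (Simplification assumed for this sketch.)
            break;
        }

        // Walk the set bits from lowest to highest; each one is a newline.
        while newline_mask != 0 {
            let idx = newline_mask.trailing_zeros() as usize;
            lines.push(offset + idx + 1);
            newline_mask &= newline_mask - 1; // clear the lowest set bit
        }
        offset += CHUNK_SIZE;
    }

    // Generic path: the tail portion and anything after a non-ASCII chunk.
    for (i, &b) in bytes[offset..].iter().enumerate() {
        if b == b'\n' {
            lines.push(offset + i + 1);
        }
    }
    lines
}
```

For example, `newline_line_starts("ab\ncd\n")` returns `[3, 6]`. That input is shorter than one chunk, so it is handled entirely by the tail path.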
LoongArch64 generic:
```
test analyze_source_file::tests::bench_empty_text ... bench: 35.34 ns/iter (+/- 0.11)
test analyze_source_file::tests::bench_newlines_long ... bench: 2,007.86 ns/iter (+/- 24.88)
test analyze_source_file::tests::bench_newlines_short ... bench: 377.96 ns/iter (+/- 0.58)
```
LoongArch64 LSX:
```
test analyze_source_file::tests::bench_empty_text ... bench: 35.71 ns/iter (+/- 0.13)
test analyze_source_file::tests::bench_newlines_long ... bench: 579.77 ns/iter (+/- 1.28)
test analyze_source_file::tests::bench_newlines_short ... bench: 287.14 ns/iter (+/- 5.71)
```