Avoid string copies. #17

Open
pgavlin wants to merge 1 commit into petar-dambovaliev:master from pgavlin:unsafe-bytes
Conversation

pgavlin (Contributor) commented May 6, 2025

Instead of copying strings to byte slices in the implementation of Iter et al., use unsafe to access the strings' bytes directly. These byte slices are never mutated, so this is safe.


pgavlin (Contributor, Author) commented May 6, 2025

benchstat output:

aho-corasick ❯ benchstat base.txt diff.txt 
goos: darwin
goarch: arm64
pkg: github.com/petar-dambovaliev/aho-corasick
cpu: Apple M4 Max
                                            │   base.txt   │              diff.txt               │
                                            │    sec/op    │   sec/op     vs base                │
Stdlib_StringsReplaceAll/No_matches-16        58.54µ ±  1%   58.49µ ± 1%        ~ (p=0.796 n=10)
Stdlib_StringsReplaceAll/Matches-16           52.24µ ±  1%   53.14µ ± 1%   +1.72% (p=0.002 n=10)
Stdlib_AhoCorasickReplaceAll/No_matches-16    46.15µ ±  2%   43.27µ ± 1%   -6.26% (p=0.000 n=10)
Stdlib_AhoCorasickReplaceAll/Matches-16       55.84µ ±  1%   49.72µ ± 1%  -10.96% (p=0.000 n=10)
AhoCorasick_ReplaceAllDFA-16                  53.12µ ± 18%   47.33µ ± 1%  -10.89% (p=0.000 n=10)
AhoCorasick_ReplaceAllNFA-16                  58.80µ ±  0%   53.02µ ± 1%   -9.84% (p=0.000 n=10)
AhoCorasick_LeftmostInsensitiveWholeWord-16   49.36µ ±  1%   46.24µ ± 1%   -6.32% (p=0.000 n=10)
geomean                                       53.26µ         49.95µ        -6.21%
                                            │   base.txt   │                diff.txt                │
                                            │     B/op     │     B/op      vs base                  │
Stdlib_StringsReplaceAll/No_matches-16        112.0Ki ± 0%   112.0Ki ± 0%        ~ (p=1.000 n=10)
Stdlib_StringsReplaceAll/Matches-16           112.0Ki ± 0%   112.0Ki ± 0%        ~ (p=1.000 n=10) ¹
Stdlib_AhoCorasickReplaceAll/No_matches-16    57456.0 ± 0%     112.0 ± 0%  -99.81% (p=0.000 n=10)
Stdlib_AhoCorasickReplaceAll/Matches-16       180.3Ki ± 0%   124.2Ki ± 0%  -31.09% (p=0.000 n=10)
AhoCorasick_ReplaceAllDFA-16                  181.7Ki ± 0%   125.5Ki ± 0%  -30.90% (p=0.000 n=10)
AhoCorasick_ReplaceAllNFA-16                  181.5Ki ± 0%   125.4Ki ± 0%  -30.92% (p=0.000 n=10)
AhoCorasick_LeftmostInsensitiveWholeWord-16   58240.0 ± 0%     872.0 ± 0%  -98.50% (p=0.000 n=10)
geomean                                       113.2Ki        21.73Ki       -80.81%
¹ all samples are equal
                                            │  base.txt  │               diff.txt               │
                                            │ allocs/op  │ allocs/op   vs base                  │
Stdlib_StringsReplaceAll/No_matches-16        3.000 ± 0%   3.000 ± 0%        ~ (p=1.000 n=10) ¹
Stdlib_StringsReplaceAll/Matches-16           3.000 ± 0%   3.000 ± 0%        ~ (p=1.000 n=10) ¹
Stdlib_AhoCorasickReplaceAll/No_matches-16    3.000 ± 0%   2.000 ± 0%  -33.33% (p=0.000 n=10)
Stdlib_AhoCorasickReplaceAll/Matches-16       23.00 ± 0%   22.00 ± 0%   -4.35% (p=0.000 n=10)
AhoCorasick_ReplaceAllDFA-16                  81.00 ± 0%   77.00 ± 0%   -4.94% (p=0.000 n=10)
AhoCorasick_ReplaceAllNFA-16                  64.00 ± 0%   60.00 ± 0%   -6.25% (p=0.000 n=10)
AhoCorasick_LeftmostInsensitiveWholeWord-16   21.00 ± 0%   19.00 ± 0%   -9.52% (p=0.000 n=10)
geomean                                       13.14        11.95        -9.07%
¹ all samples are equal

petar-dambovaliev (Owner) commented

Hello. Thanks for the PR. I am hesitant to use unsafe here. I will need to think about it.

petar-dambovaliev self-assigned this Sep 1, 2025