Initial GoLang Support #211

Merged: ashvardanian merged 7 commits into main from main-golang on Feb 23, 2025

@ashvardanian ashvardanian commented Feb 23, 2025

Together with @MarkReedZ, we've added basic GoLang bindings to StringZilla, which look surprisingly fast compared to native GoLang strings. We currently use the new cgo annotations available in Go 1.24:

Cgo has gained new capabilities in Go 1.24, supporting new C function annotations to improve runtime performance. Among them, #cgo noescape cFunctionName is used to inform the compiler that the memory passed to cFunctionName will not escape; #cgo nocallback cFunctionName indicates that this C function will not call back any Go functions. In addition, Cgo's inspection of multiple incompatible declarations of C functions has become more stringent. When there are incompatible declarations in different files, errors can be detected and reported in a more timely and accurate manner.

I used an Intel Sapphire Rapids machine on AWS for preliminary testing and benchmarking. I precompiled StringZilla with dynamic dispatch enabled and linked it to the thin GoLang binding layer:

~/StringZilla/golang$ CGO_CFLAGS="-I$(pwd)/../include" \
        CGO_LDFLAGS="-L$(pwd)/../build_golang -lstringzilla_shared" \
        LD_LIBRARY_PATH="$(pwd)/../build_golang:$LD_LIBRARY_PATH" \
        go run ../scripts/bench.go  --input ../leipzig1M.txt --split lines --seed 42

... and compared to native GoLang strings on some key operations:

Benchmarking on `../leipzig1M.txt` with seed 42.
Total input length: 129644797
Total lines: 1000000
Average line length: 128.64
Running benchmark using `testing.Benchmark`.
strings.Contains              :      309           3818144 ns/op
sz.Contains                   :      664           1881251 ns/op
strings.Index                 :      325           3669081 ns/op
sz.Index                      :      624           1990093 ns/op
strings.LastIndex             :       12          85201713 ns/op
sz.LastIndex                  :      494           2306318 ns/op
strings.IndexAny              :  6321228             181.0 ns/op
sz.IndexAny                   : 10608960             112.6 ns/op
strings.Count                 :      156           8015292 ns/op
sz.Count (non-overlap)        :      285           4206698 ns/op
sz.Count (overlap)            :      284           4204370 ns/op

So if you are processing a lot of text in Go, try doing so with StringZilla and stay tuned for the upcoming 4.0 release #201 🥳

@ashvardanian ashvardanian merged commit b818d1e into main Feb 23, 2025
7 checks passed
@ashvardanian ashvardanian deleted the main-golang branch February 23, 2025 16:32