Profile-Guided Optimization (PGO) benchmark report

Hi!

As I have done many times before, I decided to test Profile-Guided Optimization (PGO) to see how it affects the library's performance. For reference, results for other projects are available at https://github.com/zamazan4ik/awesome-pgo . Since PGO helped a lot for many other libraries, I decided to apply it to `wildcard` to see whether a performance win (or loss) can be achieved. Here are my benchmark results.

This information can be interesting for anyone who wants to achieve more performance with the library in their use cases.

## Test environment

* Fedora 40
* Linux kernel 6.10.7
* AMD Ryzen 9 5900X
* 48 GiB RAM
* SSD Samsung 980 Pro 2 TiB
* Compiler - Rustc 1.79.0
* `wildcard` version: `main` branch on commit `280062738de2d3bd9d02e402479352420fc43a54`
* Disabled Turbo boost

## Benchmark

For benchmarking, I use the benchmarks built into the project. For PGO, I use the [cargo-pgo](https://github.com/Kobzol/cargo-pgo) tool. I got the Release results with the `taskset -c 0 cargo bench --workspace --all-features` command. The PGO training phase is done with `taskset -c 0 cargo pgo bench -- --workspace --all-features`, and the PGO optimization phase with `taskset -c 0 cargo pgo optimize bench -- --workspace --all-features`.

`taskset -c 0` is used for reducing the OS scheduler's influence on the results. All measurements are done on the same machine, with the same background "noise" (as much as I can guarantee).
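For convenience, the three runs above can be collected into one script. This is only a sketch: the `run` helper, the `pgo_bench_all` function and the `DRY_RUN` switch are my own illustrative names, not part of cargo-pgo. By default it only prints the commands; set `DRY_RUN=0` to actually execute them.

```shell
#!/bin/sh
# Sketch of the benchmark procedure from this report.
# Prerequisites (per the cargo-pgo README):
#   cargo install cargo-pgo
#   rustup component add llvm-tools-preview
set -eu

# Hypothetical helper: print the command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" != "1" ]; then
        "$@"
    fi
}

# Hypothetical wrapper around the three commands used in the report.
pgo_bench_all() {
    # 1. Baseline: regular Release benchmarks, pinned to core 0.
    run taskset -c 0 cargo bench --workspace --all-features
    # 2. PGO training: instrumented benchmarks that collect profiles.
    run taskset -c 0 cargo pgo bench -- --workspace --all-features
    # 3. PGO optimization: benchmarks rebuilt with the collected profiles.
    run taskset -c 0 cargo pgo optimize bench -- --workspace --all-features
}

pgo_bench_all
```

Run it from the repository root, for example `DRY_RUN=0 sh pgo-bench.sh` (the file name is arbitrary).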

## Results

I got the following results:

* Release: https://gist.github.com/zamazan4ik/49477a4a35928d55c1bb65e97906a3af
* PGO optimized compared to Release: https://gist.github.com/zamazan4ik/8abb076dc4584b4ea3e907848656193b
* (just for reference) PGO instrumented compared to Release: https://gist.github.com/zamazan4ik/32f7706d28e0b1d07fd80cda7f9707ac

According to the results, PGO measurably improves the library's performance in most cases. However, some benchmarks show degradations. I think this is due to conflicts between workloads, so it shouldn't be considered a no-go for PGO (IMHO), but more benchmarks would be appreciated here.

## Further steps

I understand that the steps above can be time-consuming and hard to implement in practice. At the very least, the library's users can find this performance report and decide to enable PGO for their applications if they care about `wildcard` performance in their workloads. Maybe a small note somewhere in the documentation (the README file?) will be enough to raise awareness about this work.
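For users who want to try PGO on their own application that depends on `wildcard`, the usual cargo-pgo flow looks roughly like the sketch below. The binary name `my-app`, the target triple and the `run`/`pgo_build_app` helpers are assumptions for illustration only, and the training run should exercise a workload that is representative for your use case. As written it only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Sketch: enabling PGO for an application with cargo-pgo.
set -eu

# Assumptions: replace with your binary name and target triple.
APP_BIN="${APP_BIN:-my-app}"
TARGET="${TARGET:-x86_64-unknown-linux-gnu}"

# Hypothetical helper: print the command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" != "1" ]; then
        "$@"
    fi
}

pgo_build_app() {
    # 1. Build an instrumented binary.
    run cargo pgo build
    # 2. Run it on a representative workload to collect profiles.
    run "./target/$TARGET/release/$APP_BIN"
    # 3. Rebuild using the collected profiles.
    run cargo pgo optimize
}

pgo_build_app
```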

Please don't treat the issue like an actual issue - it's just a benchmark report (since Discussions are disabled for the repo).

Thank you.
