The fundamental branchless swap_if produces suboptimal code on x86-64. I ported it to Rust and noticed that changing it yielded a 50% performance uplift for that function on Zen 3. This will of course depend on the hardware, but cmov seems to yield better results than the setl/setg-style code that is currently being produced, probably helped by needing 8 instead of 10 instructions.
Here is the current version:
And here is the version that produces cmov code:
- C https://godbolt.org/z/GrTvx1z8x (WIP, only good code gen for clang LLVM)
- Rust https://godbolt.org/z/9qnfY6h3v
I think if you can find a way to reliably produce cmov instructions like LLVM does, you should see a noticeable speed improvement.
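For readers who don't want to open the links, here is a minimal sketch of the kind of select-based formulation that tends to compile to cmov with LLVM. It is not the code behind the godbolt links: the signature of `swap_if` and the `sort2` helper are assumptions made up for illustration, and whether cmov is actually emitted depends on the compiler version, optimization level, and element type.

```rust
/// Hypothetical sketch, not the code from the links above.
/// Swaps `*a` and `*b` when `should_swap` is true, without an explicit branch.
#[inline]
pub fn swap_if<T: Copy>(should_swap: bool, a: &mut T, b: &mut T) {
    // Read both values up front so each store is a plain two-way select.
    let (x, y) = (*a, *b);
    // For small Copy types such as integers, LLVM usually lowers each
    // select to a cmov on x86-64 (an expectation, not a guarantee).
    *a = if should_swap { y } else { x };
    *b = if should_swap { x } else { y };
}

/// Hypothetical helper: order a pair so that `*a <= *b` afterwards.
pub fn sort2(a: &mut u64, b: &mut u64) {
    let out_of_order = *a > *b;
    swap_if(out_of_order, a, b);
}
```

Both stores happen unconditionally, so the comparison result feeds data selection rather than control flow. It is worth checking the generated assembly for the target compiler before relying on this.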