Description
The Computer Language Benchmarks Game came up on the Rust subreddit recently. While checking the numbers for Rust, I noticed that the Rust solution for the fasta benchmark is much slower than the C version, even though the two work fairly similarly (the multithreading in the C version is based on the Rust version). It turned out that the Rust benchmarks are compiled with LTO by default, and when I tested the code on my machine without LTO (on both stable and nightly), it was almost as fast as the C version. I looked for an existing issue, but most of the ones I found are about slow compilation, not slow runtime.
The interesting thing is that the CPU monitor graph clearly shows all CPU cores at only 70% usage during the last part of the benchmark, as if a mutex were being held for too long. I also checked the binary size: it went down from 4.4 MB to 3.1 MB with LTO.
EDIT: I also tested it with the Mutex from parking_lot. It is still slow with LTO, but without LTO it is a tiny bit faster than the C version.
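To check the "mutex held too long" guess independently of the fasta code, a small contention probe might help. The following is only a minimal sketch, not the benchmark itself: `run_contended`, the thread count, and the iteration count are hypothetical names and values chosen for illustration. It sums the time each thread spends waiting for a `std::sync::Mutex`, so the same program can be built once with `lto = true` under `[profile.release]` in Cargo.toml and once without, and the reported wait times compared.

```rust
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical probe: spawns `threads` workers that each increment a shared
/// counter `iters` times. Returns the final counter value and the time spent
/// waiting to acquire the lock, summed across all workers.
fn run_contended(threads: usize, iters: u64) -> (u64, Duration) {
    let counter = Arc::new(Mutex::new(0u64));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                let mut waited = Duration::ZERO;
                for _ in 0..iters {
                    // Measure only the time it takes to get the lock.
                    let start = Instant::now();
                    let mut guard = counter.lock().unwrap();
                    waited += start.elapsed();
                    *guard += 1;
                    // The guard is dropped here, releasing the lock.
                }
                waited
            })
        })
        .collect();

    // Join every worker and add up the per-thread wait times.
    let total_wait: Duration = handles.into_iter().map(|h| h.join().unwrap()).sum();
    let final_value = *counter.lock().unwrap();
    (final_value, total_wait)
}

fn main() {
    // Assumed values; tune them to the machine under test.
    let threads = 4;
    let iters = 100_000;

    let wall = Instant::now();
    let (value, waited) = run_contended(threads, iters);
    println!(
        "counter = {value}, summed lock wait = {waited:?}, wall time = {:?}",
        wall.elapsed()
    );
}
```

If the summed lock wait grows noticeably in the LTO build while the counter stays the same, that would point toward lock handling rather than the per-thread work. A probe this small may not reproduce the slowdown at all, in which case the fasta code itself has to be profiled.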