While implementing a lock-free queue, I noticed that the pop function was about twice as slow with clang as with gcc. After digging through the assembly on Compiler Explorer and reducing it to a minimal example, this seems to be what is happening:
The mfence instruction is much slower. MSVC also generates lock inc DWORD PTR __Guard$1[esp+4] instead of an mfence. I raised this on r/cpp a while ago and was referred to the GCC patch that introduced the optimisation. How can we go about getting something like this into LLVM? I have been using Boost.Atomic, which seems to generate better assembly, but it would be really nice to drop the dependency.
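The reduced example itself is not reproduced in the text above. For anyone wanting to compare compilers, the following is a minimal sketch of the kind of code that exercises a full barrier on x86. It assumes the slow path comes from a sequentially consistent fence, which is a guess based on the mfence and lock-prefixed instructions mentioned, not the original listing. The names `data`, `ready` and `publish_then_read` are hypothetical.

```cpp
#include <atomic>

// Hypothetical shared state standing in for the queue's internals.
std::atomic<int> data{0};
std::atomic<int> ready{0};

// Store, full fence, load: the store-then-load pattern that needs a
// StoreLoad barrier on x86. The fence is the line whose generated
// instruction is worth comparing across compilers.
int publish_then_read(int value) {
    data.store(value, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_seq_cst);  // assumed culprit
    return ready.load(std::memory_order_relaxed);
}
```

Compiling this with optimisation enabled on each compiler and inspecting the instruction emitted for the fence should show whether a given toolchain picks mfence or a lock-prefixed read-modify-write.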