#include <stdint.h>

uint8_t b(uint32_t in)
{
    uint8_t ret = __builtin_clz(in) ^ 31;
    return ret;
}

uint8_t c(uint32_t in)
{
    uint8_t ret = __builtin_clz(in) ^ 31;
    return ret + 1;
}

Expected: Since b optimizes to a single bsr eax, edi, c should optimize to at most one instruction more.
Actual:
b(unsigned int):
    bsr eax, edi
    ret
c(unsigned int):
    bsr ecx, edi
    xor ecx, 31
    mov al, 32
    sub al, cl
    ret
GCC gives good output (probably by being less clever about normalization), as does Clang if I add an extra optimization barrier. https://godbolt.org/z/3ansdnMxa
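For reference, here is a minimal sketch of what such a barrier could look like. This is an assumption on my side, not necessarily the exact code behind the link: the names c_reported, c_with_barrier, c_normalized and check_equivalence are made up for illustration, and the empty inline asm is just one common way to hide a value from the optimizer. The idea is to keep the clz ^ 31 result opaque so it cannot be folded together with the + 1 into 32 - clz, which is what the emitted assembly above appears to compute. Whether this actually yields bsr plus one more instruction depends on the compiler version and flags.

```c
#include <assert.h>
#include <stdint.h>

/* Same computation as c() in the report: index of the highest set bit, plus one.
   Undefined for in == 0, because __builtin_clz(0) is undefined. */
uint8_t c_reported(uint32_t in)
{
    uint8_t ret = __builtin_clz(in) ^ 31;
    return ret + 1;
}

/* Hypothetical workaround (assumption, not taken from the report):
   an empty asm statement with a "+r" operand tells the compiler that msb
   may have been modified, so the xor and the add cannot be combined. */
uint8_t c_with_barrier(uint32_t in)
{
    uint32_t msb = __builtin_clz(in) ^ 31;
    __asm__("" : "+r"(msb));
    return (uint8_t)(msb + 1);
}

/* The arithmetic that the reported assembly seems to implement: 32 - clz(in). */
uint8_t c_normalized(uint32_t in)
{
    return (uint8_t)(32 - __builtin_clz(in));
}

/* All three variants agree for every single-bit input and for inputs with
   all lower bits set; zero is skipped since clz(0) is undefined. */
void check_equivalence(void)
{
    for (int bit = 0; bit < 32; ++bit) {
        uint32_t single = UINT32_C(1) << bit;
        uint32_t filled = single | (single - 1);
        assert(c_reported(single) == bit + 1);
        assert(c_with_barrier(single) == bit + 1);
        assert(c_normalized(single) == bit + 1);
        assert(c_reported(filled) == c_normalized(filled));
        assert(c_with_barrier(filled) == c_normalized(filled));
    }
}
```

The three functions are semantically identical for nonzero inputs, so the difference is purely in which form the backend gets to see when it selects instructions.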