Dear Programmers of Clang,
#include <atomic>
int ButWeAcquire (const int *pt)
{
    const std::atomic <int> *const pat {static_cast <const std::atomic <int> *> (static_cast <const void *> (pt))};
    static_assert (sizeof *pat == sizeof *pt);
    const int t0 {*pt};   // plain load
    const int t1 {*pat};  // atomic (seq_cst) load through the cast pointer
    const int t2 {*pt};   // plain load again
    return t0 + t1 - t2;
}
Clang (but not GCC) seems to optimize away the last load operation (the one for t2); see https://godbolt.org/z/3resvraz9
Compiling with -D'NDEBUG' -f'strict-aliasing' -O2 -g results in, for x86-64:
ButWeAcquire(int const*):
        mov     eax, dword ptr [rdi]
        mov     ecx, dword ptr [rdi]
        ret
or, for armv7-a:
ButWeAcquire(int const*):
        ldr     r1, [r0]
        ldr     r0, [r0]
        mov     r2, #0
        mov     r0, r1
        mcr     p15, #0, r2, c7, c10, #5
        bx      lr
or, for armv8-a:
ButWeAcquire(int const*):
        ldr     w8, [x0]
        ldar    wzr, [x0]
        mov     w0, w8
        ret
It seems to me that the load for t1 should acquire the effects of another thread that performs a release on the same memory address,
and those effects might include stores by that other thread to *pt (that is, to the same memory address).
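For reference, here is a minimal sketch of the release/acquire pairing I have in mind, written with a genuine std::atomic<int> object rather than the pointer cast used above. The names payload, flag, writer, reader and run_once are hypothetical and only for illustration; this shows the well-defined pattern, not the case in question, where the plain and the atomic access hit the same address.

```cpp
#include <atomic>
#include <thread>

// Hypothetical names, not part of the reproducer above.
int payload = 0;               // plain, non-atomic data
std::atomic<int> flag{0};      // atomic object shared by both threads

void writer() {
    payload = 42;                                // ordinary store
    flag.store(1, std::memory_order_release);    // release: publishes the store above
}

int reader() {
    // Acquire load: once it observes 1, the store to payload is visible too.
    while (flag.load(std::memory_order_acquire) != 1) {
        // spin until the writer has published
    }
    return payload;
}

int run_once() {
    payload = 0;
    flag.store(0);
    std::thread t(writer);
    const int seen = reader();
    t.join();
    return seen;
}
```

Here the compiler may not reuse a value of payload read before the acquire load, because the acquire load synchronizes with the release store in writer.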
Am I wrong somewhere?
Perhaps I should have written "... Release on the same atomic object" rather than "... Release on the same memory address", and the point is that the atomic object is not shared with any other piece of code?
Or is this really a bug?
Thank you!