clang misoptimization cached equality of thread local variable #111257

@kelbon

Description

commit where clang 18/19 fails: kelbon/kelcoro@541a84b

Code that fails (without the workaround):

  auto job_creator = [&](std::atomic<int32_t>& value) -> dd::job {
    auto th_id = std::this_thread::get_id();
    (void)co_await dd::jump_on(TP);
    if (th_id == std::this_thread::get_id())
      ++err_c;
    value.fetch_add(1, std::memory_order::release);
    if (value.load(std::memory_order::acquire) == 10)
      value.notify_one();
  };

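TP is a thread pool. The test expects the coroutine to resume on a pool thread after the co_await, so th_id should no longer equal the current thread id; err_c counts the cases where it still does.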

My workaround (with it, the test works as expected):

[[gnu::noinline]] void nocache(auto&) {
}
auto job_creator = [&](std::atomic<int32_t>& value) -> dd::job {
    auto th_id = std::this_thread::get_id();
    nocache(th_id);
    (void)co_await dd::jump_on(TP);
    if (th_id == std::this_thread::get_id())
      ++err_c;
    value.fetch_add(1, std::memory_order::release);
    if (value.load(std::memory_order::acquire) == 10)
      value.notify_one();
};

So this is not the first issue with caching thread-local variables in coroutines. This time I think clang does not cache the value of the variable (the loaded address), but the optimizer still assumes that 'th_id' is equal to the result of get_id() after the co_await.
