Skip to content

Conversation

@ldorau
Copy link
Contributor

@ldorau ldorau commented Apr 14, 2025

Description

Fix create_slab() - set memory pointer after marking it inaccessible.

It is supposed to be a fix for the following data race:
https://github.com/ldorau/unified-memory-framework/actions/runs/14440708601/job/40490018121

WARNING: ThreadSanitizer: data race (pid=274219)
  Write of size 8 at 0x7f1a7201b268 by thread T10 (mutexes: write M0):
    #0 trackingAlloc /home/runner/work/unified-memory-framework/unified-memory-framework/src/provider/provider_tracking.c:474:11 (test_disjoint_pool+0x16375b) (BuildId: 5ab0ee73819682a89fed68fe86e9030a46e89243)
    #1 umfMemoryProviderAlloc /home/runner/work/unified-memory-framework/unified-memory-framework/src/memory_provider.c:245:9 (test_disjoint_pool+0x15ff4e) (BuildId: 5ab0ee73819682a89fed68fe86e9030a46e89243)
    #2 create_slab /home/runner/work/unified-memory-framework/unified-memory-framework/src/pool/pool_disjoint.c:116:11 (test_disjoint_pool+0x16ff06) (BuildId: 5ab0ee73819682a89fed68fe86e9030a46e89243)
    #3 bucket_create_slab /home/runner/work/unified-memory-framework/unified-memory-framework/src/pool/pool_disjoint.c:345:20 (test_disjoint_pool+0x16f5f8) (BuildId: 5ab0ee73819682a89fed68fe86e9030a46e89243)
    #4 bucket_get_avail_slab /home/runner/work/unified-memory-framework/unified-memory-framework/src/pool/pool_disjoint.c:368:9 (test_disjoint_pool+0x16f294) (BuildId: 5ab0ee73819682a89fed68fe86e9030a46e89243)
    #5 bucket_get_free_chunk /home/runner/work/unified-memory-framework/unified-memory-framework/src/pool/pool_disjoint.c:321:33 (test_disjoint_pool+0x16c6d5) (BuildId: 5ab0ee73819682a89fed68fe86e9030a46e89243)
    #6 disjoint_pool_aligned_malloc /home/runner/work/unified-memory-framework/unified-memory-framework/src/pool/pool_disjoint.c:763:11 (test_disjoint_pool+0x16c2a7) (BuildId: 5ab0ee73819682a89fed68fe86e9030a46e89243)
    #7 umfPoolAlignedMalloc /home/runner/work/unified-memory-framework/unified-memory-framework/src/memory_pool.c:195:12 (test_disjoint_pool+0x15f0a1) (BuildId: 5ab0ee73819682a89fed68fe86e9030a46e89243)
    #8 pow2AlignedAllocHelper(umf_memory_pool_t*) /home/runner/work/unified-memory-framework/unified-memory-framework/test/poolFixtures.hpp:216:25 (test_disjoint_pool+0xe6f40) (BuildId: 5ab0ee73819682a89fed68fe86e9030a46e89243)
    #9 umfPoolTest_multiThreadedpow2AlignedAlloc_Test::TestBody()::$_1::operator()(umf_memory_pool_t*) const /home/runner/work/unified-memory-framework/unified-memory-framework/test/poolFixtures.hpp:299:9 (test_disjoint_pool+0xfe801) (BuildId: 5ab0ee73819682a89fed68fe86e9030a46e89243)
    #10 void std::__invoke_impl<void, umfPoolTest_multiThreadedpow2AlignedAlloc_Test::TestBody()::$_1, umf_memory_pool_t*>(std::__invoke_other, umfPoolTest_multiThreadedpow2AlignedAlloc_Test::TestBody()::$_1&&, umf_memory_pool_t*&&) /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/invoke.h:61:14 (test_disjoint_pool+0xfe79a) (BuildId: 5ab0ee73819682a89fed68fe86e9030a46e89243)
    #11 std::__invoke_result<umfPoolTest_multiThreadedpow2AlignedAlloc_Test::TestBody()::$_1, umf_memory_pool_t*>::type std::__invoke<umfPoolTest_multiThreadedpow2AlignedAlloc_Test::TestBody()::$_1, umf_memory_pool_t*>(umfPoolTest_multiThreadedpow2AlignedAlloc_Test::TestBody()::$_1&&, umf_memory_pool_t*&&) /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/invoke.h:96:14 (test_disjoint_pool+0xfe66a) (BuildId: 5ab0ee73819682a89fed68fe86e9030a46e89243)
    #12 void std::thread::_Invoker<std::tuple<umfPoolTest_multiThreadedpow2AlignedAlloc_Test::TestBody()::$_1, umf_memory_pool_t*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/std_thread.h:279:13 (test_disjoint_pool+0xfe5f2) (BuildId: 5ab0ee73819682a89fed68fe86e9030a46e89243)
    #13 std::thread::_Invoker<std::tuple<umfPoolTest_multiThreadedpow2AlignedAlloc_Test::TestBody()::$_1, umf_memory_pool_t*> >::operator()() /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/std_thread.h:286:11 (test_disjoint_pool+0xfe575) (BuildId: 5ab0ee73819682a89fed68fe86e9030a46e89243)
    #14 std::thread::_State_impl<std::thread::_Invoker<std::tuple<umfPoolTest_multiThreadedpow2AlignedAlloc_Test::TestBody()::$_1, umf_memory_pool_t*> > >::_M_run() /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/std_thread.h:231:13 (test_disjoint_pool+0xfe3c9) (BuildId: 5ab0ee73819682a89fed68fe86e9030a46e89243)
    #15 <null> <null> (libstdc++.so.6+0xdc252) (BuildId: e37fe1a879783838de78cbc8c80621fa685d58a2)

  Previous read of size 8 at 0x7f1a7201b268 by thread T7:
    #0 slab_get_end /home/runner/work/unified-memory-framework/unified-memory-framework/src/pool/pool_disjoint.c:179:38 (test_disjoint_pool+0x16cf99) (BuildId: 5ab0ee73819682a89fed68fe86e9030a46e89243)
    #1 disjoint_pool_free /home/runner/work/unified-memory-framework/unified-memory-framework/src/pool/pool_disjoint.c:843:32 (test_disjoint_pool+0x16d1c9) (BuildId: 5ab0ee73819682a89fed68fe86e9030a46e89243)
    #2 umfPoolFree /home/runner/work/unified-memory-framework/unified-memory-framework/src/memory_pool.c:215:12 (test_disjoint_pool+0x15eaa3) (BuildId: 5ab0ee73819682a89fed68fe86e9030a46e89243)
    #3 pow2AlignedAllocHelper(umf_memory_pool_t*) /home/runner/work/unified-memory-framework/unified-memory-framework/test/poolFixtures.hpp:224:39 (test_disjoint_pool+0xe73e2) (BuildId: 5ab0ee73819682a89fed68fe86e9030a46e89243)
    #4 umfPoolTest_multiThreadedpow2AlignedAlloc_Test::TestBody()::$_1::operator()(umf_memory_pool_t*) const /home/runner/work/unified-memory-framework/unified-memory-framework/test/poolFixtures.hpp:299:9 (test_disjoint_pool+0xfe801) (BuildId: 5ab0ee73819682a89fed68fe86e9030a46e89243)
    #5 void std::__invoke_impl<void, umfPoolTest_multiThreadedpow2AlignedAlloc_Test::TestBody()::$_1, umf_memory_pool_t*>(std::__invoke_other, umfPoolTest_multiThreadedpow2AlignedAlloc_Test::TestBody()::$_1&&, umf_memory_pool_t*&&) /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/invoke.h:61:14 (test_disjoint_pool+0xfe79a) (BuildId: 5ab0ee73819682a89fed68fe86e9030a46e89243)
    #6 std::__invoke_result<umfPoolTest_multiThreadedpow2AlignedAlloc_Test::TestBody()::$_1, umf_memory_pool_t*>::type std::__invoke<umfPoolTest_multiThreadedpow2AlignedAlloc_Test::TestBody()::$_1, umf_memory_pool_t*>(umfPoolTest_multiThreadedpow2AlignedAlloc_Test::TestBody()::$_1&&, umf_memory_pool_t*&&) /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/invoke.h:96:14 (test_disjoint_pool+0xfe66a) (BuildId: 5ab0ee73819682a89fed68fe86e9030a46e89243)
    #7 void std::thread::_Invoker<std::tuple<umfPoolTest_multiThreadedpow2AlignedAlloc_Test::TestBody()::$_1, umf_memory_pool_t*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/std_thread.h:279:13 (test_disjoint_pool+0xfe5f2) (BuildId: 5ab0ee73819682a89fed68fe86e9030a46e89243)
    #8 std::thread::_Invoker<std::tuple<umfPoolTest_multiThreadedpow2AlignedAlloc_Test::TestBody()::$_1, umf_memory_pool_t*> >::operator()() /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/std_thread.h:286:11 (test_disjoint_pool+0xfe575) (BuildId: 5ab0ee73819682a89fed68fe86e9030a46e89243)
    #9 std::thread::_State_impl<std::thread::_Invoker<std::tuple<umfPoolTest_multiThreadedpow2AlignedAlloc_Test::TestBody()::$_1, umf_memory_pool_t*> > >::_M_run() /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/std_thread.h:231:13 (test_disjoint_pool+0xfe3c9) (BuildId: 5ab0ee73819682a89fed68fe86e9030a46e89243)
    #10 <null> <null> (libstdc++.so.6+0xdc252) (BuildId: e37fe1a879783838de78cbc8c80621fa685d58a2)

Checklist

  • Code compiles without errors locally
  • All tests pass locally
  • CI workflows execute properly

@ldorau ldorau requested a review from a team as a code owner April 14, 2025 11:13
@ldorau ldorau requested a review from bratpiorka April 14, 2025 11:18
utils_annotate_memory_inaccessible(slab->mem_ptr, slab->slab_size);
utils_annotate_memory_inaccessible(slab_mem_ptr, slab->slab_size);

utils_atomic_store_release_ptr(&slab->mem_ptr, slab_mem_ptr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this fix. Should slab->mem_ptr always be accessed atomically? Why? This address isn't supposed to change during the lifetime of a slab.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the description of this PR - you can find the full trace there:
It is supposed to be a fix for the following data race:
https://github.com/ldorau/unified-memory-framework/actions/runs/14440708601/job/40490018121

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how this could fix the problem. In the logs, I can see that thread T7 is reading a value from a slab that is not fully created yet. This is strange as the slab is registered later in pool_register_slab() AFTER the create_slab() finishes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I will investigate it deeper

@ldorau ldorau marked this pull request as draft April 14, 2025 13:53
@ldorau ldorau closed this Apr 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants