Skip to content

Conversation

@cfriedt
Copy link
Member

@cfriedt cfriedt commented May 7, 2023

Recently, a race condition was discovered in pthread_create() and pthread_join() that would trigger an assertion in kernel/sched.c of the form below.

aborted _current back from dead

This required a few changes to address thoroughly, but all indications point to a significant improvement in determinism and stability.

  • revert 77e3e8961d1162a839665aaa244c5c47189b896d
  • ensure final pthread state is PTHREAD_TERMINATED
  • add z_thread_wake_joiners() (c/o @andyross)
  • fix create/join race in thread lifecycle (c/o @andyross)
  • add pthread create / join stress test

Fixes #56163

Andy's contributions brought in from #57721 (I hope that's ok). Thanks boss 👍

Compliance issue is a false positive

@cfriedt cfriedt force-pushed the issues/56163/posix-common-derived-test-fails-on-some-qemu-platforms-when-the-CONFIG-TICKLESS-KERNEL-n-is-added branch 2 times, most recently from b196df9 to fdfcc7b Compare May 7, 2023 20:21
@cfriedt cfriedt force-pushed the issues/56163/posix-common-derived-test-fails-on-some-qemu-platforms-when-the-CONFIG-TICKLESS-KERNEL-n-is-added branch 2 times, most recently from 8cd587d to 219c4a3 Compare May 8, 2023 00:48
@cfriedt cfriedt force-pushed the issues/56163/posix-common-derived-test-fails-on-some-qemu-platforms-when-the-CONFIG-TICKLESS-KERNEL-n-is-added branch 4 times, most recently from 41a88c3 to 23aab13 Compare May 9, 2023 21:36
@cfriedt cfriedt force-pushed the issues/56163/posix-common-derived-test-fails-on-some-qemu-platforms-when-the-CONFIG-TICKLESS-KERNEL-n-is-added branch 2 times, most recently from 1abfd8f to b6e6752 Compare May 11, 2023 03:20
@cfriedt cfriedt changed the title tests: posix: stress test for pthread_create and pthread_join posix: pthread: correct race condition between pthread_create and pthread_join May 11, 2023
jgl-meta
jgl-meta previously approved these changes May 11, 2023
Copy link
Contributor

@jgl-meta jgl-meta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see anything obviously incorrect.

@cfriedt cfriedt force-pushed the issues/56163/posix-common-derived-test-fails-on-some-qemu-platforms-when-the-CONFIG-TICKLESS-KERNEL-n-is-added branch from b6e6752 to 5d61510 Compare May 12, 2023 02:06
@cfriedt cfriedt marked this pull request as ready for review May 12, 2023 02:12
@cfriedt cfriedt requested review from dcpleung and nashif as code owners May 12, 2023 02:13
@keith-packard
Copy link
Contributor

keith-packard commented May 19, 2023

The state transition diagram is super helpful, thanks for figuring out how to get that into this thread (and, we hope, into the Zephyr docs too!).

However, when looking at it, I'm confused as to why you think you need a new state here -- work() takes ABORTING to either EXITED or TERMINATED, and EXITED only goes to TERMINATED, so it seems like you could merge ABORTING and TERMINATED.

Maybe the actions that cause the state transitions need some additional annotation so we can understand the difference between the arcs with the same operation coming from the same node? For instance, TERMINATED to DETACHED/JOINABLE might be labeled as create(DETACHED)/create(JOINABLE)? I'd like to know how the two transitions from ABORTING are different.

I'm hesitant to encourage a move from a spinlock/mutex to a semaphore as semaphores are harder to reason about.

In order to fully eliminate races, we need to add another state (ABORTING below) that actually does the k_thread_abort() with the lock held in a separate work queue thread since letting go of the lock would necessarily introduce races. Lastly, I think there only needs to be a single lock in struct posix_thread that synchronizes both state and cancel state. More than one lock seems like overkill.

Holding the lock across thread transitions is kinda icky. The "usual" way to manage things like this is to simply re-validate the state in the second thread and work from there, potentially re-trying the operation, potentially performing some other operation.

@cfriedt
Copy link
Member Author

cfriedt commented May 19, 2023

However, when looking at it, I'm confused as to why you think you need a new state here -- work() takes ABORTING to either EXITED or TERMINATED, and EXITED only goes to TERMINATED, so it seems like you could merge ABORTING and TERMINATED.

Yeah, I think a flag would be appropriate (e.g. abort_pending), because otherwise we would need to remember the previous state, which is not as fun. We can represent a separate "state" when that flag is set.

Maybe the actions that cause the state transitions need some additional annotation so we can understand the difference between the arcs with the same operation coming from the same node? For instance, TERMINATED to DETACHED/JOINABLE might be labeled as create(DETACHED)/create(JOINABLE)? I'd like to know how the two transitions from ABORTING are different.

The create parameter was there originally, but I just thought it was obvious. Will update. Same with transitions from ABORTING - a thread only goes to EXITED if the previous state was JOINABLE.

I'm hesitant to encourage a move from a spinlock/mutex to a semaphore as semaphores are harder to reason about.

Hopefully I can keep it simple.

Holding the lock across thread transitions is kinda icky. The "usual" way to manage things like this is to simply re-validate the state in the second thread and work from there, potentially re-trying the operation, potentially performing some other operation

I'm not convinced that is sufficient to avoid race conditions and so far, my experiments show that hypothesis to be true.

@keith-packard
Copy link
Contributor

I'm not convinced that is sufficient to avoid race conditions and so far, my experiments show that hypothesis to be true.

Yeah, I suspect we'll have to see the implementation to get to ground truth here. Diagrams are all nice and everything, but they don't actually run on computers...

Christopher Friedt and others added 4 commits May 20, 2023 20:03
Previously, we considered `PTHREAD_EXITED` as a valid state
for reclaiming a thread. That was not the case.
`PTHREAD_EXITED` requires that `pthread_join()` is called while
the thread is in `PTHREAD_EXITED` state.

With that the calling thread may retrieve the return value of
the joining thread.

Additionally, correct incorrectly aliased pointer argument to
`k_thread_abort()`.

Signed-off-by: Christopher Friedt <[email protected]>
POSIX has a somewhat odd behavior that when you call
pthread_detach()[1], it is supposed to synchronously wake up anyone
currently joining the thread.  That isn't a Zephyr kernel behavior[2],
so it needs to be exposed.

The implementation is as simple as it sounds.  This is minimal: we
don't try to keep any state around to inform the k_thread_join() call
whether the thread actually exited or if this new function was used.
That has to be provided by the outer layers (pthreads already do
this).

[1] Which tells the OS/platform that it can clean the thread up as
soon as it exits and doesn't have to keep it around as a zombie until
someone calls pthread_join() on it.

[2] Because... why!?  Not needing to keep a thread around for joiners
isn't the same thing as disallowing joins!

Signed-off-by: Andy Ross <[email protected]>
The pthread_join() implementation was done using a condition variable
and not k_thread_join(), and that's actually fatal.  It's not possible
in Zephyr to guarantee a thread has exited in a race-free way without
calling either k_thread_abort() or k_thread_join().

This opened a hole where an app could call pthread_join(), which would
go to sleep waiting on the condition variable, which would be signaled
from the exit path of the thread, and then return to call
pthread_create() (and thus k_thread_create()) on the same pthread_t
object (and thus the same struct k_thread) BEFORE THE EARLIER THREAD
HAD ACTUALLY EXITED.

This makes the scheduler blow up, at least on SMP (I wasn't personally
able to make it happen on uniprocessor configs, though the race is
present always).

Rework to call k_thread_join().  There's no other correct way to do
this.

Fixes zephyrproject-rtos#56163

Signed-off-by: Andy Ross <[email protected]>
@cfriedt cfriedt force-pushed the issues/56163/posix-common-derived-test-fails-on-some-qemu-platforms-when-the-CONFIG-TICKLESS-KERNEL-n-is-added branch 2 times, most recently from 00d9170 to 1767a08 Compare May 21, 2023 12:18
@cfriedt
Copy link
Member Author

cfriedt commented May 21, 2023

@keith-packard , @andyross - some good / bad news. I improved my testsuite to also test raciness between k_thread_create() and k_thread_join().

The good news, is that pthreads are not alone in that they are racey 😀

The bad news, is that the same race exists today with k_threads 😞

Currently, testing pthreads is disabled by default. I also reduced the test duration from 60s to 10s (I think we actually have a bug in twister in terms of the timeout).

Any failures below in the pthread_pressure suite are exclusively with k_threads.

I was really hoping to get this fixed for v3.4.0 (and also v2.7.5) - but I'm a bit exhausted at the moment.

Sadly, the "aborted _current back from dead" issue exists with all libcs in various configurations and on multiple architectures.

twister -T tests/posix/pthread_pressure/
Renaming output directory to /home/cfriedt/zephyrproject/zephyr/twister-out.3
INFO    - Using Ninja..
INFO    - Zephyr version: zephyr-v3.3.0-4131-g1767a0844e0a
INFO    - Using 'zephyr' toolchain.
INFO    - Selecting default platforms per test case
INFO    - Building initial testsuite list...
INFO    - Writing JSON report /home/cfriedt/zephyrproject/zephyr/twister-out/testplan.json
INFO    - JOBS: 32
INFO    - Adding tasks to the queue...
INFO    - Added initial list of jobs to queue
ERROR   - qemu_x86_64               tests/posix/pthread_pressure/portability.posix.pthread_pressure.tls.newlib  FAILED : unexpected eof
ERROR   - see: /home/cfriedt/zephyrproject/zephyr/twister-out/qemu_x86_64/tests/posix/pthread_pressure/portability.posix.pthread_pressure.tls.newlib/handler.log
ERROR   - qemu_x86_64               tests/posix/pthread_pressure/portability.posix.pthread_pressure.picolibc  FAILED : unexpected eof
ERROR   - see: /home/cfriedt/zephyrproject/zephyr/twister-out/qemu_x86_64/tests/posix/pthread_pressure/portability.posix.pthread_pressure.picolibc/handler.log
ERROR   - qemu_cortex_a53_smp       tests/posix/pthread_pressure/portability.posix.pthread_pressure.picolibc  FAILED : unexpected eof
ERROR   - see: /home/cfriedt/zephyrproject/zephyr/twister-out/qemu_cortex_a53_smp/tests/posix/pthread_pressure/portability.posix.pthread_pressure.picolibc/handler.log
ERROR   - qemu_cortex_a53_smp       tests/posix/pthread_pressure/portability.posix.pthread_pressure.tls.newlib  FAILED : unexpected eof
ERROR   - see: /home/cfriedt/zephyrproject/zephyr/twister-out/qemu_cortex_a53_smp/tests/posix/pthread_pressure/portability.posix.pthread_pressure.tls.newlib/handler.log
ERROR   - qemu_riscv64_smp          tests/posix/pthread_pressure/portability.posix.pthread_pressure.tls.newlib  FAILED : Timeout
ERROR   - see: /home/cfriedt/zephyrproject/zephyr/twister-out/qemu_riscv64_smp/tests/posix/pthread_pressure/portability.posix.pthread_pressure.tls.newlib/handler.log
ERROR   - qemu_riscv32_smp          tests/posix/pthread_pressure/portability.posix.pthread_pressure.tls.newlib  FAILED : Timeout
ERROR   - see: /home/cfriedt/zephyrproject/zephyr/twister-out/qemu_riscv32_smp/tests/posix/pthread_pressure/portability.posix.pthread_pressure.tls.newlib/handler.log
ERROR   - qemu_cortex_a53_smp       tests/posix/pthread_pressure/portability.posix.pthread_pressure.tls  FAILED : unexpected eof
ERROR   - see: /home/cfriedt/zephyrproject/zephyr/twister-out/qemu_cortex_a53_smp/tests/posix/pthread_pressure/portability.posix.pthread_pressure.tls/handler.log
ERROR   - qemu_cortex_a53_smp       tests/posix/pthread_pressure/portability.posix.pthread_pressure.newlib  FAILED : unexpected eof
ERROR   - see: /home/cfriedt/zephyrproject/zephyr/twister-out/qemu_cortex_a53_smp/tests/posix/pthread_pressure/portability.posix.pthread_pressure.newlib/handler.log
ERROR   - qemu_x86_64               tests/posix/pthread_pressure/portability.posix.pthread_pressure  FAILED : unexpected eof
ERROR   - see: /home/cfriedt/zephyrproject/zephyr/twister-out/qemu_x86_64/tests/posix/pthread_pressure/portability.posix.pthread_pressure/handler.log
ERROR   - qemu_cortex_a53_smp       tests/posix/pthread_pressure/portability.posix.pthread_pressure  FAILED : unexpected eof
ERROR   - see: /home/cfriedt/zephyrproject/zephyr/twister-out/qemu_cortex_a53_smp/tests/posix/pthread_pressure/portability.posix.pthread_pressure/handler.log
ERROR   - qemu_riscv64_smp          tests/posix/pthread_pressure/portability.posix.pthread_pressure.picolibc  FAILED : Timeout
ERROR   - see: /home/cfriedt/zephyrproject/zephyr/twister-out/qemu_riscv64_smp/tests/posix/pthread_pressure/portability.posix.pthread_pressure.picolibc/handler.log
ERROR   - qemu_riscv32_smp          tests/posix/pthread_pressure/portability.posix.pthread_pressure.tls  FAILED : Timeout
ERROR   - see: /home/cfriedt/zephyrproject/zephyr/twister-out/qemu_riscv32_smp/tests/posix/pthread_pressure/portability.posix.pthread_pressure.tls/handler.log
ERROR   - qemu_riscv64_smp          tests/posix/pthread_pressure/portability.posix.pthread_pressure.newlib  FAILED : Timeout
ERROR   - see: /home/cfriedt/zephyrproject/zephyr/twister-out/qemu_riscv64_smp/tests/posix/pthread_pressure/portability.posix.pthread_pressure.newlib/handler.log
ERROR   - qemu_riscv32_smp          tests/posix/pthread_pressure/portability.posix.pthread_pressure  FAILED : Timeout
ERROR   - see: /home/cfriedt/zephyrproject/zephyr/twister-out/qemu_riscv32_smp/tests/posix/pthread_pressure/portability.posix.pthread_pressure/handler.log
ERROR   - qemu_riscv64_smp          tests/posix/pthread_pressure/portability.posix.pthread_pressure  FAILED : Timeout
ERROR   - see: /home/cfriedt/zephyrproject/zephyr/twister-out/qemu_riscv64_smp/tests/posix/pthread_pressure/portability.posix.pthread_pressure/handler.log
INFO    - Total complete:  275/ 275  100%  skipped:  171, failed:   15, error:    0
INFO    - 9 test scenarios (275 test instances) selected, 171 configurations skipped (163 by static filter, 8 at runtime).
INFO    - 89 of 275 test configurations passed (85.58%), 15 failed, 0 errored, 171 skipped with 0 warnings in 157.44 seconds
INFO    - In total 121 test cases were executed, 429 skipped on 40 out of total 571 platforms (7.01%)
INFO    - 102 test configurations executed on platforms, 2 test configurations were only built.
INFO    - Saving reports...
INFO    - Writing JSON report /home/cfriedt/zephyrproject/zephyr/twister-out/twister.json
INFO    - Writing xunit report /home/cfriedt/zephyrproject/zephyr/twister-out/twister.xml...
INFO    - Writing xunit report /home/cfriedt/zephyrproject/zephyr/twister-out/twister_report.xml...
INFO    - -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
INFO    - The following issues were found (showing the top 10 items):
INFO    - 1) tests/posix/pthread_pressure/portability.posix.pthread_pressure on qemu_cortex_a53_smp failed (unexpected eof)
INFO    - 2) tests/posix/pthread_pressure/portability.posix.pthread_pressure on qemu_riscv64_smp failed (Timeout)
INFO    - 3) tests/posix/pthread_pressure/portability.posix.pthread_pressure on qemu_riscv32_smp failed (Timeout)
INFO    - 4) tests/posix/pthread_pressure/portability.posix.pthread_pressure on qemu_x86_64 failed (unexpected eof)
INFO    - 5) tests/posix/pthread_pressure/portability.posix.pthread_pressure.newlib on qemu_cortex_a53_smp failed (unexpected eof)
INFO    - 6) tests/posix/pthread_pressure/portability.posix.pthread_pressure.newlib on qemu_riscv64_smp failed (Timeout)
INFO    - 7) tests/posix/pthread_pressure/portability.posix.pthread_pressure.tls on qemu_cortex_a53_smp failed (unexpected eof)
INFO    - 8) tests/posix/pthread_pressure/portability.posix.pthread_pressure.tls on qemu_riscv32_smp failed (Timeout)
INFO    - 9) tests/posix/pthread_pressure/portability.posix.pthread_pressure.tls.newlib on qemu_cortex_a53_smp failed (unexpected eof)
INFO    - 10) tests/posix/pthread_pressure/portability.posix.pthread_pressure.tls.newlib on qemu_riscv64_smp failed (Timeout)
INFO    - 
INFO    - To rerun the tests, call twister using the following commandline:
INFO    - west twister -p <PLATFORM> -s <TEST ID>, for example:
INFO    - 
INFO    - west twister -p qemu_riscv64_smp -s tests/posix/pthread_pressure/portability.posix.pthread_pressure.tls.newlib
INFO    - or with west:
INFO    - west build -p -b qemu_riscv64_smp -T tests/posix/pthread_pressure/portability.posix.pthread_pressure.tls.newlib
INFO    - -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
INFO    - Run completed
$ grep "back from dead" $(find twister-out/ -name handler.log)
twister-out/qemu_riscv32_smp/tests/posix/pthread_pressure/portability.posix.pthread_pressure.tls.newlib/handler.log:    aborted _current back from dead
twister-out/qemu_riscv32_smp/tests/posix/pthread_pressure/portability.posix.pthread_pressure/handler.log:       aborted _current back from dead
twister-out/qemu_x86_64/tests/posix/pthread_pressure/portability.posix.pthread_pressure.picolibc/handler.log:   aborted _current back from dead
twister-out/qemu_cortex_a53_smp/tests/posix/pthread_pressure/portability.posix.pthread_pressure/handler.log:    aborted _current back from dead
twister-out/qemu_riscv64_smp/tests/posix/pthread_pressure/portability.posix.pthread_pressure.tls.newlib/handler.log:    aborted _current back from dead
twister-out/qemu_riscv64_smp/tests/posix/pthread_pressure/portability.posix.pthread_pressure.picolibc/handler.log:      aborted _current back from dead

@keith-packard
Copy link
Contributor

A serious bummer. Sounds like we need to address the k_thread races before you can even begin to fix the pthreads issues (if any remain).

@cfriedt cfriedt force-pushed the issues/56163/posix-common-derived-test-fails-on-some-qemu-platforms-when-the-CONFIG-TICKLESS-KERNEL-n-is-added branch 2 times, most recently from 193b919 to f124514 Compare May 22, 2023 20:02
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, there's no protection here for using a thread index which is stale; checking only for one possible stale state (TERMINATED) provides a false sense of security. Makes me wonder if there should be some kind of validating mode for the API where a generation number is encoded in the upper bits of the thread ID that are validated against a current generation number stored in the thread?

Copy link
Member Author

@cfriedt cfriedt May 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the thought and it would at least replicate some of the other validation schemes in e.g. pthread_mutex.c.

To be honest, I think I should be able to transition pthreads to use <zephyr/sys/bitarray.h> as well.

Let me give it another spin.

Also, thanks @andyross and @npitre for validating the issue on aarch64. Will likely use an arch_exclude there if I am able to.

Will push in a moment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrm. Is pthread_t at least as large as size_t and unisigned? Otherwise, I think to_posix_thread() needs some better bounds checking to avoid possible wrap issues (imagine a 32-bit pthread_t on a 64-bit size_t system, it's possible that a large address difference will wrap around back into range).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pthread_t is a uint32_t in Zephyr, so should be safe.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that makes it unsafe -- on 64 bit targets, size_t is 64 bits, so you will get aliasing when converting from size_t to pthread_t. Probably need to fix pthread_self to bounds check the computed pointer before converting to an array index with the subtraction.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True - for some reason I was thinking it was 32-bit. I'll add a bounds check in this rev

@cfriedt cfriedt force-pushed the issues/56163/posix-common-derived-test-fails-on-some-qemu-platforms-when-the-CONFIG-TICKLESS-KERNEL-n-is-added branch from f124514 to 26b9337 Compare May 22, 2023 20:44
@andyross
Copy link
Contributor

Repeating the comment from elsewhere, regarding the kernel bug tracked in #58116 : FWIW: it seems not to be a kernel race in thread lifecycle, more like a arm64-specific race in context handling (and maybe x86_64, but with 1000x lower frequency).

FWIW: I'd treat this as two separate issues and merge the pthread fixes. If you need to tune the test[1] on arm64 for CI, that seems acceptable as long as we have a bug tracking the instability.

[1] SMP tests are already retried in CI because of unavoidable timer skew problems with scheduling "CPUs" onto host threads that can be preempted for 10ms+ in busy CI runs. (Which isn't what we're seeing here -- no timer interaction in this test!). The frequency I'm seeing probably wouldn't even show up above the noise floor.

Christopher Friedt added 4 commits May 23, 2023 23:47
Previously, the pthread implementation used a `pthread_mutex_t`
to synchronize thread state. This was required because a
`pthread_cond_t` was used to notify other pthreads that were
attempting to join or exit the same pthread.

With the previous removal of the `pthread_cond_t` from
`struct posix_thread`, and the addition of the
`z_thread_wake_joiners()` utility, there is no longer a need
to utilize a `pthread_mutex_t` to synchronize state.

Instead, use a `struct k_spinlock` for synchronization and
`z_thread_wake_joiners()` for notification.

Additionally, use a `bitarray` instead of the the
`pthread_pool_lock`, which makes `pthread.c` resource
allocation more like that of the other pthread objects.

Signed-off-by: Christopher Friedt <[email protected]>
Technically speaking, there should be a synchronization to ensure
that a spawned thread has started before `pthread_join()` is
called.

However, since we're recycling an existing no-op thread, just add
a small delay.

A separate test will be used to demonstrate stressing
`pthread_create()` and `pthread_join()` which will use a
synchronization primitive.

Signed-off-by: Christopher Friedt <[email protected]>
Recently, a race condition was discovered in `pthread_create()`
and `pthread_join()` that would trigger an assertion in
`kernel/sched.c` of the form below.
```
aborted _current back from dead
```

Add a test that stresses the pthread implementation
in order to more easily reproduce the failure.

Signed-off-by: Christopher Friedt <[email protected]>
@cfriedt cfriedt force-pushed the issues/56163/posix-common-derived-test-fails-on-some-qemu-platforms-when-the-CONFIG-TICKLESS-KERNEL-n-is-added branch from 26b9337 to cd77548 Compare May 24, 2023 04:09
@cfriedt
Copy link
Member Author

cfriedt commented May 24, 2023

I'm going to close this PR, as somehow pthreads are less reliable than before in terms of passing Twister test runs, but I'll make a separate PR for the pthread_pressure test and pthread_descriptor_leak tests.

@cfriedt cfriedt closed this May 24, 2023
@cfriedt cfriedt deleted the issues/56163/posix-common-derived-test-fails-on-some-qemu-platforms-when-the-CONFIG-TICKLESS-KERNEL-n-is-added branch January 26, 2024 15:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

posix: pthread: race condition between pthread_create() and pthread_join()

8 participants