Description
Describe the bug
While investigating #56163 and digging through pthreads, testing showed that k_thread_create() and k_thread_join() also exhibit a race condition when the same struct k_thread objects are reused over and over again.
It's not something regularly seen in production at the moment, and was only detected by accident. Originally reported here.
On the kernel side, this mainly seems to affect SMP platforms.
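The access pattern described above (create a set of threads, join them all, then immediately reuse the same thread slots) can be sketched roughly as follows. This is a hosted POSIX analog for illustration only, not the Zephyr test from the linked PR: it uses pthread_create() and pthread_join() rather than k_thread_create() and k_thread_join(), and the names stress_create_join, worker, and MAX_THREADS are hypothetical. On a desktop libc this loop is expected to pass; it only shows the shape of the workload that triggers the failure on Zephyr.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 8 /* hypothetical upper bound for this sketch */

/* Thread handles are declared once and the same slots are reused on
 * every round, loosely mirroring the reuse of struct k_thread objects. */
static pthread_t threads[MAX_THREADS];
static int counters[MAX_THREADS];

static void *worker(void *arg)
{
    int *counter = arg;

    (*counter)++; /* each thread only touches its own counter */
    return NULL;
}

/* Runs `rounds` create/join cycles over `num_threads` reused slots.
 * Returns the number of rounds in which every create and join succeeded. */
unsigned stress_create_join(unsigned num_threads, unsigned rounds)
{
    unsigned ok_rounds = 0;

    assert(num_threads <= MAX_THREADS);

    for (unsigned r = 0; r < rounds; r++) {
        unsigned created = 0;
        unsigned joined = 0;

        for (unsigned i = 0; i < num_threads; i++) {
            if (pthread_create(&threads[i], NULL, worker, &counters[i]) != 0) {
                break; /* stop creating; still join what was started */
            }
            created++;
        }

        for (unsigned i = 0; i < created; i++) {
            if (pthread_join(threads[i], NULL) == 0) {
                joined++;
            }
        }

        if (created == num_threads && joined == num_threads) {
            ok_rounds++;
        }
    }

    return ok_rounds;
}
```

Calling stress_create_join(2, 100) corresponds loosely to the NUM_THREADS: 2 configuration in the log further down, with a fixed round count standing in for the time-based TEST_DURATION_S.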
Please also mention any information which could help others to understand
the problem you're facing:
- What target platform are you using? qemu_x86_64, qemu_cortex_a53_smp, qemu_riscv64_smp, qemu_riscv32_smp
- What have you tried to diagnose or work around this issue? Wrote a testsuite ([DNM]: tests: posix: stress test for pthread_create and pthread_join #58115). Note, pthreads are disabled in this suite currently, so all failures are k_thread at the moment. It happens on all libc configurations and most SMP platforms.
- Is this a regression? Probably not, although it's hard to say.
- ...
To Reproduce
Steps to reproduce the behavior:
- twister -i -T tests/posix/pthread_pressure
- see errors
Expected behavior
Tests passing ideally 100% of the time on all platforms.
Impact
This behavior appears to contradict what several contributors and maintainers expect, and is possibly just a corner case that has not been exercised much.
Logs and console output
E.g.
ERROR - *** Booting Zephyr OS build zephyr-v3.3.0-4124-gf67ff9c38640 ***
Running TESTSUITE pthread_pressure
===================================================================
START - test_k_thread_create_join
I: NUM_THREADS: 2
I: TEST_NUM_CPUS: 2
I: TEST_DURATION_S: 10
I: TEST_DELAY_US: 0
ASSERTION FAIL [0] @ WEST_TOPDIR/zephyr/kernel/sched.c:1785
aborted _current back from dead
E: a0: 0000000000000004 t0: 0000000000000000
E: a1: 00000000000006f9 t1: 0000000000000009
E: a2: 0000000080009d98 t2: 0000000000000000
E: a3: 0000000000000000 t3: 0000000000000001
E: a4: 0000000000000000 t4: 0000000000000023
E: a5: 0000000000000001 t5: 000000008000b2b0
E: a6: 0000000000000001 t6: 0000000080006514
E: a7: 0000000000000001
E: ra: 00000000800063c4
E: mepc: 00000000800017e0
E: mstatus: 0000000a00021880
E:
E: >>> ZEPHYR FATAL ERROR 4: Kernel panic on CPU 0
E: Current thread: (nil) (unknown)
E: Halting system
Environment (please complete the following information):
- OS: Linux
- Toolchain: Zephyr SDK v0.16.1
- Commit SHA or Version used: d01780f (main), v2.7.4