@cfriedt cfriedt commented Apr 29, 2023

The test_pthread_descriptor_leak test was causing a kernel panic on some platforms. Initially, it was not clear why.

The usual cases were examined - race conditions, stack sizes, etc. Still no luck.

As it turns out, recycling a thread stack (or at least the pthread_attr_t) in-place does not work on some platforms, so we need to reinitialize the pthread_attr_t and set the stack property again prior to calling pthread_create().

Fixes #56163

Tested with:

west build -p always -b qemu_cortex_r5 -t run tests/posix/common \
    --   -DCONFIG_TICKLESS_KERNEL=n -DCONFIG_NEWLIB_LIBC=y

The `test_pthread_descriptor_leak` test was causing a kernel
panic on some platforms. Initially, it was not clear why.

The usual cases were examined - race conditions, stack sizes,
etc. Still no luck.

As it turns out, recycling a thread stack (or at least the
`pthread_attr_t`) in-place does not work on some platforms,
and we need to reinitialize the `pthread_attr_t` and set
the stack property again prior to calling
`pthread_create()`.

Signed-off-by: Christopher Friedt <[email protected]>
Rather than pass a variable address to a `void *` in
`pthread_join()` and do nothing with it, just pass `NULL`.

Signed-off-by: Christopher Friedt <[email protected]>
@keith-packard keith-packard left a comment

LGTM

@jgl-meta jgl-meta merged commit 12ed08a into zephyrproject-rtos:main Apr 29, 2023
@cfriedt cfriedt deleted the issues/56163/posix-common-derived-test-fails-on-some-qemu-platforms branch April 29, 2023 20:59
@cfriedt cfriedt added the "area: Tests" and "backport v2.7-branch" labels Apr 29, 2023
cfriedt commented May 7, 2023

I just thought I should update this PR and say that the change that went in did not actually fix the problem.

There is a race condition between pthread_create() and pthread_join() that seems to have been present for quite some time. It was likely never hit because we did not have a test that stressed those two being called in quick succession.

PR #57637 has been drafted with a new testsuite that will allow us to verify the fix with more certainty.

Labels

area: POSIX (POSIX API Library)
area: Tests (Issues related to a particular existing or missing test)

Development

Successfully merging this pull request may close these issues.

posix: pthread: race condition between pthread_create() and pthread_join()

4 participants