mutex: change the libmetal nuttx mutex to recursive mutex #352
Conversation
To avoid the crash when the lock is acquired twice. One case is in the destroy process: we hold rpdev->lock to iterate rpdev->endpoints and destroy all the endpoints, but rpmsg_destroy_ept() may send the name service message and needs to acquire rpdev->lock again, which leads to a crash. Signed-off-by: Bowen Wang <wangbowen6@xiaomi.com>
Hello @CV-Bowen, the PR looks OK to me. I am just trying to determine whether the problem you are facing is related to the NuttX implementation or an issue within the OpenAMP library itself. Could you provide a few more details on the sequence that generates the issue? If I understand correctly, you call rpmsg_deinit_vdev() while holding rdev->lock, right?
@arnopo Of course, like:

```c
metal_mutex_acquire(&rdev->lock);
metal_list_for_each_safe(&rdev->endpoints, tmp, node)
  {
    ept = container_of(node, struct rpmsg_endpoint, node);
    if (ept->ns_unbind_cb)
      {
        ept->ns_unbind_cb(ept);
      }
    else
      {
        rpmsg_destroy_ept(ept);
      }
  }
metal_mutex_release(&rdev->lock);
```

As I described in the commit message, we hold rpdev->lock to iterate rpdev->endpoints and destroy all the endpoints, but rpmsg_destroy_ept() may send the name service message and needs to acquire rpdev->lock again. In NuttX, this mutex is not a recursive mutex and will assert on the second acquisition.
@arnopo This is the NuttX PR: apache/nuttx#17761, which has more detailed information.
Reusing rdev->lock in your application function does not seem like a good idea to me; is there any reason not to use your own mutex? I also wonder whether using a recursive mutex could cause side effects on the rpmsg exchange, since all mutexes are impacted.
This change has been merged into the NuttX downstream without apparent controversy. Since this is a NuttX-specific change that should not affect other implementations, I think we can probably take this. |
Even if merged, that does not guarantee maturity. I prefer to challenge this to ensure it does not mask an issue that could resurface later. In my opinion, the issue lies in using rdev->lock in the application, which is an internal OpenAMP mutex. @xiaoxiang781216: Could you please confirm before I merge that you intend to use recursive mutexes for all OpenAMP mutexes to solve your issue? |
The NuttX wrapper extends some functionality of OpenAMP, which requires holding the same lock for protection.
metal_mutex is only used inside OpenAMP, so the impact is small.
Thanks @xiaoxiang781216 for your confirmation. |
To avoid the crash when the lock is acquired twice: one case is the rpmsg destroy process.
Like rpmsg_deinit_vdev() does, we implement a function to release all the endpoints attached to a rpmsg device. We hold rpdev->lock to iterate rpdev->endpoints and destroy all the endpoints, but rpmsg_destroy_ept() may send the name service message and needs to acquire rpdev->lock again. In NuttX, the mutex is not a recursive mutex and will assert on the second acquisition. So this change makes the mutex recursive to cover this case.