-
Hi @LingaoM, calling blocking APIs from interrupt handlers or kernel services (including syswork) is prohibited. A blocking API suspends the calling thread until the awaited event arrives or the timeout expires. Given that blocking APIs must not be called from kernel services, the BT RX thread cannot preempt the system work handler in the middle of execution, since mesh runs under cooperative scheduling. I do not see the problem here, but I may not understand the issue to its full extent. Could you provide a more detailed explanation if you still think this is an issue?
-
FYI @PavelVPV
-
I didn't spend much time on this, but it seems to me you are right @LingaoM. I can't recall any statement about using blocking APIs in the mesh callbacks. I don't see that using blocking APIs in the system workqueue is prohibited: https://docs.zephyrproject.org/latest/kernel/services/threads/workqueue.html#system-workqueue. And I can't find any such statement regarding the BT RX thread. The flash operations caused by the mesh stack are executed on the system workqueue (https://docs.zephyrproject.org/latest/connectivity/bluetooth/api/mesh/core.html#work-item-execution-context) when a separate thread is not used, and they are executed from a separate work item, not from a message handler. So probably the only case where this issue can happen is when a user blocks in a model handler, unless we yield somewhere in the mesh code (but again, I didn't look much at the code). I need to think more about this, but allowing the stack to be entered twice from different threads does not seem good, and perhaps the solution should be somewhere in this area rather than doing something with
-
https://github.com/zephyrproject-rtos/zephyr/blob/main/subsys/bluetooth/mesh/transport.c#L1585
https://github.com/zephyrproject-rtos/zephyr/blob/main/subsys/bluetooth/mesh/transport.c#L1027
The Mesh protocol stack uses these static variables to cache messages, which are then processed by the application layer. At first glance this is not a problem: the messages appear to be processed in BT RX, and Zephyr's cooperative threads should prevent any contention.
But we ignored two points:
1. Mesh loopback messages are executed in the syswork context.
2. While a message is being processed in model->recv, there is no guarantee that the current task stays in the running state.

Consider the following situation:
A message coming from BT RX is being processed at the application layer, but the handler calls a blocking API, perhaps a semaphore take, k_sleep, or a flash operation (https://github.com/zephyrproject-rtos/zephyr/blob/main/subsys/bluetooth/controller/flash/soc_flash_nrf_ticker.c#L225), which causes BT RX to temporarily lose the opportunity to run.
At this time, a loopback message is processed in syswork, so the static buf is accessed by two different tasks at the same time.
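
To make the interleaving concrete, here is a minimal sketch in plain C that simulates it without Zephyr or real threads. Everything in it is an assumption for illustration: `shared_buf` stands in for the static buffer in transport.c, `model_recv` and `transport_recv` are simplified stand-ins rather than the real mesh functions, and the `block` hook marks the point where a real handler would call a blocking API and let syswork run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the static message buffer in transport.c. */
static char shared_buf[16];

/* Hook called where a real handler would block (k_sleep, semaphore, flash). */
typedef void (*block_hook_t)(void);

/* What a handler saw in the buffer before and after its blocking point. */
struct observed {
    char before[16];
    char after[16];
};

/* Simulated model->recv: reads the buffer, "blocks", then reads it again. */
static void model_recv(const char *buf, block_hook_t block, struct observed *out)
{
    strcpy(out->before, buf);
    if (block != NULL) {
        block(); /* another context may run here */
    }
    strcpy(out->after, buf);
}

/* Simulated transport receive: stage the payload in the static buffer,
 * then pass a pointer to that buffer up to the model handler. */
static void transport_recv(const char *payload, block_hook_t block,
                           struct observed *out)
{
    assert(strlen(payload) < sizeof(shared_buf));
    strcpy(shared_buf, payload);
    model_recv(shared_buf, block, out);
}

/* Simulated syswork item delivering a loopback message while BT RX is blocked. */
static void loopback_work(void)
{
    struct observed ignored;

    transport_recv("loopback", NULL, &ignored);
}

/* Runs the scenario; returns 1 if the BT RX handler's data changed under it. */
int bt_rx_message_corrupted(void)
{
    struct observed rx;

    transport_recv("from-bt-rx", loopback_work, &rx);
    printf("before block: %s, after block: %s\n", rx.before, rx.after);
    return strcmp(rx.before, rx.after) != 0;
}
```

Calling `bt_rx_message_corrupted()` returns 1 in this simulation: the BT RX handler reads "from-bt-rx" before its blocking point and "loopback" after it, because the loopback path reused the same static buffer in between.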