-
Notifications
You must be signed in to change notification settings - Fork 8k
Bluetooth: Host: Move tx_processor to bt_taskq #97288
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Draft
alwa-nordic
wants to merge
6
commits into
zephyrproject-rtos:main
Choose a base branch
from
alwa-nordic:bt-taskq
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Draft
+171
−10
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
8a8f17c
to
ade7122
Compare
There are other places in the Host that would make sense to run on bt_workq. This change exposes `bt_workq_chosen` in `hci_core.h` for use in other parts of the Host. `bt_workq_chosen` is set according to the `BT_RECV_CONTEXT` choice. Signed-off-by: Aleksander Wasaznik <[email protected]>
ATT is invoking user callbacks in its net_buf destroy function. It is common practice that these callbacks can block on bt_hci_cmd_alloc(), so we must consider these callbacks as blocking. This is a deadlock when the net_buf_unref() happens inside the HCI driver, driver invoked from tx_processor. Blocking callbacks like this appear in our own samples. See further down about how this problem was detected. Currently, tx_processor not protect against blocking callbacks so it is de-facto forbidden. The Host should not equip net_bufs with dangerous destroy callbacks. This commit makes ATT defer its net_buf destruction and user callback invocation to the system workqueue, so that net_buf_unref is safe to call from non-blocking threads. Unsafe code is banished to the system workqueue wild west. Future improvement may be to allow the user to provide their own workqueue for ATT callbacks. This deadlock was detected because the following test was failing while tx_processor to the bt_taskq: tests/bsim/bluetooth/ll/throughput/tests_scripts/gatt_write.sh The above test has an ATT callback `write_cmd_cb` invokes `bt_conn_le_param_update` can block waiting for `tx_processor`. Signed-off-by: Aleksander Wasaznik <[email protected]>
Reduce BT_MAX_CONN from 62 to 61 to make it build on integration platform qemu_cortex_m3/ti_lm3s6965 when we add bt_taskq in subsequent commit. Signed-off-by: Aleksander Wasaznik <[email protected]>
a9fc732
to
03c365b
Compare
please set a proper title |
Add a new workqueue bt_taskq specifically designed for quick non-blocking work items in the Bluetooth subsystem. This workqueue is always available and does not depend on any Kconfig option. Signed-off-by: Aleksander Wasaznik <[email protected]>
It's not safe for the tx_processor to share the system workqueue with work items that block the thread until tx_processor runs. This is a deadlock. The Bluetooth Host itself performs these operations, usually involving bt_hci_cmd_alloc(), on the system workqueue. This change effectively gives tx_processor its own thread, like the BT TX thread that used to exist. But, this time the thread is intended to be shared with any other non-blocking Bluetooth Host tasks. The bt_taskq rules tx_processor is supposed to be non-blocking and only have code under our control on the thread stack. Unfortunately, this is not entirely true currently. But we consider it close enough for now and will ensure it starts adhering to the rules in the future. Examples of problems: - The tx_processor invokes bt_hci_send(), driver code which has no rules limiting what it can do on our thread. - The tx_processor invokes net_buf_unref() on stack-external net_buf which executes user code on our thread. Signed-off-by: Aleksander Wasaznik <[email protected]>
This commit disables the deadlock workaround in bt_cmd_send_sync when it's not needed, when tx_processor runs on bt_taskq and not on system workqueue. Signed-off-by: Aleksander Wasaznik <[email protected]>
|
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR moves
tx_processor
off the system workqueue to a dedicated workqueue to prevent deadlocks. See commits for details.Core changes
Fallout fixes
tx_processor
torx_queue
to fix deadlockCleanups
bt_cmd_send_sync
workaround disabled when tx_processor uses dedicated thread