cvinayak
Contributor

nRF Connect SDK v2.6.4 NCSIDB-1718 cherry-picks.

cvinayak and others added 11 commits September 30, 2025 13:54
nRF Connect SDK v2.6.4 NCSIDB-1718 cherry-picks.

Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
nRF Connect SDK v2.6.4 NCSIDB-1718 cherry-pick fixes.

Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
Fix L2CAP error handling, which generally did not properly dispose of TX
buffers for enhanced channels: any callbacks have to be called and the
l2cap_tx_meta_data has to be freed.

Signed-off-by: Troels Nilsson <[email protected]>
(cherry picked from commit f0032a3)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
It seems like a nice idea at first, but leads to hard-to-debug
situations for the application.

The previous behavior can be implemented by the app by defining
`alloc_seg` and allocating from the same pool as `buf`.

Signed-off-by: Jonathan Rico <[email protected]>
(cherry picked from commit 75c2aeb)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
We could start executing the work item after the channel has been
disconnected or destroyed, due to a race condition.

Double-check we are connected before attempting to send data.
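
The guard described above can be sketched as follows. This is a minimal standalone illustration, not the actual L2CAP code: `struct chan`, `enum chan_state`, and `tx_work_handler` are hypothetical names, and the real work handler operates on the stack's own channel structures.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical channel state, standing in for the stack's own type. */
enum chan_state { CHAN_CONNECTED, CHAN_DISCONNECTED };

struct chan {
	enum chan_state state;
	int pdus_sent;
};

/* Work handler sketch: the work item may run after the channel was
 * disconnected, so re-check the state before touching the TX path.
 */
static int tx_work_handler(struct chan *ch)
{
	if (ch == NULL || ch->state != CHAN_CONNECTED) {
		return -ENOTCONN; /* lost the race: do not send */
	}

	ch->pdus_sent++; /* stands in for the real send */
	return 0;
}
```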

Signed-off-by: Jonathan Rico <[email protected]>
(cherry picked from commit e436441)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
net_buf_alloc(K_FOREVER) can now fail (if run from the syswq). Propagate to
the caller instead of asserting.

Signed-off-by: Jonathan Rico <[email protected]>
(cherry picked from commit aa30d07)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
If bt_conn_set_state(conn, BT_CONN_DISCONNECTED) is called
while the connection is already disconnected, this triggers
a warning. This is likely to happen when bt_conn_cleanup_all
is called as part of bt_disable.

Added the state check to avoid unnecessary warnings in the log.

Signed-off-by: Emil Gydesen <[email protected]>
(cherry picked from commit 4a97746)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
The functionality is moved in preparation of the next commit which will
re-use this function from somewhere else.

Also add (default-on) asserts that we are able to allocate and send the
command. If that is not the case, we will leak buffers from the PoV of
the controller, leading to a stall in data transfer.

Depending on the error, we could probably recover using a disconnection.
For now, do the safe thing and stop the whole stack.

Signed-off-by: Jonathan Rico <[email protected]>
(cherry picked from commit 32212bf)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
The Softdevice Controller now sends the disconnect event only after
receiving all Host Num Completes for the packets it sent to the host.
This is done for security reasons.

In our current reassembly logic, it does not really matter when we
withhold the num complete.

Before this patch, it's the first fragment that is withheld, and after
the patch it will be the last fragment that is withheld until the host
is done processing.

The flow control properties are maintained, just in a different way.

Co-authored-by: Aleksander Wasaznik <[email protected]>
Signed-off-by: Jonathan Rico <[email protected]>
(cherry picked from commit 147ee3d)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
This function call frees the buffer kept by the host for reassembling L2CAP
PDUs into.

Without this call, the current buffer will eventually be
leaked, leading to a non-functional host due to lack of RX buffers.

The effect is worse when host flow control is not enabled, as the RX
buffer pool is shared with events, which means communication with the
controller is essentially dead.

Signed-off-by: Jonathan Rico <[email protected]>
(cherry picked from commit 8a2fe27)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
When disconnected only the first empty slot in the disconnected_handles
array should be updated.
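
A minimal sketch of the intended behavior, assuming a hypothetical `HANDLE_EMPTY` marker for unused slots and a hypothetical helper name (the real array layout and marker value in the host may differ):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CONN 4          /* hypothetical array size */
#define HANDLE_EMPTY 0xFFFF /* hypothetical "slot unused" marker */

/* Store a disconnected handle in the first empty slot only.
 * Returns the slot index used, or -1 if no slot is free.
 */
static int record_disconnected(uint16_t slots[], size_t len, uint16_t handle)
{
	assert(handle != HANDLE_EMPTY);

	for (size_t i = 0; i < len; i++) {
		if (slots[i] == HANDLE_EMPTY) {
			slots[i] = handle;
			/* Stop here; continuing would fill every empty slot. */
			return (int)i;
		}
	}

	return -1;
}
```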

Signed-off-by: Jens Rehhoff Thomsen <[email protected]>
(cherry picked from commit b478ffe)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
LingaoM and others added 8 commits October 2, 2025 13:13
Use the actual user_data size instead of the default of 8.

Signed-off-by: Lingao Meng <[email protected]>
(cherry picked from commit 0ddb6aa)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
`bt_buf_get_cmd_complete` is broken due to
zephyrproject-rtos/zephyr#64158, and fixing it
would require changing its signature and put even more complexity into
the HCI drivers, as it would require the drivers to perform an even
deeper peek into the event in order to observe the opcode.

Instead of the above, this patch removes the use of
`bt_buf_get_cmd_complete` and adds logic to allow the host to accept
command complete events in normal event buffers.

The above means performing a copy into the destination buffer, which is
the original command buffer. This is a small inefficiency for now, but
we should strive to redesign the host into a streaming architecture as
much as possible and handle events immediately instead of retaining
buffers.

This fixes zephyrproject-rtos/zephyr#64158:
Like all command completed events, the completion event for
`BT_HCI_OP_HOST_NUM_COMPLETED_PACKETS` is now placed in normal event
buffers. The logic where the host discards this event is already
present. Since it's discarded, it will not interfere with the logic
around `bt_dev.cmd_send`.

Signed-off-by: Aleksander Wasaznik <[email protected]>
(cherry picked from commit 1cb83a8)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
After the previous commit, this function no longer has any users.

Signed-off-by: Aleksander Wasaznik <[email protected]>
(cherry picked from commit b6a1051)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
This is purely a syntactical refactor.

Signed-off-by: Aleksander Wasaznik <[email protected]>
(cherry picked from commit 9426309)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
Add 2 new promptless Kconfig options that are shorthand
for whether the ISO configuration can support RX and TX.

This also applies these new options as guards for existing
and missing code pieces.

Signed-off-by: Emil Gydesen <[email protected]>
(cherry picked from commit 9553347)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
Refactor only. The surrounding ifdefs are intentionally not changed in
this patch. They will be in the near future.

Rename the pool and generalize the documentation to allow using this
pool for other events that fit the same criteria. This pool can be used
for any buffer that is processed synchronously, without negatively
affecting 'num complete' messages. E.g. 'cmd complete/status' can be put
in this pool already.

We will be working towards making the host process all event buffers
synchronously. This is because events have no dedicated flow control,
and discarding events in the driver without informing the host creates
problems. Discarding should instead happen in the host higher layers
when unavoidable.

Signed-off-by: Aleksander Wasaznik <[email protected]>
(cherry picked from commit 2a7adae)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
`struct acl_data` is used even when Host flow control is not enabled.
It is written to through the `acl(buf)` accessor in `conn.c:hci_acl()`.

Hopefully no netbufs were harmed by that :/

Signed-off-by: Jonathan Rico <[email protected]>
(cherry picked from commit 792ae68)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
Why is it ok to use the sync pool?

Because command complete/status is processed in prio: that means on the
same stack as the `bt_recv()` call from the driver.

Why does it fix the issue?

Because the complete/status event goes into a pool that is guaranteed to
have one free buffer any time `bt_recv()` is not executing.

Since the driver is the one calling bt_recv(), it (hopefully) will
finish one `bt_recv()` before starting another one.

Fixes #78223

Co-authored-by: Aleksander Wasaznik <[email protected]>
Signed-off-by: Aleksander Wasaznik <[email protected]>
Signed-off-by: Jonathan Rico <[email protected]>
(cherry picked from commit 6d5cce6)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
@cvinayak cvinayak force-pushed the github_v3_5_99_ncs1_branch_ncsidb_1718 branch from 00e7362 to 2aedf96 Compare October 2, 2025 11:25
Because the number of ACL RX buffers must be at least the number of
maximum connections plus one, increasing `CONFIG_BT_MAX_CONN` could
inadvertently lead to a build failure if the number of ACL RX buffers is
not also increased. This dependency may not be obvious to users.

To address this issue, this commit deprecates the
`CONFIG_BT_BUF_RX_COUNT` Kconfig symbol and computes the value in
`buf.h` using the new `BT_BUF_RX_COUNT` define. Note that the default
value and the minimum range value have been changed to 0 to "disable"
the option.

Additionally, to allow users to increase the number of ACL RX buffers,
this commit introduces the new `CONFIG_BT_BUF_RX_COUNT_EXTRA` Kconfig
symbol. The value of this symbol will be added to the computed value of
`BT_BUF_RX_COUNT`.
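
The arithmetic can be sketched as a plain function. This is an illustration under assumptions, not the contents of `buf.h`: the function name is hypothetical, and honoring a larger non-zero deprecated value is an assumed combination rule that the real define may implement differently.

```c
#include <assert.h>

/* Sketch of the computed RX buffer count.
 * max_conn:         value of CONFIG_BT_MAX_CONN
 * extra:            value of CONFIG_BT_BUF_RX_COUNT_EXTRA
 * deprecated_count: value of the deprecated CONFIG_BT_BUF_RX_COUNT
 *                   (0 means "not set")
 */
static int rx_count_sketch(int max_conn, int extra, int deprecated_count)
{
	/* Minimum stated in the commit message: max connections plus one. */
	int computed = (max_conn + 1) + extra;

	/* Assumption: a larger legacy value still wins if the user set one. */
	return (deprecated_count > computed) ? deprecated_count : computed;
}
```

With this rule, raising `CONFIG_BT_MAX_CONN` raises the buffer count automatically, so the minimum can no longer be violated by changing only one symbol.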

The configurations of tests and samples have been updated to reflect
these changes.

Signed-off-by: Théo Battrel <[email protected]>
(cherry picked from commit 66ff97e)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
@cvinayak cvinayak force-pushed the github_v3_5_99_ncs1_branch_ncsidb_1718 branch 2 times, most recently from 073fa86 to bc7c04e Compare October 6, 2025 11:47
cvinayak and others added 3 commits October 6, 2025 16:24
Fix an HCI command buffer allocation failure that can cause
loss of the Host Number of Completed Packets command.

Fail by rejecting the HCI Host Buffer Size command if the
required number of HCI command buffers are not allocated in
the Controller implementation.

When Controller to Host data flow control is supported in
the Controller only build, ensure that BT_BUF_CMD_TX_COUNT
is greater than or equal to (BT_BUF_RX_COUNT + Ncmd),
where Ncmd is supported maximum Num_HCI_Command_Packets in
the Controller implementation.
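
The sizing rule can be written as a simple predicate. This is a sketch with a hypothetical function name; the actual Controller code and the HCI error it returns when rejecting the command are not shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true when there are enough HCI command buffers to hold one
 * Host Number of Completed Packets command per RX buffer, plus the
 * ncmd commands the Controller allows to be outstanding.
 */
static bool host_buffer_size_ok(unsigned int cmd_tx_count,
				unsigned int rx_count,
				unsigned int ncmd)
{
	return cmd_tx_count >= rx_count + ncmd;
}
```

A Controller following this rule would reject HCI Host Buffer Size when the predicate is false, instead of risking a dropped command later.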

Relates to commit 8161430 ("Bluetooth: Add workaround
for no command buffer available").

Relates to commit 297f4f4 ("Bluetooth: Split HCI
command & event buffers to two pools").

Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
(cherry picked from commit d382fca)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
The Bluetooth data buffer API currently lacks a mechanism to notify when
a buffer is freed in the RX pool. This limitation forces HCI drivers to
adopt inefficient workarounds to manage buffer allocation.

HCI drivers face two suboptimal options:

- Blocking calls: Use bt_buf_get_rx with K_FOREVER, which blocks the
  execution context until a buffer becomes available.
- Polling: Repeatedly call bt_buf_get_rx with K_NO_WAIT, which increases
  CPU load and reduces efficiency.

This commit introduces a callback mechanism that is triggered each time
a buffer is freed in the RX pool. With this feature, HCI drivers can:

- Call bt_buf_get_rx with K_NO_WAIT.
- Wait for the callback notification if a NULL buffer is returned,
  avoiding unnecessary polling.

The new callback improves efficiency by enabling event-driven behavior
for buffer management, reducing CPU overhead while maintaining
responsiveness.
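
The event-driven pattern can be illustrated with a small standalone simulation. Everything below is hypothetical: the pool, `rx_freed_cb_set`, `rx_buf_get`, and the driver functions are stand-ins written for this sketch and are not the Zephyr API. `rx_buf_get` plays the role of `bt_buf_get_rx` with K_NO_WAIT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 2 /* hypothetical RX pool size */

typedef void (*rx_freed_cb_t)(void);

static bool in_use[POOL_SIZE];
static rx_freed_cb_t freed_cb;
static int pending_rx; /* packets the driver could not buffer yet */

static void rx_freed_cb_set(rx_freed_cb_t cb)
{
	freed_cb = cb;
}

/* Non-blocking allocation: returns a buffer index, or -1 if none free. */
static int rx_buf_get(void)
{
	for (int i = 0; i < POOL_SIZE; i++) {
		if (!in_use[i]) {
			in_use[i] = true;
			return i;
		}
	}
	return -1;
}

/* Freeing a buffer notifies the registered callback. */
static void rx_buf_free(int idx)
{
	assert(idx >= 0 && idx < POOL_SIZE && in_use[idx]);
	in_use[idx] = false;
	if (freed_cb != NULL) {
		freed_cb();
	}
}

/* Driver side: try once, remember the packet if the pool is empty. */
static int driver_rx_packet(void)
{
	int idx = rx_buf_get();

	if (idx < 0) {
		pending_rx++; /* no polling; wait for the callback instead */
	}
	return idx;
}

/* Driver callback: retry a deferred packet when a buffer is freed. */
static void driver_on_buf_freed(void)
{
	if (pending_rx > 0 && rx_buf_get() >= 0) {
		pending_rx--;
	}
}
```

In a real driver the callback would more likely signal a semaphore or submit work than allocate directly, since it may run in the context that freed the buffer.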

Signed-off-by: Pavel Vasilyev <[email protected]>
(cherry picked from commit c2488fd)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
The buffer allocation in conn.c will trigger warnings if we try to use
anything other than K_NO_WAIT for the timeout when called from within the
system workqueue.

The calls in l2cap.c and att.c which may pass non-zero timeouts already
have proper handling for failed allocations, so make sure we use K_NO_WAIT
to avoid unnecessary warnings from conn.c.

Signed-off-by: Johan Hedberg <[email protected]>
(cherry picked from commit 05b16b9)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
@cvinayak cvinayak force-pushed the github_v3_5_99_ncs1_branch_ncsidb_1718 branch from bc7c04e to b13497e Compare October 6, 2025 14:25
KyraLengfeld and others added 2 commits October 7, 2025 05:39
This commit aligns the timeout value for allocating buffers within att
on the BT RX thread, making it consistent within att.c, see
bt_att_req_alloc.

We are inferring in many bt_gatt_* functions that if called from a BT RX
thread (which is inherently the case if called from a callback when
running a Bluetooth application), we don't block and instead return
-ENOMEM when the ATT request queue is full, avoiding a deadlock.
This promise is fulfilled within bt_att_req_alloc, where the timeout for
allocation of the request slab is set to K_NO_WAIT if we are on the BT
RX thread. Unfortunately, we break this promise in
bt_att_chan_create_pdu, where the timeout for allocation of the att pool
is still K_FOREVER and deadlocks can (and do) occur when too many
requests are sent yet the pool is depleted.
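
The thread-dependent timeout can be sketched as below. This is a standalone model, not the code in att.c: the enum, the `ALLOC_WOULD_BLOCK` return value, and both function names are hypothetical, and the pool is reduced to a free-buffer counter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for K_NO_WAIT and K_FOREVER. */
enum alloc_timeout { ALLOC_NO_WAIT, ALLOC_FOREVER };

/* Marks the point where a real K_FOREVER allocation would block. */
#define ALLOC_WOULD_BLOCK 1

/* Never block on the BT RX thread: that thread is what frees buffers. */
static enum alloc_timeout att_alloc_timeout(bool on_bt_rx_thread)
{
	return on_bt_rx_thread ? ALLOC_NO_WAIT : ALLOC_FOREVER;
}

/* Allocate one PDU buffer from a pool modeled as a counter. */
static int att_pdu_alloc(int *free_bufs, bool on_bt_rx_thread)
{
	if (*free_bufs > 0) {
		(*free_bufs)--;
		return 0;
	}

	if (att_alloc_timeout(on_bt_rx_thread) == ALLOC_NO_WAIT) {
		return -ENOMEM; /* caller sees the error instead of a deadlock */
	}

	return ALLOC_WOULD_BLOCK;
}
```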

Note: Both req_slab and att_pool sizes are defined by
CONFIG_BT_ATT_TX_COUNT. If applications start getting -ENOMEM with this
change, they were at risk of such a deadlock, and may increase
CONFIG_BT_ATT_TX_COUNT to allocate the att pool for their requests.

Note: This possible deadlock has been flying under the radar because
att_pool buffers are freed when the HCI driver has sent them to the
controller (rather than when the response is received, as happens with
req_slab entries). In addition, att_pool and req_slab are both sized by
CONFIG_BT_ATT_TX_COUNT, and req_slab is allocated first and already
returns -ENOMEM if there is no space, so it takes a more specific
situation to deplete att_pool but not req_slab at this point.

Note: Ideally, we don't want functions to behave differently depending
on which thread they are running, and while this commit makes it more
consistent, it should be considered a workaround solution.

Signed-off-by: Kyra Lengfeld <[email protected]>
(cherry picked from commit 6464ffa)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
This reverts commit 147ee3d.

Signed-off-by: Pavel Vasilyev <[email protected]>
(cherry picked from commit da9acbc)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
PavelVPV and others added 2 commits October 7, 2025 05:39
This reverts commit 32212bf.

Signed-off-by: Pavel Vasilyev <[email protected]>
(cherry picked from commit 971c2c9)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
nRF Connect SDK v2.6.4 NCSIDB-1718 cherry-pick revised.

Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
@cvinayak cvinayak force-pushed the github_v3_5_99_ncs1_branch_ncsidb_1718 branch from b13497e to 3abac90 Compare October 7, 2025 03:42