Skip to content

Conversation

MarekPieta
Copy link
Contributor

The delay ATT sent callback until data transmission is done by controller feature introduces an extra net_buf reference which is removed on sent callback. Make sure to remove the reference on BLE disconnection to free the buffer as the registered sent callback will not be called.

Jira: NCSDK-35650

Copy link
Contributor

@alwa-nordic alwa-nordic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's revert the old no-up that's broken and reapply the whole fixed feature as one commit. This will help us keep no-ups manageable and help us verify the fix.

The fix seems dubious. We need to clarify some things. Why is there a && (buf->ref == 1)? And why does the comment say "might"?

The comment should also be rephrased to more clearly state where the reference was taken, and ideally what the reference is currently "attached to" in a way that makes it easy to show that getting rid of it here is correct.

@MarekPieta MarekPieta force-pushed the fix_att_sent_cb_after_tx branch 2 times, most recently from 7bea193 to 6791da9 Compare October 3, 2025 07:28
@MarekPieta MarekPieta requested a review from alwa-nordic October 3, 2025 07:57
…p_send_pdu"

This reverts commit 510e36c.

Signed-off-by: Marek Pieta <[email protected]>
… TX is done"

This reverts commit d1d9b2d.

Signed-off-by: Marek Pieta <[email protected]>
By default, the BLE stack calls sent callback for ATT data when the data
is passed to BLE controller for transmission. Enabling this Kconfig
option delays calling the sent callback until data transmission is
finished by BLE controller (the callback is delayed until receiving the
Number of Completed Packets HCI Event).

If the ATT sent callback is delayed until data transmission is done by
BLE controller, the transmitted buffer may have an additional reference.
The reference is used to extend lifetime of the net buffer until the
data transmission is confirmed by ACK of the remote.

Jira: NCSDK-27422
Jira: NCSDK-28624
Jira: NCSDK-35650

Signed-off-by: Marek Pieta <[email protected]>
@MarekPieta MarekPieta force-pushed the fix_att_sent_cb_after_tx branch from 6791da9 to 5d62bf5 Compare October 3, 2025 12:47
@MarekPieta
Copy link
Contributor Author

Pure rebase

Comment on lines +286 to +292
if (IS_ENABLED(CONFIG_BT_ATT_SENT_CB_AFTER_TX)) {
__ASSERT_NO_MSG(buf->ref <= 2);

if (buf->ref == 2) {
net_buf_unref(buf);
}
}
Copy link
Contributor

@alwa-nordic alwa-nordic Oct 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Detecting ATT PDUs by checking buf->ref == 2 is fragile because future changes might introduce other scenarios with extra references for different reasons. I'm not even sure that it's not the case today. Instead of using the ref count as a proxy, I suggest comparing the cb:

if (IS_ENABLED(CONFIG_BT_ATT_SENT_CB_AFTER_TX)) {
    bt_conn_tx_cb_t cb = closure_cb(buf->user_data);
    
    if (cb == chan_sent_cb) {
        void *user_data = closure_data(buf->user_data);
        /* Release the extra reference attached to user_data */
        cb(chan->conn, user_data, -ESHUTDOWN);
    }
}

But I realized it's a bug that we don't invoke the callbacks here. We should be invoking all the callbacks here just like in process_unack_tx(). I made a PR upstream to fix this just now:

zephyrproject-rtos/zephyr#97056

I suggest we cherry pick that general fix instead. Can you test that it works for you when it passes the upstream CI?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bt_l2cap_send_pdu and bt_l2cap_dyn_chan_send both seem to demand that ref counter is 1 for buffers placed on the TX queue. Also it seems that the net_buf_unref here is expected to remove the last reference (please correct me if I overlooked something).

Using the closure_cb as in the code snippet above would require to pass the chan_sent_cb function (through header or using extern keyword). The function would no longer be static for the att.c file. I tried to avoid that as it might not be a clean solution.

Invoking the sent callback here might help. I will test your commit locally and let you know if it works.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bt_l2cap_send_pdu and bt_l2cap_dyn_chan_send both seem to demand that ref counter is 1 for buffers placed on the TX queue. Also it seems that the net_buf_unref here is expected to remove the last reference (please correct me if I overlooked something).

That's just to protect against user error. We did not foresee that a stack-internal user would keep an extra reference and wait for the callback. We just saw no use for the extra reference and it is indicative of a leaked reference. The stack does not rely on this fact AFAIK. We can remove all these checks. Maybe we should.

Using the closure_cb as in the code snippet above would require to pass the chan_sent_cb function (through header or using extern keyword). The function would no longer be static for the att.c file. I tried to avoid that as it might not be a clean solution.

I prefer that. We are referring to ATT anyway (to explain the code), so including the header is just being explicit about the dependency.

Invoking the sent callback here might help. I will test your commit locally and let you know if it works.

Thanks! It passed CI checks. Invoking the sent callback is a generalization of my code snippet here, so I'm sure it will work. Let's just look out for unintentional breakage elsewhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I confirmed your fix from the upstream PR. As soon as the upstream PR is merged, I will update this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants