Bluetooth: Host: Update connection callbacks API documentation and minor fixes #89661

dottspina · 2025-05-08T12:43:13Z

This PR is mostly about updating the API documentation for some connection callbacks (see the commits titled "Amend xxx() callback documentation"); we can squash them once reviewed.

Additionally, I came across the use of a local variable as an atomic target in bt_conn_unref(): either I'm missing something obvious or it's a typo in the source code, which may not deserve a dedicated PR or issue. Since this involves a Zephyr commit that we'll already mention, I've taken the liberty of discussing it briefly here as well.

EDIT: added a trivial commit to fix the MISRA "All if ... else if constructs shall be terminated with an else clause" (c:M23_112) issues reported by SonarQube and blocking this PR.

Update connection callbacks API documentation

The API documentation for some connection callbacks appears to not reflect some changes in the Bluetooth Host stack [1] [2] [3].

- bt_conn_cb.disconnected(): warns that the connection object is not actually reusable at this stage, but some of the advice given may be outdated (auto-resume of advertising is deprecated), and it does not mention recycled(), which IIUC is now the most reliable callback for tracking free connections and/or initiating operations that depend on them
- bt_conn_cb.recycled(): still warns users to "treat this callback as an ISR", although it now runs on the system work-queue thread, as does disconnected()
- bt_conn_cb.connected(): surprisingly, no special precautions are indicated, although it runs on the same thread that also invokes GATT callbacks such as bt_gatt_attr.read() or bt_gatt_attr.write() (usually the BT RX WQ thread)

The respective commit messages provide some further details.
My wording may not be ideal, edit requests are welcome.
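
To illustrate why recycled() is the more reliable event, here is a minimal, self-contained sketch. It is not Zephyr code: the one-slot pool, `slot_alloc()`, `on_disconnected()`, `on_recycled()` and `simulate_disconnect()` are hypothetical stand-ins, and the only assumption taken from the discussion is that the connection object is still referenced when disconnected() runs and is free once recycled() runs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-slot connection pool, standing in for the host's
 * connection pool. This is a simulation, not the Zephyr implementation.
 */
struct slot {
	int ref;
};

static struct slot pool[1];
static bool alloc_ok_in_disconnected;
static bool alloc_ok_in_recycled;

/* Returns a slot holding one reference, or NULL when none is free. */
static struct slot *slot_alloc(void)
{
	for (size_t i = 0; i < sizeof(pool) / sizeof(pool[0]); i++) {
		if (pool[i].ref == 0) {
			pool[i].ref = 1;
			return &pool[i];
		}
	}
	return NULL;
}

/* Application callback: a reference is still held, so the slot is busy. */
static void on_disconnected(void)
{
	alloc_ok_in_disconnected = (slot_alloc() != NULL);
}

/* Application callback: the last reference is gone, so the slot is free. */
static void on_recycled(void)
{
	alloc_ok_in_recycled = (slot_alloc() != NULL);
}

/* Simulated teardown: notify the disconnection first, then drop the last
 * reference and notify that a slot has been recycled.
 */
static void simulate_disconnect(struct slot *s)
{
	on_disconnected();
	s->ref--;
	if (s->ref == 0) {
		on_recycled();
	}
}
```

In this simulation, an allocation attempted from `on_disconnected()` fails because the pool is still full, while the same attempt from `on_recycled()` succeeds.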

Use of a local variable as an atomic target

While reading subsys/bluetooth/host/conn.c I came across something that I can't quite grasp, and I'm taking advantage of this PR to address it.

In bt_conn_unref(), the local, non shared, variable old is used as an atomic target:

atomic_val_t old = atomic_dec(&conn->ref);
conn = NULL;
bool deallocated = (atomic_get(&old) == 1);

I can't see what data race or stale value the above call to atomic_get() could prevent.
I considered that this might be a trick to force the compiler to generate some additional memory barrier, but I don't see the need for that either (if so, that might deserve a comment).

If pedantic, one could also note that atomic_get() expects an atomic_t* argument (called a target in Zephyr API), not an atomic_val_t* (value).
This compiles without even a warning since Zephyr Atomic defines both to be the same integer type.
The equivalent C11 code, where _Atomic(T) and T are different types, wouldn't compile.
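
For comparison, a minimal sketch of the same pattern in plain C11 (`<stdatomic.h>`), not the Zephyr Atomic API. The names `struct conn_sketch` and `conn_sketch_unref()` are hypothetical. The point is only that the value returned by the atomic decrement is a private copy, so an ordinary comparison is enough.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bt_conn: only the reference count. */
struct conn_sketch {
	atomic_int ref;
};

/* Drops one reference; returns true if it was the last one. */
static bool conn_sketch_unref(struct conn_sketch *conn)
{
	/* atomic_fetch_sub() returns the previous value as a plain int. */
	int old = atomic_fetch_sub(&conn->ref, 1);

	/* From here on, the object may already belong to someone else. */
	conn = NULL;
	(void)conn;

	/* 'old' is local and unshared: no atomic load is needed to read it. */
	return old == 1;
}
```

Here `atomic_load(&old)` would be rejected by a C11 compiler, since `old` is an `int` and not an `_Atomic` object.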

Either I'm missing something obvious, or some kind of typo was introduced with [1].
Please enlighten me ;-)

Thanks.

[1] efb5d83: Bluetooth: Host: Added Recycled evt notifying conn object is available
[2] 2ca59e7: Bluetooth: Host: Defer conn.recycled() to the syswq
[3] 8cfad44: Bluetooth: Deprecate adv auto-resume

jhedberg · 2025-05-09T07:22:36Z

include/zephyr/bluetooth/conn.h

+	 *  @note This callback will run on the same, non-preemptible, work-queue thread
+	 *        that processes incoming low priority HCI packets and invokes GATT callbacks
+	 *        such as @ref bt_gatt_attr.read and @ref bt_gatt_attr.write.
+	 *        Blocking operations are therefore discouraged.


We can add this, but it's going to be pretty short lived. I'm working on moving all host-internal processing to a single internal work queue (or rather renaming and extending the purpose of the existing rx workqueue), while at the same time deferring any application callbacks to the system workqueue. Once that change happens the above comment will not be correct anymore.

I'm working on moving all host-internal processing to a single internal work queue [...] while at the same time deferring any application callbacks to the system workqueue.

This will indeed be a welcome improvement: we'd better wait until you have made progress with this refactoring before updating the related API documentation (is there already an RFC or a draft?).

(is there already an RFC or a draft?)

Not yet. I have a bt_wq branch in my Zephyr fork, but it's still pretty messy. I'll create a draft PR once it's a bit cleaned up. Also, I don't think I can do all the wanted changes in one go. First step is to create the generic internal workqueue and move as much as possible of internal processing to it. Moving all callbacks then to the system workqueue will need to be a separate (later) step, since that will require some more intrusive changes.

sonarqubecloud · 2025-05-12T15:10:12Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

dottspina · 2025-05-12T15:19:16Z

Sorry @Thalley, our messages from Friday crossed.
I think we all agree on preventing access to a possibly recycled connection object. I just reworded the comment, which I too found misleading: either it is unnecessary, assuming that the purpose of the conn = NULL; statement is obvious, or we should be careful not to suggest that this statement is, in some opaque way, about the actual connection object.

@jhedberg, understood, take your time: we'll keep my proposals for the moment, and review the documentation of all related callbacks when you're ready.

I also added a trivial commit to fix the MISRA "All if ... else if constructs shall be terminated with an else clause" (c:M23_112) issues reported by SonarQube and blocking this PR.

Thanks.

github-actions · 2025-07-12T00:40:44Z

This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this pull request will automatically be closed in 14 days. Note, that you can always re-open a closed pull request at any time.

Thalley · 2025-07-14T07:40:08Z

include/zephyr/bluetooth/conn.h

+	 * This callback runs on the system-worqueue thread,
+	 * the usual precautions apply.


It's tricky specifying the threads that certain callbacks come on, as that means that if the BT subsystem starts using its own workqueue, then there aren't any real functional changes, but the documentation would be wrong.

Not 100% opposed to this, but not sure it should be added either

It's tricky specifying the threads that certain callbacks come on, as that means that if the BT subsystem starts using its own workqueue, then there aren't any real functional changes, but the documentation would be wrong.

Absolutely right.

However:

- today recycled() runs on the system work-queue: knowing this can help, e.g. to avoid priority inversion issues
- AFAIK @jhedberg is working on deferring all application callbacks to the system work-queue

Not 100% opposed to this, but not sure it should be added either

Not sure either.
My two cents: it's worth having documentation that's faithful to current behavior, to better guide users, and it will make it easier to update next time after @jhedberg's work is complete.

FWIW, my ongoing work is over here: #93033. However, it's still in a quite immature state, and I can't promise when it'll be stable enough to be promoted from a Draft to a normal PR.

sonarqubecloud · 2025-08-08T18:09:30Z

Quality Gate passed

Issues
1 New issue
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

Thalley · 2025-08-11T08:10:11Z

include/zephyr/bluetooth/conn.h

+	 * This callback runs on the system workqueue thread,
+	 * the usual precautions apply.


What are the usual precautions? I think that if we want to write such a statement, we should define those and refer to them. If you don't know the usual precautions, then this isn't helpful :)

Totally agree, I too would like to be able to refer to some sort of summary of the recommendations and limitations for each of the system threads (e.g. syswq, BT RX WQ).

For the system workqueue, usual precautions might be:

- avoid "blocking operations that would delay other system workqueue processing to an unacceptable degree", which is emphasized in Workqueue Threads, System Workqueue
- watch out for priority inversion issues, which are well explained in Workqueue Threads, Workqueue Best Practices

Perhaps the most coherent thing to do is to briefly redirect to the references above, rather than duplicating or paraphrasing the content. What do you think?

I think references to the 2 pages you linked make sense if it can be done in a brief way.

One concern about this is that we have many callbacks that run in the system workqueue without being documented as such, and thus documenting just a few of these cases may actually be hurtful rather than helpful, as it somewhat implies that the other callbacks do not need these same precautions (i.e. if there is a written "warning" for only 2 out of 10 things, people will assume that the "warning" does not apply to the remaining 8 things).

If we move towards documenting this for a few callbacks here, then I think we should move towards doing it for all Bluetooth callbacks. What do you think?

Correct, I share your concerns: there are many callbacks, and documenting the execution context of only a few of them can be misleading. We should do that consistently, either documenting all callbacks that run on a specific thread (e.g. the system workqueue), or all callbacks within a subsystem/API.

We also need to find a way to avoid copy-pasting these warnings across the API documentation, perhaps using Doxygen aliases, text or even qualifiers like "isr-ok".

I think this deserves a distinct PR (maybe an RFC), so I removed all related changes from this one.

Copilot

Pull Request Overview

This PR primarily updates the API documentation for Bluetooth connection callbacks to reflect recent changes in the Bluetooth Host stack, particularly around connection lifecycle management and threading context. The updates clarify when callbacks run, deprecate certain behaviors, and provide better guidance on connection management patterns.

- Updates connection callback documentation to reflect current threading behavior and best practices
- Fixes a subtle bug in bt_conn_unref() where a local variable was incorrectly used as an atomic target
- Adds missing else clauses to fix MISRA compliance issues

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `include/zephyr/bluetooth/conn.h` | Updates API documentation for connection callbacks with accurate threading context and usage guidance |
| `subsys/bluetooth/host/conn.c` | Fixes atomic variable usage bug, improves comments, and adds MISRA-compliant else clauses |

Copilot · 2025-08-11T08:24:00Z

subsys/bluetooth/host/conn.c

 	IF_ENABLED(CONFIG_BT_CONN_TX,
-		   (__ASSERT(!(deallocated && k_work_is_pending(&conn->tx_complete_work)),
-			     "tx_complete_work is pending when conn is deallocated")));
+		   (bool conn_tx_is_pending = k_work_is_pending(&conn->tx_complete_work);));


The semicolon after the closing parenthesis creates an empty statement. The line should end with just )).

Suggested change

-		   (bool conn_tx_is_pending = k_work_is_pending(&conn->tx_complete_work);));
+		   (bool conn_tx_is_pending = k_work_is_pending(&conn->tx_complete_work)));

Copilot · 2025-08-11T08:24:01Z

subsys/bluetooth/host/conn.c

 	IF_ENABLED(CONFIG_BT_CONN_TX,
-		   (__ASSERT(!(deallocated && k_work_is_pending(&conn->tx_complete_work)),
-			     "tx_complete_work is pending when conn is deallocated")));
+		   (bool conn_tx_is_pending = k_work_is_pending(&conn->tx_complete_work);));


[nitpick] Variable declaration inside IF_ENABLED macro is unusual and reduces readability. Consider declaring conn_tx_is_pending before the IF_ENABLED block for better clarity.

Copilot · 2025-08-11T08:24:01Z

include/zephyr/bluetooth/conn.h

+	 *  a connection object has actually been freed.
+	 *
+	 * This callback runs on the system workqueue thread,
+	 * the usual precautions apply.


[nitpick] The documentation should be more specific about which precautions apply when running on the system workqueue thread. Consider referencing specific documentation or providing examples of the limitations.

Suggested change

-	 * the usual precautions apply.
+	 * This callback runs on the system workqueue thread.
+	 * When running in this context, avoid blocking operations,
+	 * long-running computations, or using APIs that are not safe
+	 * to call from the system workqueue thread. For more details,
+	 * see the Zephyr documentation on system workqueues:
+	 * https://docs.zephyrproject.org/latest/kernel/services/threads/workqueue.html

I think the next version of the documentation will be fine and address most of the requested improvements and fixes.

An important question we have left to decide relates to "Bluetooth: Host: Fix use of local variable as atomic target", on which @Thalley and I differ (see #89661 (comment)).

I have one last concern: I'm not sure how to cleanly integrate the "tx_complete_work is pending when conn is deallocated" assertion:

- to be consistent, we should not access conn->tx_complete_work after decrementing the reference count
- whatever we do, we will end up with two IF_ENABLED(CONFIG_BT_CONN_TX,...) blocks, one before atomic_dec() to store some state, and another after with the assertion itself, which is already a bit ugly
- the test can't happen within the __ASSERT() statement: when assertions are disabled, the underlying variable will be unused, possibly causing build failures depending on compiler flags (see CI failures); I couldn't figure out how to fix that elegantly, or at least without adding even more ugliness (e.g. a dummy statement using the unused variable outside of the assertion)

I will include what I think is the least ugly approach in the next version of this PR. In the meantime, does anyone have any ideas?

Thanks.

the test can't happen within the __ASSERT() statement: when assertions are disabled, the underlying variable will be unused, possibly causing build failures depending on compiler flags (see CI failures); I couldn't figure out how to fix that elegantly, or at least without adding even more ugliness (e.g. a dummy statement using the unused variable outside of the assertion)

Hmm, it's odd that we see a CI issue here because ASSERT should be enabled in those, and even then we have many other cases where local variables are used only for __ASSERT without causing CI issues (sonarqube and coverity sometimes complain about those though).

Some not-very-nice-looking solutions could be to modify the IF_ENABLED(CONFIG_BT_CONN_TX to also check CONFIG_ASSERT or to apply __maybe_unused to the local variable.

I ended up using a standard #if directive:

- I couldn't figure out how to test both CONFIG_BT_CONN_TX and CONFIG_ASSERT in a single IF_ENABLED()
- checkpatch will parse anything like IF_ENABLED(CONFIG_BT_CONN_TX, (if (IS_ENABLED(CONFIG_ASSERT)) {assignment})) as an assignment in an if condition (error ASSIGN_IN_IF), plus various warnings depending on formatting

Applying __maybe_unused is then necessary since the preprocessor can actually remove all uses of the variable.

It's not that ugly when you read it, and we don't call k_work_is_pending() unnecessarily.

Do you have a better idea?
Thanks for your help.
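
A rough, self-contained sketch of the shape described above, for readers who don't have the diff at hand. It is not the merged code: `SKETCH_CONN_TX` stands in for CONFIG_BT_CONN_TX, `struct conn_tx_sketch` and `unref_tx_sketch()` are hypothetical, the standard `assert()` replaces `__ASSERT()`, and the GCC/Clang `__attribute__((unused))` replaces Zephyr's `__maybe_unused`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for CONFIG_BT_CONN_TX. */
#define SKETCH_CONN_TX 1

/* Hypothetical stand-in for struct bt_conn. */
struct conn_tx_sketch {
	atomic_int ref;
	bool tx_pending;
};

/* Drops one reference; returns true if it was the last one. */
static bool unref_tx_sketch(struct conn_tx_sketch *conn)
{
#if SKETCH_CONN_TX
	/* Capture the state before the decrement: afterwards the object
	 * may be reused. The attribute silences "unused variable" warnings
	 * when assertions are compiled out (NDEBUG).
	 */
	__attribute__((unused)) bool tx_was_pending = conn->tx_pending;
#endif

	int old = atomic_fetch_sub(&conn->ref, 1);

	/* Return early when this was not the last reference. */
	if (old != 1) {
		return false;
	}

#if SKETCH_CONN_TX
	/* Only the captured copy is tested, never the object itself. */
	assert(!tx_was_pending);
#endif

	return true;
}
```

The state is read once, before the decrement, and the assertion only ever looks at the local copy.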

In bt_conn_unref(), a local variable is used as atomic target:

    atomic_val_t old = atomic_dec(&conn->ref);
    /* Prevent from accessing connection object */
    bool deallocated = (atomic_get(&old) == 1);

Using atomic_get() to access a non-shared local variable cannot prevent any data race on that variable, and only causes confusion.

Moreover, this call to atomic_get() is incorrect: the API expects an atomic_t* argument (target), not an atomic_val_t* (value). This compiles and /works/ only because Zephyr defines both to be the same integer type, and thus: atomic_get(&old) == old. The equivalent C11 code, where _Atomic(T) and T are different types, wouldn't compile.

Signed-off-by: Christophe Dufaza <[email protected]>

An assertion in bt_conn_unref() accesses the connection's state after decrementing its reference count. This is not consistent since, if we removed the last reference, the Bluetooth Host stack may reuse the connection object before the assertion is checked.

Instead, retrieve the connection property tested by the assertion before decrementing the counter, as we do for other properties.

Simplify the code path by returning early when we did not remove the last reference.

Add a reminder that automatic advertiser resumption is deprecated.

Signed-off-by: Christophe Dufaza <[email protected]>

The API documentation for the recycled() callback predates [1], and still warns users to "treat this callback as an ISR", although it now runs on the system workqueue thread, as does disconnected().

"Making Bluetooth API calls" to "re-start connectable advertising or scanning" should no longer be "strongly discouraged". On the contrary, we can emphasize that this is the right event to listen for to initiate operations that will try to re-allocate a freed connection object.

Mention that BT_MAX_CONN configures the size of the connection pool.

Refs:
- [1] 2ca59e7: Bluetooth: Host: Defer `conn.recycled()` to the syswq

Signed-off-by: Christophe Dufaza <[email protected]>

The API documentation for the disconnected() callback warns that the listener can't assume that the corresponding connection object has been freed and may be available to the application.

The recommendations given to still start a new connection or connectable advertiser are outdated or misleading:
- "start connectable advertising": the options that "will attempt to resume the advertiser under some conditions" are deprecated since Zephyr 4.0 (BT_LE_ADV_OPT_CONNECTABLE and related)
- "using k_work_submit()": assuming everything will be fine when the work is actually processed is not reliable
- "increase CONFIG_BT_MAX_CONN": setting BT_MAX_CONN to N+1 when planning N simultaneous connections is a work-around that users may have gotten used to (despite its footprint), but there is no longer any reason to advise it

Stop documenting creating new connections or restarting advertising from the disconnected() callback and instead recommend relying on recycled() for these use cases.

Signed-off-by: Christophe Dufaza <[email protected]>

Fix some "All if ... else if constructs shall be terminated with an else clause" (c:M23_112) issues reported by SonarQube. Signed-off-by: Christophe Dufaza <[email protected]>

sonarqubecloud · 2025-09-19T15:46:01Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

dottspina · 2025-09-19T17:54:49Z

@Thalley, thank you for your review.

This PR should now only contain changes we agree on:

- In bt_conn_unref() (2 commits): fix use of local variable as atomic target, fix connection usage after decrementing its reference count
- API documentation update (2 commits): clearly state that recycled() is now the event to listen for to start a new connection or connectable advertiser, do not recommend workarounds for disconnected() (confirmed by @alwa-nordic)

Other considerations that emerged from our discussion:

- It is useful to know the execution context of a callback that one implements, and the API documentation could help with this. Perhaps it's enough to tell users which thread, and assume they do their homework to understand the implications. What to do and how (e.g. Doxygen aliases?) is beyond the scope of this PR and may deserve an RFC (e.g. starting with the Bluetooth subsystem).
- Automatic advertiser resumption is deprecated since Zephyr 4.0: I considered removing the relevant part from bt_conn_unref(), but it wouldn't be consistent if we didn't remove its support altogether (configuration options and related API). I simply added a comment so we don't forget, but it might be worth opening an RFC (AFAICT samples and tests have already been updated).

Thanks.

Thalley

LGTM

If you want to follow up with another PR, I noticed that our ADV API does not mention that it reserves a bt_conn object when starting connectable advertising, nor that the recycled callback is likewise called if you stop connectable advertising :)

github-actions bot added the area: Bluetooth Host and area: Bluetooth labels May 8, 2025

github-actions bot requested review from alwa-nordic, cvinayak, hermabe, jhedberg, rugeGerritsen, sjanc, Thalley and theob-pro May 8, 2025 12:44

github-actions bot assigned jhedberg and alwa-nordic May 8, 2025

jhedberg reviewed May 9, 2025

jhedberg requested a review from ubieda May 9, 2025 07:22

jhedberg reviewed May 9, 2025

dottspina force-pushed the pr-upd-bt-conn-callbacks-doc branch from e85c7de to 791dd18 Compare May 12, 2025 14:39

dottspina changed the title ~~Update connection callbacks API documentation~~ Bluetooth: Host: Update connection callbacks API documentation and minor fixes May 12, 2025

github-actions bot added the Stale label Jul 12, 2025

Thalley removed the Stale label Jul 14, 2025

Thalley reviewed Jul 14, 2025

dottspina force-pushed the pr-upd-bt-conn-callbacks-doc branch 2 times, most recently from 386320c to a6ee0ad Compare August 8, 2025 15:04

zephyrbot requested review from HaavardRei and PavelVPV August 8, 2025 15:05

dottspina force-pushed the pr-upd-bt-conn-callbacks-doc branch 2 times, most recently from b3c4dfb to aac405a Compare August 8, 2025 18:03

Thalley reviewed Aug 11, 2025

Thalley requested review from Copilot and removed request for theob-pro August 11, 2025 08:22

Copilot AI reviewed Aug 11, 2025

dottspina force-pushed the pr-upd-bt-conn-callbacks-doc branch 2 times, most recently from 1211868 to 365df2f Compare September 19, 2025 12:54

dottspina added 4 commits September 19, 2025 16:43

Bluetooth: Host: Fix some MISRA c:M23_112 warnings

48c5722

Fix some "All if ... else if constructs shall be terminated with an else clause" (c:M23_112) issues reported by SonarQube. Signed-off-by: Christophe Dufaza <[email protected]>

dottspina force-pushed the pr-upd-bt-conn-callbacks-doc branch from 365df2f to 48c5722 Compare September 19, 2025 15:29

jhedberg approved these changes Sep 22, 2025

Thalley approved these changes Sep 22, 2025

cfriedt merged commit 69cab3f into zephyrproject-rtos:main Sep 22, 2025
28 checks passed

dottspina deleted the pr-upd-bt-conn-callbacks-doc branch September 24, 2025 03:12
