
Conversation

ludvigsj (Contributor)

According to MshPrt 5.4.4, the Provisioner, upon receiving the Provisioning Failed PDU, shall assume that the provisioning failed and immediately disconnect the provisioning bearer.

Also changes the link close upon success to use the `prov_link_close` helper function instead of doing it manually, as a minor cleanup.

@PavelVPV (Contributor)

The "Also" part should be moved into a separate commit. Otherwise looks fine.

According to MshPrt 5.4.4, the Provisioner, upon receiving the
Provisioning Failed PDU, shall assume that the provisioning failed and
immediately disconnect the provisioning bearer.

Signed-off-by: Ludvig Jordet <[email protected]>
Changes the link close upon success to use the `prov_link_close` helper
function instead of doing it manually, as minor cleanup.

Signed-off-by: Ludvig Jordet <[email protected]>
@ludvigsj ludvigsj force-pushed the develop/prov_close_link_on_failed branch from 210402f to a93a15a Compare October 14, 2025 07:47
```c
static void prov_failed(const uint8_t *data)
{
	LOG_WRN("Error: 0x%02x", data[0]);
	reset_state();
```
Contributor

I didn't get why the provisioner does not clear the CDB and state flags, and doesn't release the Diffie-Hellman key pair, upon receiving a Provisioning Failed frame.
Seems incorrect to me.
It should be:

```c
prov_fail(PROV_BEARER_LINK_STATUS_FAIL);
reset_state();
```
ludvigsj (Contributor, Author)

If you look in the `prov_link_closed` callback, the reset is done there.

@alxelax (Contributor), Oct 14, 2025

There is no guarantee this callback is called. For PB-GATT this will never trigger this callback at all.
Cleaning data in this callback is mostly meant for unpredictable link closing, like a Host timeout on the GATT connection, or the application closing the link over the API.

Contributor

Even a simple case with PB-ADV will cause a zombie node in the CDB and a heap leak through mbedTLS once mesh cannot find a free adv structure; that might happen at any time and depends on the customer configuration.

Contributor

`reset_state` will be called after the link is closed.
`prov_link_close` -> `bearer->link_close`:

For PB-ADV:
`prov_link_close` (pb_adv.c) -> send Link Close PDU -> `buf_sent` -> `close_link` -> `role->link_closed` -> `prov_link_closed` (provisioner.c) -> `reset_state`.

For PB-GATT:
`prov_link_close` (pb_gatt.c) -> `bt_conn_disconnect` -> `gatt_disconnected` (pb_gatt_srv.c) -> `bt_mesh_pb_gatt_close` (pb_gatt.c) -> `link_closed` -> `role->link_closed` -> `prov_link_closed` (provisioner.c) -> `reset_state`.

Contributor

> Also, I'm pretty sure this change breaks cleanup for a provisioner that works on top of the RPR Client. When it receives a link report "Closed by device", it doesn't clear the CDB and keys anymore.

"Pretty sure" -> can you elaborate? Because I don't really see this. As I showed above, for PB-ADV, reset_state is eventually called after sending the Link Close PDU. For PB-GATT, the link is a regular BLE link and is thus closed through the BLE API; once it is closed, reset_state is called. I don't see how it can not be called. The only difference between the old and the new behavior is message sending (in the case of PB-ADV) and link close (in the case of PB-GATT).

Also, PB-GATT is not supported by the RPR Client at the moment.

You described it correctly, but this is applicable only to the RPR Server. Further, the RPR Server handles pb_link_closed, which will initiate a link report with status BT_MESH_RPR_ERR_LINK_CLOSED_BY_DEVICE (https://github.com/zephyrproject-rtos/zephyr/blob/main/subsys/bluetooth/mesh/rpr_srv.c#L478).
Then the RPR Client receives it and calls the closed callback for its own provisioner: https://github.com/zephyrproject-rtos/zephyr/blob/main/subsys/bluetooth/mesh/rpr_cli.c#L74
That will cause the same code to run, but on the RPR Client side: https://github.com/zephyrproject-rtos/zephyr/blob/main/subsys/bluetooth/mesh/provisioner.c#L698-L706
You removed the reset functionality here and rely on the RPR Client bearer callback, which for some reason is called again after this PR (that seems weird by itself). But follow further:
https://github.com/zephyrproject-rtos/zephyr/blob/main/subsys/bluetooth/mesh/rpr_cli.c#L742-L750

This implementation does not assume any callbacks at all, neither about success nor failure.
Finally, a provisioner based on the RPR Client gets a zombie node and lost keys.

Contributor

> You described correctly, but this is applicable only to rpr server

Why? provisioner.c runs on the RPR Client, not the RPR Server.

> you removed reset functionality here and rely on rpr client bearer callback that is called for some reason again after this PR (it seems weird by self).

This I don't get. This discussion points to the reset_state -> prov_link_close change in the prov_failed callback, which is called when the Provisioning Failed PDU is received. I don't really understand your sentence: what does "that is called for some reason again after this PR" mean (where is the "again")?

In the case of RPR, when the Provisioning Failed PDU is received, the RPR Client will, with this new change, have this call flow:
`prov_link_close` (provisioner.c) -> `pb_link_close` (rpr_cli.c) -> `link_close` -> `link_timeout` / `handle_link_status` -> `bearer.cb->link_closed`, which is `prov_link_closed` (prov.c) -> `role->link_closed`, which is `prov_link_closed` (provisioner.c) -> `reset_state`.

@PavelVPV (Contributor), Oct 15, 2025

And note, this PR aligns the behavior with other cases where processing a provisioning PDU ends up in failure: any PDU handler --(error)--> `prov_fail()` -> `prov_link_close()`. No orphaned CDB entries in those cases, right?

Even the missing error handling of the bt_conn_disconnect call is not a big problem, as there is a protocol timeout. It is a separate problem that the bt_conn_disconnect error is not handled in the timeout handler either:

```c
/* If connection failed or timeout, not allow establish connection */
if (IS_ENABLED(CONFIG_BT_MESH_PB_GATT_CLIENT) &&
    atomic_test_bit(bt_mesh_prov_link.flags, PROVISIONER)) {
	if (link.conn) {
		(void)bt_conn_disconnect(link.conn,
					 BT_HCI_ERR_REMOTE_USER_TERM_CONN);
```

But IMHO, this is a different problem and can be addressed in a separate task.

Contributor

I give up arguing about this. I still think it is not a good idea to rely on the Host's reliability to clean up internal mesh resources. Will the Host call it in 100% of cases or not? I do not know.

Contributor

> Will Host call it in 100% cases or will not? I do not know.

It is guaranteed by the Host to call the disconnected callback, yes:

```c
 * This callback notifies the application that a connection
 * has been disconnected.
```

If this doesn't work, then it is a Host bug and should be fixed in the Host, not in Mesh.

I agree with the point that we should check the return value of bt_conn_disconnect. But still, IMO this can be done separately, as it is neither handled here nor in the timeout handler. I'd say it is up to @ludvigsj whether to fix this error handling in this PR or not. If not, let's just create a follow-up task.


@ludvigsj ludvigsj requested a review from alxelax October 14, 2025 08:29
@PavelVPV (Contributor) left a comment

Though this is not stated in the contribution guidelines, it is generally better to start the commit subject with a verb. Approving anyway.

```c
{
	LOG_WRN("Error: 0x%02x", data[0]);
	reset_state();
	prov_link_close(PROV_BEARER_LINK_STATUS_FAIL);
```
Contributor

I guess the provisioner should analyze the Error Code value before closing the link.

> 5.4.4
> ...
> When the Provisionee or the Provisioner receives a message with a field set to a value that is Prohibited, or with a bit set to 1 within a bitfield indicated as Prohibited, the provisioning protocol shall fail and the message shall be treated as an error in the provisioning protocol.
> ...

If the Error Code is not from Table 5.41, it should call the error callback before closing the link.

Contributor

Ok, it will be `prov_fail`.

@cfriedt cfriedt merged commit f998357 into zephyrproject-rtos:main Oct 15, 2025
29 checks passed
@ludvigsj ludvigsj deleted the develop/prov_close_link_on_failed branch October 17, 2025 08:36