net: coap: Pull various fixes to CoAP client #2205

SeppoTakalo · 2024-10-31T08:30:08Z

Pull fixes from upstream to coap_client used by nRF Cloud libraries.

Internal issue reported: NCSDK-29890

rlubos · 2024-10-31T10:53:29Z

I'm puzzled why wasn't the compliance issue reported in the upstream PR?

LINE_SPACING: Missing a blank line after declarations
File:tests/net/lib/coap_client/src/stubs.c

About the Commit tags failure, were the commits modified somehow? Note, we should not do that, if there are conflicts, pull dependencies.

rlubos · 2024-10-31T12:58:48Z

@carlescufi IMO this qualifies for an exception from the conflict-free cherry-pick rule, fixing dependencies would require to pull this tree-wide change, which does not apply cleanly on it's own:
zephyrproject-rtos/zephyr@c4803752a86
IMO pursuing that further is pointess, let's just ignore the check for this PR.

Constrained application can require to avoid waiting for response from a server. CoAP No-response option defined in RFC 7967 allows to do that by suppressing selected response types. Signed-off-by: Hubert Miś <[email protected]> (cherry picked from commit 70c9bf1)

This commit makes sure we continue to wait for extra confirmations even after the request is done so we can handle duplicate confirmations if any. Detailed description: rfc7252#section-4.5 specifies that: "The recipient SHOULD acknowledge each duplicate copy of a Confirmable message". So if, for example, the client sends to a multicast destination address, the server will get multiple requests and will confirm all of them. Without this commit, the client will set the request to done after receiving the first answer. From here the request object will be marked as free and the duplicate acknowledgements will stay buffered in the network stack. Once the client tries to send a new request, it will unbuffer those duplicate acknowledgements but now the request object is unallocated so the client won't be able to handle those acknowledgements as duplicates. It will instead treat it as an unexpected ACK. To work around this issue, rfc7252#section-4.8.2 states that: "EXCHANGE_LIFETIME is the time from starting to send a Confirmable message to the time when an acknowledgement is no longer expected, i.e., message-layer information about the message exchange can be purged." Keeping the request object allocated for EXCHANGE_LIFETIME ensures that duplicate acknowledgements can be handled accordingly. This commit adds a basic implementation of what is stated in the RFC. EXCHANGE_LIFETIME has been arbitrarily set to 3 * ACK_TIMEOUT which seems more reasonable than the 247 seconds stated in the RFC. Signed-off-by: Francois Gervais <[email protected]> (cherry picked from commit bfed8d0)

…llocated As confirmable requests will stay allocated for (3 * ACK_TIMEOUT), we need to adjust the timings so all requests are unallocated by the end of the test. Signed-off-by: Francois Gervais <[email protected]> (cherry picked from commit 13cad59)

Passing MACRO arguments that need expansion require _CONCAT. Signed-off-by: Pieter De Gendt <[email protected]> (cherry picked from commit 476c12f)

The goal of this commit is to reduce the overall test time by optimizing the time spent sleeping. However while doing so, a few glitches became apparent and those have been fixed as well. 1. The way the ZSOCK_POLLIN event is managed has been modified In most tests, the event would stick for the whole test and this would sometimes results in the client receiving more data than intended which in turn would results in various unexpected warnings and errors. The management of the ZSOCK_POLLIN event has now been mostly moved to the send/receive overrides which better represent how things would happen outside of the test scenario. For example, when sending data, if this send would normally result in receiving some response, the send function sets the ZSOCK_POLLIN event. Then when receiving, we clear the event as the data has been received. There are a few exceptions to this for cases where we need to simulate some specific abnormal behavior. 2. The `messages_needing_response` queue now includes a valid bit The test manages a queue of messages id to be able to respond to the correct id in the receive hooks. It was built in a way that the id 0 would be treated as invalid although it is a valid id. In practice, it would not be an issue in most cases however, it would result in an unexpected behavior if the `test_multiple_requests` test would be run first. To fix this issue in the simplest way, the queue has been changed to uint32_t which gives us 16 extra bits for the management of the queue, 1 of which is used to tell if the entry is valid or not. Signed-off-by: Francois Gervais <[email protected]> (cherry picked from commit 736344b)

…rable Extend the `coap_transmission_parameters` struct with the field `ack_random_percent`. This was the last remaining CoAP transmission parameter that was not configurable at runtime. Signed-off-by: Adrian Friedli <[email protected]> (cherry picked from commit 98289e5)

Before this patch, any unexpected socket error during poll (caused by LTE disconnects for instance) would lead to a infinite loop because the error state was never cleared, handled or even signaled to the user. Signed-off-by: Benjamin Lindqvist <[email protected]> (cherry picked from commit f8a7035)

Protect global list of clients with mutex. Signed-off-by: Seppo Takalo <[email protected]> (cherry picked from commit 1ea569d)

…ailed to send When transmission of first request fails, reset the internal request buffer as there is no ongoing CoAP transaction. Application can deal with the failure. Signed-off-by: Seppo Takalo <[email protected]> (cherry picked from commit 46b7c84)

…etrying CoAP msg Refactor the CoAP retry handling into the handle_poll() function, so that we only try to send retries if the socket reports POLLOUT. Also move the receiving into same loop, so when poll() reports POLLIN we recv() the message and handle it before proceeding to other sockets. Also fix tests to handle POLLOUT flag and add support for testing multiple clients. Signed-off-by: Seppo Takalo <[email protected]> (cherry picked from commit 4c6dd4c)

…ng loop Forward recv() errors to handle_poll(), so there is only one place to handle error codes. Signed-off-by: Seppo Takalo <[email protected]> (cherry picked from commit 6481b0e)

… send() failure If send() fails, we have not technically send the CoAP retry yet, so restore the same pending structure, so our timeouts and retry counters stay the same. This will trigger a retry next time the poll() return POLLOUT, so we know that we can send. Signed-off-by: Seppo Takalo <[email protected]> (cherry picked from commit 623a1ff)

…stead of flagging It is error prone to flag separate booleans, so try to use reset_internal_request() every time we release the internal request structure. Also refactor the reset_internal_request() so that we reset the timeout value so it does not trigger again. Signed-off-by: Seppo Takalo <[email protected]> (cherry picked from commit a14f083)

Fix handling of received CoAP reset. Signed-off-by: Seppo Takalo <[email protected]> (cherry picked from commit 1890dbd)

Add helper API to construct CoAP Reset message. Signed-off-by: Seppo Takalo <[email protected]> (cherry picked from commit e96e95b)

Response tokens are already compared in get_request_with_token(). Signed-off-by: Seppo Takalo <[email protected]> (cherry picked from commit 1dc2487)

When receiving unknown response, respond with CoAP Reset. Signed-off-by: Seppo Takalo <[email protected]> (cherry picked from commit 350d20e)

If we receive poll() error during a request, it must be forwarded to application. If instead receive poll() error later, it should not be forwarded as it might be result of application closing the socket. Signed-off-by: Seppo Takalo <[email protected]> (cherry picked from commit 9c9dc9f)

In receiving thread, continuing the loops is based on has_ongoing_exchanges() so it does not need atomic coap_client_recv_active variable. When idling, it wakes from semaphore. But there was potential deadlock when coap_client_schedule_poll() would not signal the semaphore, if atomic variable was already showing that it runs. Removing the atomic variable removes this deadlock. Signed-off-by: Seppo Takalo <[email protected]> (cherry picked from commit 1e5a537)

SeppoTakalo · 2024-11-01T10:20:29Z

Updated the PR.
Now contains only [nrf fromtree] as the upstream PR is now merged.

Also fixed some merging issues as I previously had some commits in wrong order. Should apply more cleanly now.

SeppoTakalo · 2024-11-01T10:23:01Z

a bit unfortunate that our CI runs those compliance checks that upsream did not. Now it complains about styling on commits that are already merged in.

rlubos · 2024-11-04T09:27:38Z

a bit unfortunate that our CI runs those compliance checks that upsream did not. Now it complains about styling on commits that are already merged in.

I guess there's no way around then other than taking this as is.

SeppoTakalo requested review from juhaylinen, plskeggs and rlubos October 31, 2024 08:30

This was referenced Oct 31, 2024

manifest: sdk-zephyr: net: coap: Pull various fixes to CoAP client nrfconnect/sdk-nrf#18413

Merged

manifest: Pull in coap_client fixes nrfconnect/sdk-nrf#18414

Closed

SeppoTakalo force-pushed the coap_client_nrf_fix branch 2 times, most recently from 135bbfb to 3def4ab Compare October 31, 2024 10:21

juhaylinen approved these changes Oct 31, 2024

View reviewed changes

SeppoTakalo force-pushed the coap_client_nrf_fix branch from 3def4ab to 08f0a94 Compare October 31, 2024 14:41

hubertmis and others added 19 commits November 1, 2024 12:02

[nrf fromtree] include: zephyr: net: coap_service: Use _CONCAT expansion

79cad38

Passing MACRO arguments that need expansion require _CONCAT. Signed-off-by: Pieter De Gendt <[email protected]> (cherry picked from commit 476c12f)

[nrf fromtree] net: lib: coap_client: Protect initialization with mutex

4f8da1b

Protect global list of clients with mutex. Signed-off-by: Seppo Takalo <[email protected]> (cherry picked from commit 1ea569d)

[nrf fromtree] net: lib: coap_client: Forward recv() errors to handli…

c7852c0

…ng loop Forward recv() errors to handle_poll(), so there is only one place to handle error codes. Signed-off-by: Seppo Takalo <[email protected]> (cherry picked from commit 6481b0e)

[nrf fromtree] net: lib: coap_client: Fix reset handling

f3ac92a

Fix handling of received CoAP reset. Signed-off-by: Seppo Takalo <[email protected]> (cherry picked from commit 1890dbd)

[nrf fromtree] net: coap: Add API to send reset message

a1dd4f6

Add helper API to construct CoAP Reset message. Signed-off-by: Seppo Takalo <[email protected]> (cherry picked from commit e96e95b)

[nrf fromtree] net: lib: coap_client: Remove duplicate token comparison

78c7c1d

Response tokens are already compared in get_request_with_token(). Signed-off-by: Seppo Takalo <[email protected]> (cherry picked from commit 1dc2487)

[nrf fromtree] net: lib: coap_client: Send RST for unknown queries

16a34d0

When receiving unknown response, respond with CoAP Reset. Signed-off-by: Seppo Takalo <[email protected]> (cherry picked from commit 350d20e)

SeppoTakalo force-pushed the coap_client_nrf_fix branch from 08f0a94 to 5c922fc Compare November 1, 2024 10:18

SeppoTakalo added this to the ncs-2.8.0 milestone Nov 1, 2024

rlubos approved these changes Nov 4, 2024

View reviewed changes

rlubos added the backport v3.7.99-ncs1-branch Relates to NCS v2.8-branch label Nov 4, 2024

rlubos merged commit 53b26fb into nrfconnect:main Nov 4, 2024
18 of 19 checks passed

NordicBuilder mentioned this pull request Nov 4, 2024

[Backport v3.7.99-ncs1-branch] net: coap: Pull various fixes to CoAP client #2220

Merged

SeppoTakalo deleted the coap_client_nrf_fix branch November 4, 2024 14:06

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

net: coap: Pull various fixes to CoAP client #2205

net: coap: Pull various fixes to CoAP client #2205

Uh oh!

SeppoTakalo commented Oct 31, 2024

Uh oh!

rlubos commented Oct 31, 2024

Uh oh!

rlubos commented Oct 31, 2024

Uh oh!

SeppoTakalo commented Nov 1, 2024

Uh oh!

SeppoTakalo commented Nov 1, 2024

Uh oh!

rlubos commented Nov 4, 2024

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

8 participants

net: coap: Pull various fixes to CoAP client #2205

net: coap: Pull various fixes to CoAP client #2205

Uh oh!

Conversation

SeppoTakalo commented Oct 31, 2024

Uh oh!

rlubos commented Oct 31, 2024

Uh oh!

rlubos commented Oct 31, 2024

Uh oh!

SeppoTakalo commented Nov 1, 2024

Uh oh!

SeppoTakalo commented Nov 1, 2024

Uh oh!

rlubos commented Nov 4, 2024

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

8 participants