Commit acd43cb

pabigot authored and jukkar committed
net: timeout: refactor to fix multiple problems
The net_timeout structure is documented to exist because of behavior that is no longer true, i.e. that `k_delayed_work_submit()` supports only delays up to INT32_MAX milliseconds. Nonetheless, use of 32-bit timestamps within the work handlers means the restriction is still present.

This infrastructure is currently used for two timers with long durations:

* address for IPv6 addresses
* prefix for IPv6 prefixes

The handling of rollover was subtly different between these: address wraps reset the start time while prefix wraps did not.

The calculation of remaining time in ipv6_nbr was incorrect when the original requested time in seconds was a multiple of NET_TIMEOUT_MAX_VALUE: the remainder value would be zero while the wrap counter was positive, causing the calculation to indicate no time remained.

The maximum value was set to allow a 100 ms latency between elapse of the deadline and assessment of a given timer, but detection of rollover assumed that the captured time in the work handler was precisely the expected deadline, which is unlikely to be true. Use of the shared system work queue also risks observed latency exceeding 100 ms. These calculations could produce delays to next event that exceeded the maximum delay, which introduced special cases.

Refactor so all operations that use this structure are encapsulated into API that is documented and has a full-coverage unit test. Switch to the standard mechanism of detecting completed deadlines by calculating the signed difference between the deadline and the current time, which eliminates some special cases.

Uniformly rely on scanning the set of timers to determine the next deadline, rather than assuming that the most recent update is always next.

Signed-off-by: Peter Bigot <[email protected]>
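The "scan the set of timers" approach named in the commit message can be illustrated with a small stand-alone sketch. It is not the code from this commit: the real timers are linked through a `sys_snode_t` and evaluated with `net_timeout_evaluate()`, whereas this sketch assumes a plain array and a hypothetical `struct demo_timer` that stores only the milliseconds left. All `demo_`/`DEMO_` names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as NET_TIMEOUT_MAX_VALUE after this commit. */
#define DEMO_MAX_DELAY_MS ((uint32_t)INT32_MAX)

/* Hypothetical per-timer state: milliseconds left, 0 meaning completed. */
struct demo_timer {
	uint64_t remaining_ms;
};

/* Delay before this timer needs attention again, capped at the maximum. */
uint32_t demo_timer_delay(const struct demo_timer *timer)
{
	return (timer->remaining_ms > DEMO_MAX_DELAY_MS)
	       ? DEMO_MAX_DELAY_MS : (uint32_t)timer->remaining_ms;
}

/* Scan every timer and return the shortest nonzero delay,
 * or 0 when all timers have completed.
 */
uint32_t demo_next_delay(const struct demo_timer *timers, size_t count)
{
	uint32_t next = 0U;
	size_t i;

	assert(timers != NULL || count == 0U);
	for (i = 0U; i < count; i++) {
		uint32_t delay = demo_timer_delay(&timers[i]);

		/* Completed timers (delay 0) never shorten the wait. */
		if (delay != 0U && (next == 0U || delay < next)) {
			next = delay;
		}
	}
	return next;
}
```

Because the minimum is recomputed over the whole set, the result does not depend on which timer was updated most recently, which is the assumption the commit message says the old code relied on.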
1 parent 9f95d8d commit acd43cb

11 files changed: +718 -223 lines changed

doc/reference/networking/net_timeout.rst

Lines changed: 44 additions & 4 deletions
@@ -10,10 +10,50 @@ Network Timeout
 Overview
 ********
 
-The ``k_delayed_work`` API has a 1 ms accuracy for a timeout value,
-so the maximum timeout can be about 24 days. Some network timeouts
-are longer than this, so the net_timeout API provides a generic timeout
-mechanism that tracks such wraparounds and restarts the timeout as needed.
+Zephyr's network infrastructure mostly uses the millisecond-resolution uptime
+clock to track timeouts, with both deadlines and durations measured with
+32-bit unsigned values. The 32-bit value rolls over at 49 days 17 hours 2 minutes
+47.296 seconds.
+
+Timeout processing is often affected by latency, so that the time at which the
+timeout is checked may be some time after it should have expired. Handling
+this correctly without arbitrary expectations of maximum latency requires that
+the maximum delay that can be directly represented be a 31-bit non-negative
+number (``INT32_MAX``), which overflows at 24 days 20 hours 31 minutes 23.648
+seconds.
+
+Most network timeouts are shorter than the delay rollover, but a few protocols
+allow for delays that are represented as unsigned 32-bit values counting
+seconds, which corresponds to a 42-bit millisecond count.
+
+The net_timeout API provides a generic timeout mechanism to correctly track
+the remaining time for these extended-duration timeouts.
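The figures quoted in this overview follow from plain integer arithmetic, which the sketch below reproduces. It is only an illustration and not part of the commit; `struct demo_duration`, `demo_split_ms()`, `demo_bits_needed()` and `demo_check_overview_figures()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: a millisecond count split into fields. */
struct demo_duration {
	uint64_t days;
	uint32_t hours;
	uint32_t minutes;
	uint32_t seconds;
	uint32_t millis;
};

/* Split a millisecond count into days, hours, minutes, seconds and ms. */
struct demo_duration demo_split_ms(uint64_t ms)
{
	struct demo_duration d;

	d.millis = (uint32_t)(ms % 1000U);
	ms /= 1000U;
	d.seconds = (uint32_t)(ms % 60U);
	ms /= 60U;
	d.minutes = (uint32_t)(ms % 60U);
	ms /= 60U;
	d.hours = (uint32_t)(ms % 24U);
	d.days = ms / 24U;
	return d;
}

/* Count the bits needed to represent a value (0 needs no bits). */
unsigned int demo_bits_needed(uint64_t value)
{
	unsigned int bits = 0U;

	while (value != 0U) {
		bits++;
		value >>= 1;
	}
	return bits;
}

/* Check the three figures quoted in the overview. */
void demo_check_overview_figures(void)
{
	/* 2^32 ms: rollover of the unsigned 32-bit millisecond clock. */
	struct demo_duration full = demo_split_ms((uint64_t)UINT32_MAX + 1U);
	/* 2^31 ms: the first value past INT32_MAX. */
	struct demo_duration half = demo_split_ms((uint64_t)INT32_MAX + 1U);

	assert(full.days == 49U && full.hours == 17U && full.minutes == 2U);
	assert(full.seconds == 47U && full.millis == 296U);
	assert(half.days == 24U && half.hours == 20U && half.minutes == 31U);
	assert(half.seconds == 23U && half.millis == 648U);
	/* UINT32_MAX seconds expressed in milliseconds needs 42 bits. */
	assert(demo_bits_needed((uint64_t)UINT32_MAX * 1000U) == 42U);
}
```

The 42-bit figure comes from multiplying a 32-bit second count by 1000, which is slightly less than 2^10 and so adds ten bits.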
+
+Use
+***
+
+The simplest use of this API is:
+
+#. Configure a network timeout using :c:func:`net_timeout_set()`.
+#. Use :c:func:`net_timeout_evaluate()` to determine how long it is until the
+   timeout occurs. Schedule a timeout to occur after this delay.
+#. When the timeout callback is invoked, use :c:func:`net_timeout_evaluate()`
+   again to determine whether the timeout has completed, or whether there is
+   additional time remaining. If the latter, reschedule the callback.
+#. While the timeout is running, use :c:func:`net_timeout_remaining` to get
+   the number of seconds until the timeout expires. This may be used to
+   explicitly update the timeout, which should be done by canceling any
+   pending callback and restarting from step 1 with the new timeout.
+
+The :c:struct:`net_timeout` contains a ``sys_snode_t`` that allows multiple
+timeout instances to be aggregated to share a single kernel timer element.
+The application must use :c:func:`net_timeout_evaluate()` on all instances to
+determine the next timeout event to occur.
+
+:c:func:`net_timeout_deadline()` may be used to reconstruct the full-precision
+deadline of the timeout. This exists primarily for testing but may have use
+in some applications, as it does allow a millisecond-resolution calculation of
+remaining time.
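The set/evaluate/reschedule cycle in steps 1 to 3 can be simulated off-target with a simplified model. The sketch below is an assumption-laden stand-in, not the implementation added in `net_timeout.c`: it keeps a single 64-bit remaining count instead of the `timer_timeout`/`wrap_counter` pair, and every `demo_`/`DEMO_` name is hypothetical. What it shares with the documented API is the contract that an evaluate call returns 0 on completion and otherwise a delay of at most `INT32_MAX` milliseconds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as NET_TIMEOUT_MAX_VALUE after this commit. */
#define DEMO_MAX_DELAY_MS ((uint32_t)INT32_MAX)

/* Hypothetical stand-in for struct net_timeout (no list node). */
struct demo_timeout {
	uint32_t last_eval;    /* low 32 bits of uptime at last set/evaluate */
	uint64_t remaining_ms; /* time left as of last_eval */
};

/* Model of step 1: start a timeout of lifetime_s seconds at time now. */
void demo_timeout_set(struct demo_timeout *t, uint32_t lifetime_s, uint32_t now)
{
	assert(t != NULL);
	t->last_eval = now;
	t->remaining_ms = (uint64_t)lifetime_s * 1000U;
}

/* Model of steps 2 and 3: account for elapsed time and return the next
 * delay in ms, or 0 once the timeout has completed.
 */
uint32_t demo_timeout_evaluate(struct demo_timeout *t, uint32_t now)
{
	uint32_t elapsed;

	assert(t != NULL);
	elapsed = now - t->last_eval; /* well defined modulo 2^32 */
	t->last_eval = now;

	if (elapsed >= t->remaining_ms) {
		t->remaining_ms = 0U;
		return 0U;
	}
	t->remaining_ms -= elapsed;
	return (t->remaining_ms > DEMO_MAX_DELAY_MS)
	       ? DEMO_MAX_DELAY_MS : (uint32_t)t->remaining_ms;
}

/* Simulate the whole cycle and count how many callbacks get scheduled.
 * latency_ms models how late each callback runs after its delay elapses.
 */
uint32_t demo_count_callbacks(uint32_t lifetime_s, uint32_t start,
			      uint32_t latency_ms)
{
	struct demo_timeout t;
	uint32_t now = start;
	uint32_t callbacks = 0U;
	uint32_t delay;

	demo_timeout_set(&t, lifetime_s, now);
	delay = demo_timeout_evaluate(&t, now);
	while (delay != 0U) {
		callbacks++;
		now += delay + latency_ms; /* callback fires, possibly late */
		delay = demo_timeout_evaluate(&t, now);
	}
	return callbacks;
}
```

Under this model a lifetime of `UINT32_MAX` seconds with no latency needs 2000 full-length delays plus one final 1000 ms delay, and a late callback is still recognized as completion because elapsed time is compared against what remains rather than matched against an exact deadline.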
 
 API Reference
 *************

include/net/net_timeout.h

Lines changed: 114 additions & 16 deletions
@@ -6,6 +6,7 @@
 
 /*
  * Copyright (c) 2018 Intel Corporation
+ * Copyright (c) 2020 Nordic Semiconductor ASA
  *
  * SPDX-License-Identifier: Apache-2.0
  */
@@ -21,40 +22,137 @@
  */
 
 #include <string.h>
-#include <zephyr/types.h>
 #include <stdbool.h>
+#include <limits.h>
+#include <zephyr/types.h>
+#include <sys/slist.h>
 
 #ifdef __cplusplus
 extern "C" {
 #endif
 
-/** Let the max timeout be 100 ms lower because of
- * possible rounding in delayed work implementation.
+/** @brief Divisor used to support ms resolution timeouts.
+ *
+ * Because delays are processed in work queues which are not invoked
+ * synchronously with clock changes we need to be able to detect timeouts
+ * after they occur, which requires comparing "deadline" to "now" with enough
+ * "slop" to handle any observable latency due to "now" advancing past
+ * "deadline".
+ *
+ * The simplest solution is to use the native conversion of the well-defined
+ * 32-bit unsigned difference to a 32-bit signed difference, which caps the
+ * maximum delay at INT32_MAX. This is compatible with the standard mechanism
+ * for detecting completion of deadlines that do not overflow their
+ * representation.
  */
-#define NET_TIMEOUT_MAX_VALUE ((uint32_t)(INT32_MAX - 100))
+#define NET_TIMEOUT_MAX_VALUE ((uint32_t)INT32_MAX)
 
-/** Generic struct for handling network timeouts */
+/** Generic struct for handling network timeouts.
+ *
+ * Except for the linking node, all access to state from these objects must go
+ * through the defined API.
+ */
 struct net_timeout {
-	/** Used to track timers */
+	/** Used to link multiple timeouts that share a common timer infrastructure.
+	 *
+	 * For example, a set of related timers may use a single delayed work
+	 * structure, which is always scheduled at the shortest time to a
+	 * timeout event.
+	 */
 	sys_snode_t node;
 
-	/** Address lifetime timer start time */
+	/* Time at which the timer was last set.
+	 *
+	 * This usually corresponds to the low 32 bits of k_uptime_get(). */
 	uint32_t timer_start;
 
-	/** Address lifetime timer timeout in milliseconds. Note that this
-	 * value is signed as k_delayed_work_submit() only supports signed
-	 * delay value.
+	/* Portion of remaining timeout that does not exceed
+	 * NET_TIMEOUT_MAX_VALUE.
+	 *
+	 * This value is updated in parallel with timer_start and wrap_counter
+	 * by net_timeout_evaluate().
 	 */
-	int32_t timer_timeout;
+	uint32_t timer_timeout;
 
-	/** Timer wrap count. Used if the timer timeout is larger than
-	 * about 24 days. The reason we need to track wrap arounds, is
-	 * that the timer timeout used in k_delayed_work_submit() is
-	 * 32-bit signed value and the resolution is 1ms.
+	/* Timer wrap count.
+	 *
+	 * This tracks multiples of NET_TIMEOUT_MAX_VALUE milliseconds that
+	 * have yet to pass. It is also updated along with timer_start and
+	 * timer_timeout by net_timeout_evaluate().
 	 */
-	int32_t wrap_counter;
+	uint32_t wrap_counter;
 };
 
+/** @brief Configure a network timeout structure.
+ *
+ * @param timeout a pointer to the timeout state.
+ *
+ * @param lifetime the duration of the timeout in seconds.
+ *
+ * @param now the time at which the timeout started counting down, in
+ * milliseconds. This is generally a captured value of k_uptime_get_32().
+ */
+void net_timeout_set(struct net_timeout *timeout,
+		     uint32_t lifetime,
+		     uint32_t now);
+
+/** @brief Return the 64-bit system time at which the timeout will complete.
+ *
+ * @note Correct behavior requires invocation of net_timeout_evaluate() at its
+ * specified intervals.
+ *
+ * @param timeout a pointer to the timeout state, initialized by
+ * net_timeout_set() and maintained by net_timeout_evaluate().
+ *
+ * @param now the full-precision value of k_uptime_get() relative to which the
+ * deadline will be calculated.
+ *
+ * @return the value of k_uptime_get() at which the timeout will expire.
+ */
+int64_t net_timeout_deadline(const struct net_timeout *timeout,
+			     int64_t now);
+
+/** @brief Calculate the remaining time to the timeout in whole seconds.
+ *
+ * @note This function rounds the remaining time down, i.e. if the timeout
+ * will occur in 3500 milliseconds the value 3 will be returned.
+ *
+ * @note Correct behavior requires invocation of net_timeout_evaluate() at its
+ * specified intervals.
+ *
+ * @param timeout a pointer to the timeout state
+ *
+ * @param now the time relative to which the estimate of remaining time should
+ * be calculated. This should be a recently captured value from
+ * k_uptime_get_32().
+ *
+ * @retval 0 if the timeout has completed.
+ * @retval positive the remaining duration of the timeout, in seconds.
+ */
+uint32_t net_timeout_remaining(const struct net_timeout *timeout,
+			       uint32_t now);
+
+/** @brief Update state to reflect elapsed time and get new delay.
+ *
+ * This function must be invoked periodically to (1) apply the effect of
+ * elapsed time on what remains of a total delay that exceeded the maximum
+ * representable delay, and (2) determine that either the timeout has
+ * completed or that the infrastructure must wait a certain period before
+ * checking again for completion.
+ *
+ * @param timeout a pointer to the timeout state
+ *
+ * @param now the time relative to which the estimate of remaining time should
+ * be calculated. This should be a recently captured value from
+ * k_uptime_get_32().
+ *
+ * @retval 0 if the timeout has completed
+ * @retval positive the maximum delay until the state of this timeout should
+ * be re-evaluated, in milliseconds.
+ */
+uint32_t net_timeout_evaluate(struct net_timeout *timeout,
+			      uint32_t now);
+
 #ifdef __cplusplus
 }
 #endif
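The comment on `NET_TIMEOUT_MAX_VALUE` in this header describes detecting a passed deadline through the signed interpretation of an unsigned 32-bit difference. A minimal sketch of that technique follows; the `demo_` function names are hypothetical and are not part of the header above. It assumes the usual two's-complement behavior when converting an out-of-range `uint32_t` to `int32_t`, which is implementation-defined in ISO C before C23 but is what mainstream compilers do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Compute a deadline from the current time and a delay.
 * The delay must fit in 31 bits for the signed comparison to work.
 */
uint32_t demo_make_deadline(uint32_t now, uint32_t delay_ms)
{
	assert(delay_ms <= (uint32_t)INT32_MAX);
	return now + delay_ms; /* wraps modulo 2^32 by design */
}

/* Signed time to the deadline: positive before it, zero or negative
 * once it has been reached, even across a rollover of the 32-bit clock.
 */
int32_t demo_time_until(uint32_t deadline, uint32_t now)
{
	/* The unsigned subtraction is well defined; the cast reinterprets
	 * large differences as negative values (two's complement assumed).
	 */
	return (int32_t)(deadline - now);
}

/* True once now has reached or passed the deadline. */
bool demo_deadline_reached(uint32_t deadline, uint32_t now)
{
	return demo_time_until(deadline, now) <= 0;
}
```

A check that runs late still sees a small negative value instead of a huge positive one, so no fixed latency allowance such as the previous 100 ms margin is needed, provided the check happens within `INT32_MAX` milliseconds of the deadline.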

subsys/net/ip/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -9,6 +9,7 @@ zephyr_library_compile_definitions_ifdef(
 zephyr_library_sources(
   net_core.c
   net_if.c
+  net_timeout.c
   utils.c
   )
 

subsys/net/ip/ipv6_nbr.c

Lines changed: 1 addition & 18 deletions
@@ -2122,26 +2122,9 @@ static inline void handle_prefix_onlink(struct net_pkt *pkt,
 
 #define TWO_HOURS (2 * 60 * 60)
 
-static uint32_t time_diff(uint32_t time1, uint32_t time2)
-{
-	return (uint32_t)abs((int32_t)time1 - (int32_t)time2);
-}
-
 static inline uint32_t remaining_lifetime(struct net_if_addr *ifaddr)
 {
-	uint64_t remaining;
-
-	if (ifaddr->lifetime.timer_timeout == 0) {
-		return 0;
-	}
-
-	remaining = (uint64_t)ifaddr->lifetime.timer_timeout +
-		    (uint64_t)ifaddr->lifetime.wrap_counter *
-		    (uint64_t)NET_TIMEOUT_MAX_VALUE -
-		    (uint64_t)time_diff(k_uptime_get_32(),
-					ifaddr->lifetime.timer_start);
-
-	return (uint32_t)(remaining / MSEC_PER_SEC);
+	return net_timeout_remaining(&ifaddr->lifetime, k_uptime_get_32());
 }
 
 static inline void handle_prefix_autonomous(struct net_pkt *pkt,
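The flaw in the removed `remaining_lifetime()` body, as described in the commit message, is the early return when `timer_timeout` is zero even though `wrap_counter` may still be positive. The stand-alone sketch below reproduces that case. It rests on assumptions: `demo_old_set()` guesses how the pre-commit code split a lifetime into wraps and remainder (that code is not part of this diff), `time_diff()` is replaced by a plain unsigned subtraction, and `demo_fixed_remaining()` is an illustrative correction, not the `net_timeout_remaining()` implementation added by the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Pre-commit value of NET_TIMEOUT_MAX_VALUE, taken from the removed line. */
#define DEMO_OLD_MAX ((uint32_t)(INT32_MAX - 100))
#define DEMO_MSEC_PER_SEC 1000U

/* Hypothetical copy of the pre-commit state fields (signed, as removed). */
struct demo_old_timeout {
	uint32_t timer_start;
	int32_t timer_timeout;
	int32_t wrap_counter;
};

/* ASSUMPTION: split the lifetime into whole DEMO_OLD_MAX periods plus a
 * remainder. The code that did this before the commit is not in this diff.
 */
struct demo_old_timeout demo_old_set(uint32_t lifetime_s, uint32_t now)
{
	uint64_t total_ms = (uint64_t)lifetime_s * DEMO_MSEC_PER_SEC;
	struct demo_old_timeout t;

	t.timer_start = now;
	t.wrap_counter = (int32_t)(total_ms / DEMO_OLD_MAX);
	t.timer_timeout = (int32_t)(total_ms % DEMO_OLD_MAX);
	assert(t.wrap_counter >= 0 && t.timer_timeout >= 0);
	return t;
}

/* Follows the removed remaining_lifetime() logic, early return included. */
uint32_t demo_old_remaining(struct demo_old_timeout t, uint32_t now)
{
	uint64_t remaining;

	if (t.timer_timeout == 0) {
		return 0U; /* ignores a positive wrap_counter */
	}
	remaining = (uint64_t)t.timer_timeout +
		    (uint64_t)t.wrap_counter * (uint64_t)DEMO_OLD_MAX -
		    (uint64_t)(now - t.timer_start);
	return (uint32_t)(remaining / DEMO_MSEC_PER_SEC);
}

/* Same arithmetic without the early return, clamped at zero. */
uint32_t demo_fixed_remaining(struct demo_old_timeout t, uint32_t now)
{
	int64_t remaining = (int64_t)t.timer_timeout +
			    (int64_t)t.wrap_counter * (int64_t)DEMO_OLD_MAX -
			    (int64_t)(now - t.timer_start);

	return (remaining <= 0)
	       ? 0U : (uint32_t)(remaining / DEMO_MSEC_PER_SEC);
}
```

With a lifetime of `DEMO_OLD_MAX` seconds the split yields a wrap count of 1000 and a zero remainder, so the old logic reports no time remaining at the very moment the timer starts, while the version without the early return reports the full lifetime.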
