Commit 88daf9c

retry
1 parent a157121 commit 88daf9c

1 file changed: 90 additions & 0 deletions

# Retry Loop Retry

Some time ago I lamented that I didn't know how to write a retry loop such that:

- it is syntactically obvious that the number of retries is bounded,
- there's no spurious extra sleep after the last attempt,
- the original error is reported if retrying fails,
- there's no code duplication in the loop.

<https://matklad.github.io/2023/12/21/retry-loop.html>

To recap, we have

```zig
fn action() E!T { ... }
fn is_transient_error(err: E) bool { ... }
```

and we need to write

```zig
fn action_with_retries(retry_count: u32) E!T { ... }
```

I've received many suggestions, and the best one was from
[<https://www.joachimschipper.nl>,]{.display}
though it was somewhat specific to Python:

```python
def action_with_retries(retry_count):
    for tries_left in reversed(range(retry_count)):
        try:
            return action()
        except Exception as e:
            if tries_left == 0 or not is_transient_error(e):
                raise
            sleep()
    else:
        # Only reachable if the loop body never ran.
        assert False
```

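To check that this shape meets the requirements, here is a small self-contained sketch. It passes `action`, `is_transient_error` and `sleep` in as parameters, and the `Flaky` class and `run_demo` helper are hypothetical names made up for this illustration. Note that in this shape `retry_count` is the total number of attempts.

```python
def action_with_retries(action, is_transient_error, sleep, retry_count):
    # In this shape, `retry_count` is the total number of attempts.
    for tries_left in reversed(range(retry_count)):
        try:
            return action()
        except Exception as e:
            if tries_left == 0 or not is_transient_error(e):
                raise  # re-raises the original exception unchanged
            sleep()
    else:
        # Reached only when retry_count == 0 and the body never ran.
        assert False, "retry_count must be at least 1"


class Flaky:
    """Hypothetical action: fails `failures` times, then succeeds."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise TimeoutError(f"attempt {self.calls} timed out")
        return "ok"


def run_demo(failures, retry_count):
    """Returns (outcome, number of calls, number of sleeps)."""
    action = Flaky(failures)
    sleeps = []
    try:
        outcome = action_with_retries(
            action,
            lambda e: isinstance(e, TimeoutError),
            lambda: sleeps.append(1),  # record the sleep instead of waiting
            retry_count,
        )
    except TimeoutError as e:
        outcome = str(e)
    return outcome, action.calls, len(sleeps)
```

With two failures and three attempts, the action is called three times with two sleeps in between. With five failures and three attempts, the last error is re-raised right after the third call, with no trailing sleep.
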
A couple of days ago I learned to think better about the problem. You see, the first requirement,
that the number of retries is bounded syntactically, was leading me down the wrong path. If we
_start_ with that requirement, we get code shape like:

```zig
const result: E!T = for (0..retry_count) |_| {
    // ???
    action()
    // ???
}
```

The salient point here is that, no matter what we do, we need to get `E` or `T` out as a result, so
we'll have to call `action()` at least once. But `retry_count` _could_ be zero. Looking at the
static semantics, the body of any loop other than `do while` can be skipped completely, so we'll
need some runtime asserts explaining to the compiler that we really did run `action` at least once.
The part of a loop which is guaranteed to be executed at least once is its condition. So it's more
fruitful to flip this around: it's not that we are looping until we are out of attempts, but,
rather, we are looping while the underlying action returns an error, and then retries are an extra
condition to exit the loop early:

```zig
var retries_left = retry_count;
const result = try while (true) {
    const err = if (action()) |ok| break ok else |err| err;
    if (!is_transient_error(err)) break err;

    if (retries_left == 0) break err;
    retries_left -= 1;
    sleep();
};
```

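For comparison, here is a rough Python rendering of the same flipped shape. It is a sketch, not a translation of the Zig semantics: Python has no loop expressions, so `return` and `raise` stand in for `break ok` and `break err`. The parameters and the `count_calls` helper are hypothetical names introduced for the illustration. Here `retry_count` counts retries, so there are `retry_count + 1` attempts in total.

```python
def action_with_retries(action, is_transient_error, sleep, retry_count):
    # `retry_count` counts retries: up to retry_count + 1 attempts overall.
    retries_left = retry_count
    while True:
        try:
            return action()  # the action always runs at least once
        except Exception as e:
            if not is_transient_error(e):
                raise
            if retries_left == 0:
                raise  # out of retries: report the original error
            retries_left -= 1
            sleep()


def count_calls(retry_count):
    """Runs an always-failing action; returns (calls, sleeps)."""
    calls, sleeps = [], []

    def always_fails():
        calls.append(1)
        raise ConnectionError("still down")

    try:
        action_with_retries(
            always_fails,
            lambda e: isinstance(e, ConnectionError),
            lambda: sleeps.append(1),  # record the sleep instead of waiting
            retry_count,
        )
    except ConnectionError:
        pass
    return len(calls), len(sleeps)
```

Even with `retry_count` set to zero, the action is attempted exactly once and nothing sleeps, so no extra assert is needed to cover the "loop body never ran" case.
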
This shape of the loop also works if the condition for retries is not attempts based, but, say, time
based. Sadly, this throws the "loop is obviously bounded" requirement out of the window. But it can
be restored by adding an _upper bound_ to the infinite loop:

```zig
var retries_left = retry_count;
const result = try for (0..retry_count + 1) |_| {
    const err = if (action()) |ok| break ok else |err| err;
    if (!is_transient_error(err)) break err;

    if (retries_left == 0) break err;
    retries_left -= 1;
    sleep();
} else @panic("runaway loop");
```

I still don't like it (if you forget that `+1`, you'll get a panic!), but that's where I am at!
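
On the earlier remark that the same loop shape works when the retry condition is time based rather than attempts based, here is a minimal Python sketch of what that might look like. The function name `action_with_deadline`, its parameters, and the choice to give up when the next sleep would overshoot the deadline are all assumptions made for the illustration. The clock and sleep functions are injectable so the loop can be exercised without real waiting.

```python
import time


def action_with_deadline(
    action,
    is_transient_error,
    timeout_s,
    delay_s=0.1,
    clock=time.monotonic,
    sleep=time.sleep,
):
    # Keep looping while the action fails; the deadline is just an extra
    # exit condition, in the same place the retry counter used to be.
    deadline = clock() + timeout_s
    while True:
        try:
            return action()
        except Exception as e:
            if not is_transient_error(e):
                raise
            # Give up if another sleep would take us past the deadline.
            if clock() + delay_s > deadline:
                raise
            sleep(delay_s)
```

Only the exit condition changed: the action still runs at least once, the original error is re-raised, and there is no sleep after the final attempt.
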
