Description
Currently get_timeout(attempt, response) receives the attempt number and the response, but has no way to know how much wall-clock time has elapsed since the retry chain started. This makes it impossible to implement cumulative retry schedules where jitter naturally mean-reverts across attempts.
Motivation: mean-reverting jitter
JitterRetry adds random.uniform(0, random_interval_size) ** factor to each attempt's timeout independently. Over multiple retries these non-negative jitter values accumulate: the total wait time drifts upward from the deterministic exponential schedule, and its variance grows with each attempt.
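The drift can be illustrated with a small standalone simulation. This is only a sketch: the function names and the parameter values (start_timeout=0.1, factor=2.0, random_interval_size=2.0) are assumptions chosen for illustration, not values read from the library.

```python
import random


def deterministic_timeout(attempt, start_timeout=0.1, factor=2.0, max_timeout=30.0):
    # Exponential backoff without jitter (assumed shape of the schedule).
    return min(start_timeout * factor ** attempt, max_timeout)


def jittered_timeout(attempt, random_interval_size=2.0, factor=2.0, rng=random):
    # Independent jitter per attempt, in the form quoted above.
    jitter = rng.uniform(0, random_interval_size) ** factor
    return deterministic_timeout(attempt, factor=factor) + jitter


def total_wait(attempts, timeout_fn):
    # Sum of the sleeps over a chain of `attempts` retries.
    return sum(timeout_fn(a) for a in range(1, attempts + 1))


rng = random.Random(0)
baseline = total_wait(4, deterministic_timeout)
samples = [total_wait(4, lambda a: jittered_timeout(a, rng=rng)) for _ in range(1000)]

# Average amount by which the jittered chain exceeds the deterministic one.
drift = sum(samples) / len(samples) - baseline
```

Because every jitter draw is non-negative, no sampled chain is shorter than the baseline, and the average excess grows with the number of attempts.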
With access to elapsed time, a retry strategy could compute:
    def get_timeout(self, attempt, response=None, start_time=None):
        if start_time is None:
            return super().get_timeout(attempt, response)
        # Cumulative target: what the deterministic schedule says total wait should be by now
        # (A closed-form geometric series exists but isn't worth the complexity for ~3-5 attempts)
        cumulative_target = sum(
            min(self._start_timeout * self._factor**i, self._max_timeout)
            for i in range(attempt)
        )
        elapsed = time.monotonic() - start_time
        remaining = max(0, cumulative_target - elapsed)
        jitter = random.uniform(0, self._random_interval_size) ** self._factor
        return min(remaining + jitter, self._max_timeout)

If a previous retry slept longer due to jitter, remaining is smaller on the next attempt, so the next sleep is shorter: the jitter mean-reverts. The total retry chain duration stays close to the deterministic schedule regardless of jitter variance.
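The mean-reverting behaviour can be checked without a real clock. The sketch below replaces time.monotonic() with a running elapsed counter and assumes requests themselves take no time; simulate_chain and the parameter values are hypothetical and chosen so that the largest possible jitter (0.3 ** 2 = 0.09) is smaller than the first step of the schedule.

```python
import random


def cumulative_target(attempt, start_timeout=0.1, factor=2.0, max_timeout=30.0):
    # Total wait the deterministic schedule prescribes up to this attempt.
    return sum(min(start_timeout * factor ** i, max_timeout) for i in range(attempt))


def simulate_chain(attempts, random_interval_size=0.3, factor=2.0, max_timeout=30.0, seed=0):
    rng = random.Random(seed)
    elapsed = 0.0  # stand-in for time.monotonic() - start_time
    for attempt in range(1, attempts + 1):
        target = cumulative_target(attempt, factor=factor, max_timeout=max_timeout)
        remaining = max(0.0, target - elapsed)
        jitter = rng.uniform(0, random_interval_size) ** factor
        # "Sleep" by advancing the simulated clock.
        elapsed += min(remaining + jitter, max_timeout)
    return elapsed


# Overshoot of the whole chain relative to the deterministic schedule.
overshoots = [simulate_chain(5, seed=s) - cumulative_target(5) for s in range(20)]
```

Under these assumptions the overshoot of the whole chain is bounded by a single jitter draw rather than by the sum of all draws, which is the property the cumulative schedule is meant to provide.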
This isn't possible today because get_timeout can't see elapsed time, and _RequestContext._do_request() doesn't expose its timing to the retry strategy.
Proposed change
Add start_time to get_timeout():
    # RetryOptionsBase
    def get_timeout(
        self,
        attempt: int,
        response: ClientResponse | None = None,
        start_time: float | None = None,
    ) -> float:
        ...

And in _RequestContext._do_request(), record start_time before the loop and pass it on each call:
    async def _do_request(self) -> ClientResponse:
        current_attempt = 0
        start_time = time.monotonic()
        while True:
            ...
            retry_wait = self._retry_options.get_timeout(
                attempt=current_attempt, response=response, start_time=start_time,
            )

The default start_time=None means existing get_timeout implementations that ignore the parameter continue to work without changes. This is the same pattern as the response parameter added in v2.5.6.
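Whether a user-defined override that does not declare start_time keeps working depends on how the caller passes the argument, since an undeclared keyword raises TypeError. A minimal sketch of a caller that passes start_time only when the override accepts it follows; LegacyRetry, ElapsedAwareRetry and call_get_timeout are hypothetical names for illustration, not part of the library.

```python
import inspect
import time


class LegacyRetry:
    # Hypothetical override written before start_time existed.
    def get_timeout(self, attempt, response=None):
        return 0.1 * attempt


class ElapsedAwareRetry:
    # Hypothetical override that declares the proposed parameter.
    def get_timeout(self, attempt, response=None, start_time=None):
        elapsed = 0.0 if start_time is None else time.monotonic() - start_time
        return max(0.0, 0.1 * attempt - elapsed)


def call_get_timeout(retry_options, attempt, response, start_time):
    # Hypothetical helper: inspect the bound method and forward start_time
    # only if it is declared explicitly or absorbed by **kwargs.
    params = inspect.signature(retry_options.get_timeout).parameters
    accepts_start_time = "start_time" in params or any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    if accepts_start_time:
        return retry_options.get_timeout(
            attempt=attempt, response=response, start_time=start_time
        )
    return retry_options.get_timeout(attempt=attempt, response=response)
```

The signature lookup could be done once when the retry options are attached to the client rather than on every attempt; it is repeated here only to keep the sketch short.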