feat(fortuna): better retry mechanism #2780
Conversation
    .get_request(event.provider_address, event.sequence_number)
    .await;

tracing::error!("Failed to process event: {:?}. Request: {:?}", e, req);
I moved this inside because it was creating a bunch of false alarms.
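For context, a minimal sketch of the shape of that change; the names `process_event` and `chain_state` are placeholders for the surrounding keeper code (which isn't shown in this hunk), not the actual identifiers in the PR. The point is that the request lookup and the error log only run when processing actually failed:

```rust
// Sketch only; `process_event` and `chain_state` are placeholder names.
if let Err(e) = process_event(&event).await {
    // Fetch the request only on failure, so the lookup and the error log
    // no longer fire (and raise alarms) on the happy path.
    let req = chain_state
        .get_request(event.provider_address, event.sequence_number)
        .await;
    tracing::error!("Failed to process event: {:?}. Request: {:?}", e, req);
}
```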
please do fix the spacing on the retries but lgtm aside from that
apps/fortuna/src/eth_utils/utils.rs
Outdated
    )
    .await;
result.map_err(|e| error_mapper(num_retries, e))
(and then you don't need to pass error_mapper)
I didn't understand this comment
apps/fortuna/src/eth_utils/utils.rs
Outdated
    gas_limit: U256,
    escalation_policy: EscalationPolicy,
    error_mapper: impl Fn(u64, backoff::Error<SubmitTxError<T>>) -> backoff::Error<SubmitTxError<T>>,
) -> Result<SubmitTxResult> {
can you leave a comment that this lets you customize the backoff behavior based on the error type? It's not obvious what you get from this at the moment
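Something along those lines could read roughly like this; a hedged sketch of a possible doc comment, not the one that actually landed in the PR:

```rust
/// `error_mapper` lets the caller customize the backoff behavior based on the
/// error type and the retry count: it can turn an error into a transient one
/// with a custom `retry_after`, or mark it permanent to stop retrying.
```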
if 1 < num_retries && num_retries < 5 {
    return backoff::Error::Transient {
        err,
        retry_after: Some(Duration::from_secs(60)),
I think the spacing here needs to be a bit more granular: retry the first time after 5 seconds, then 10 seconds, then 60 seconds.
These errors happen pretty frequently in the first 1-2 seconds because of RPC async issues. The current logic will significantly degrade the UX whenever this happens, because the callback will take 60 seconds now.
This already kicks in on the 3rd attempt, but I will also increase the delay on the first 2 attempts.
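For illustration, a graduated error_mapper along the lines discussed here could look roughly like this. The thresholds (5s, 10s, then 60s) and the helper name come from the suggestion above, not from the merged code, and `E` stands in for `SubmitTxError<T>`:

```rust
use std::time::Duration;

// Illustrative sketch: map the retry count to an increasing delay, leaving
// permanent errors untouched so they still abort the backoff loop.
fn graduated_delay<E>(num_retries: u64, err: backoff::Error<E>) -> backoff::Error<E> {
    let retry_after = match num_retries {
        0 => Duration::from_secs(5),
        1 => Duration::from_secs(10),
        _ => Duration::from_secs(60),
    };
    match err {
        backoff::Error::Transient { err, .. } => backoff::Error::Transient {
            err,
            retry_after: Some(retry_after),
        },
        permanent => permanent,
    }
}
```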
Summary
If we deploy multiple instances of Fortuna, keepers will compete to fulfill requests and only one of them will succeed in making the callback. The other instances will keep retrying for 5 minutes, which by default can take up to 13 retries. Since this will happen for every request, RPC usage will increase substantially, which is not acceptable.
This PR fixes this by adjusting the retry_interval and the number of retries. Another nice side-effect here is that we get better error messages for the explorer.
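As a rough sketch of that tuning (the values here are illustrative, not the ones merged in this PR), the backoff policy can be tightened so each request is retried only a handful of times instead of ~13 attempts over 5 minutes:

```rust
use std::time::Duration;
use backoff::ExponentialBackoff;

// Illustrative values only: a larger initial interval and a capped total
// elapsed time mean far fewer attempts per request than the crate defaults,
// which is what keeps competing keeper instances from hammering the RPC.
fn keeper_backoff() -> ExponentialBackoff {
    ExponentialBackoff {
        initial_interval: Duration::from_secs(5),
        max_interval: Duration::from_secs(60),
        max_elapsed_time: Some(Duration::from_secs(120)),
        ..Default::default()
    }
}
```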
Rationale
To avoid excessive RPC usage
How has this been tested?
Ran two instances of Fortuna locally, created 10 requests on monad-testnet and verified the behavior.