Proposal: Implement a Retry Mechanism in Aggkit-Prover for errNoProofBuiltYet #251

@vcastellm

Currently, when the aggkit-prover returns an errNoProofBuiltYet error (e.g., "no consecutive span proofs found"), the system waits until the next epoch to request a new proof. This can delay the next attempt by an hour or more, which is inefficient and undesirable.

We propose implementing a prompt retry mechanism within the aggkit-prover or AggSender specifically for the errNoProofBuiltYet error. This mechanism should re-request the aggregation proof shortly after the failure instead of waiting for the next epoch event.

Justification

Waiting for the next epoch after an errNoProofBuiltYet error introduces unnecessary latency. If a proof fails to build because of a transient issue or because the required data is not yet available, a long wait before the next attempt hurts the performance and responsiveness of the aggregation layer and the overall user experience. A dedicated retry mechanism for this specific error would let the system quickly attempt to generate the proof again, improving efficiency and reducing delays.

Proposed Changes

  1. Introduce a retry mechanism: Implement retry logic in the AggSender or aggkit-prover that specifically handles the errNoProofBuiltYet error.
  2. Configurable retry parameters: Define configurable parameters for this retry mechanism, such as:
    • Retry attempts: The maximum number of times to retry requesting the proof.
    • Retry interval: The delay between retry attempts, either fixed or growing with each attempt (e.g., exponential backoff).
  3. Error differentiation: Ensure that this retry mechanism is distinct from existing gRPC-level retries for connection issues (e.g., Unavailable, Aborted, ResourceExhausted). This new retry would specifically target the business logic error of not finding consecutive span proofs.

Expected Benefits

  • Reduced latency: Faster recovery from instances where a proof fails to build.
  • Improved efficiency: Minimize the time spent waiting for the next epoch for re-attempts.
  • Enhanced system responsiveness: Ensure that the aggregation layer can more quickly produce necessary proofs.
