
@AnatoliB AnatoliB commented Sep 11, 2025

Allow specifying RetryOptions for the model invocation activity and tool activities. Usage samples:

Model invocation activity

@app.durable_openai_agent_orchestrator(model_retry_options=df.RetryOptions(2000, 5))

Tool activity

tools=[
    context.activity_as_tool(get_weather, retry_options=df.RetryOptions(1000, 3))
]

The default retry options in both cases are:

RetryOptions(
    first_retry_interval_in_milliseconds=2000,
    max_number_of_attempts=5
)
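To make the defaults concrete, here is a minimal sketch of the attempt schedule they imply. It assumes that max_number_of_attempts counts the initial call and that the interval between attempts stays fixed, since no backoff coefficient is set here. RetrySchedule and attempt_delays_ms are hypothetical names used only for illustration; they are not part of the durable functions library or of this PR.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RetrySchedule:
    """Hypothetical stand-in mirroring the two RetryOptions fields shown above."""
    first_retry_interval_in_milliseconds: int
    max_number_of_attempts: int


def attempt_delays_ms(schedule: RetrySchedule) -> List[int]:
    """Return the wait before each retry, assuming a constant interval.

    Assumption: max_number_of_attempts includes the first call, so
    N attempts means at most N - 1 retries.
    """
    retries = max(schedule.max_number_of_attempts - 1, 0)
    return [schedule.first_retry_interval_in_milliseconds] * retries


default_schedule = RetrySchedule(2000, 5)
delays = attempt_delays_ms(default_schedule)
print(delays)       # [2000, 2000, 2000, 2000]
print(sum(delays))  # 8000
```

Under these assumptions, a call that keeps failing is abandoned after roughly 8 seconds of accumulated waiting, not counting the time spent in the attempts themselves.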

This PR also refactors the task tracking and yielding trickery, centralizing it in a unit-testable TaskTracker class.

Limitations and concerns

  1. If a model activity invocation still fails after all retries, the entire orchestration fails: the orchestrator code has no way to catch this as an exception. This is related to the special yielding implementation that we had to use to interrupt agent code. I know exactly why this happens, and I have some ideas on how to fix it, but none of them is trivial.
  2. On errors, the response we receive from the model may contain retry hints indicating whether the condition is transient, when to retry, etc. The regular durable RetryOptions-based retries cannot take advantage of these hints. Later, we may want to make these retries more intelligent, which may also be the best way to resolve the previous issue.
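One possible shape for the "more intelligent" retries mentioned in item 2, sketched in plain Python. This is not what the PR implements. The ModelError type and its fields (is_transient, retry_after_seconds) are hypothetical, since the description does not say how retry hints are surfaced, and next_delay_seconds is an illustrative helper name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelError:
    """Hypothetical error shape; real model responses may expose hints differently."""
    is_transient: bool
    retry_after_seconds: Optional[float] = None


def next_delay_seconds(
    error: ModelError,
    attempt: int,
    max_attempts: int = 5,
    default_delay_seconds: float = 2.0,
) -> Optional[float]:
    """Return how long to wait before the next attempt, or None to give up.

    `attempt` is the 1-based number of the attempt that just failed.
    """
    if not error.is_transient:
        return None  # permanent failure: retrying would not help
    if attempt >= max_attempts:
        return None  # retry budget exhausted
    if error.retry_after_seconds is not None:
        return error.retry_after_seconds  # honor the hint from the model
    return default_delay_seconds  # fall back to the fixed interval
```

A decision function like this only chooses the delay. Inside an orchestrator, the wait itself would have to be a durable timer rather than a blocking sleep, so that replay stays deterministic.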

@AnatoliB AnatoliB marked this pull request as ready for review September 11, 2025 16:40
@AnatoliB AnatoliB merged commit 1b3ac4c into durable-openai-agent Sep 12, 2025
4 checks passed
@AnatoliB AnatoliB deleted the anatolib/retry-pr branch September 16, 2025 18:10