Open
Description
Implement retry logic in Ollama tool generation to handle transient failures and improve reliability.
Problem
Ollama tool generation can fail due to:
- Network timeouts
- Model loading delays
- Temporary service unavailability
- Rate limiting
- Resource constraints
Any of these failures can currently abort a generation task outright, because failed calls are not retried.
Implementation
Retry Strategy
- Exponential backoff - Wait longer after each failed attempt (delay multiplied by a fixed base, capped at a maximum)
- Max retries - Configurable limit (e.g., 3-5 attempts)
- Timeout handling - Increase the request timeout on each retry
- Error classification - Retry only on transient errors (not validation errors)
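The backoff schedule described above can be sketched as follows. This is a minimal illustration, not the project's implementation: the function name `backoff_delays` is hypothetical, its defaults mirror the values proposed under Configuration, and the optional `jitter` flag is an assumption that does not appear in this issue.

```python
import random


def backoff_delays(max_retries=3, initial_delay=1.0, max_delay=30.0,
                   exponential_base=2.0, jitter=False):
    """Return the wait (in seconds) before each retry attempt.

    Hypothetical helper: the name and the `jitter` option are assumptions.
    """
    delays = []
    for attempt in range(max_retries):
        # Grow the delay geometrically, but never exceed max_delay.
        delay = min(initial_delay * exponential_base ** attempt, max_delay)
        if jitter:
            # Optional "full jitter": pick a random wait in [0, delay].
            delay = random.uniform(0, delay)
        delays.append(delay)
    return delays


print(backoff_delays())               # [1.0, 2.0, 4.0]
print(backoff_delays(max_retries=7))  # [1.0, 2.0, 4.0, 8.0, 16.0, 30.0, 30.0]
```

With the proposed defaults, three retries add at most 7 seconds of waiting, and the cap only takes effect from the sixth retry onward.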
Configuration
OLLAMA_RETRY_CONFIG = {
    "max_retries": 3,
    "initial_delay": 1,  # seconds
    "max_delay": 30,
    "exponential_base": 2,
    "retry_on": ["timeout", "connection_error", "503", "502"]
}
Tasks
- Identify all Ollama API calls in tool generation code
- Implement retry decorator/wrapper with exponential backoff
- Add configurable retry parameters (max attempts, delays)
- Classify which errors should trigger retry vs. fail fast
- Add logging for retry attempts
- Add metrics/monitoring for retry frequency
- Test with simulated failures
- Document retry behavior
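One possible shape for the retry decorator and its logging is sketched below. Everything named here is an assumption for illustration: `with_retry`, the logger name `ollama.retry`, and the demo function `flaky_generate` are hypothetical, and the sketch classifies errors by Python exception type (`TimeoutError`, `ConnectionError`) rather than by the `retry_on` strings from the proposed configuration. The `sleep` parameter is injectable so that tests with simulated failures do not actually wait.

```python
import functools
import logging
import time

logger = logging.getLogger("ollama.retry")  # hypothetical logger name

# Subset of the configuration proposed in this issue.
RETRY_CONFIG = {
    "max_retries": 3,
    "initial_delay": 1,  # seconds
    "max_delay": 30,
    "exponential_base": 2,
}


def with_retry(config=RETRY_CONFIG,
               retry_exceptions=(TimeoutError, ConnectionError),
               sleep=time.sleep):
    """Retry the wrapped call on transient errors with exponential backoff."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            attempts = config["max_retries"] + 1  # first try plus retries
            for attempt in range(attempts):
                try:
                    return func(*args, **kwargs)
                except retry_exceptions as exc:
                    if attempt == attempts - 1:
                        # Retries exhausted: fail with a clear message.
                        raise RuntimeError(
                            f"{func.__name__} failed after {attempts} "
                            f"attempts: {exc}"
                        ) from exc
                    delay = min(
                        config["initial_delay"]
                        * config["exponential_base"] ** attempt,
                        config["max_delay"],
                    )
                    logger.warning("%s failed (%s); retry %d/%d in %ss",
                                   func.__name__, exc, attempt + 1,
                                   config["max_retries"], delay)
                    sleep(delay)
        return wrapper
    return decorator


# Simulated failure: times out twice, then succeeds on the third call.
calls = {"n": 0}


@with_retry(sleep=lambda seconds: None)  # skip real waiting in the demo
def flaky_generate():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


result = flaky_generate()
```

Exceptions outside `retry_exceptions` (for example a `ValueError` from input validation) propagate on the first attempt, which gives the fail-fast behavior for non-retriable errors.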
Retry Logic Considerations
- Don't retry on - Validation errors, auth failures, malformed requests
- Do retry on - Timeouts, 5xx errors, connection failures, rate limits
- Fallback - Consider fallback to different model if all retries fail
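The retry/fail-fast split above could be expressed as a small predicate. This is a sketch under stated assumptions: `should_retry` is a hypothetical name, and mapping "rate limits" to HTTP 429 and "validation errors, auth failures, malformed requests" to 4xx codes follows general HTTP conventions rather than anything confirmed about Ollama's responses.

```python
def should_retry(error=None, status_code=None):
    """Return True if a failure looks transient and is worth retrying.

    Hypothetical helper; the status-code mapping is an assumption based on
    standard HTTP semantics, not on documented Ollama behavior.
    """
    if status_code is not None:
        # 429 (rate limited) and any 5xx are treated as transient.
        # Other 4xx codes (400, 401, 403, 422, ...) fail fast.
        return status_code == 429 or 500 <= status_code <= 599
    # Without a status code, fall back to the exception type:
    # timeouts and connection failures are transient, the rest is not.
    return isinstance(error, (TimeoutError, ConnectionError))


print(should_retry(status_code=503))                  # True
print(should_retry(status_code=401))                  # False
print(should_retry(error=TimeoutError("timed out")))  # True
print(should_retry(error=ValueError("bad schema")))   # False
```

Keeping this decision in one function makes it straightforward to unit test and to adjust once the errors actually observed from Ollama are known.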
Acceptance Criteria
- Transient Ollama failures automatically retry with backoff
- Non-retriable errors fail immediately
- Retry attempts are logged
- Retry parameters are configurable
- Tool generation success rate improves
- A clear error message is reported when all retries are exhausted