# airflow-provider-smart-retry
An Apache Airflow provider that uses LLMs to make intelligent retry decisions when tasks fail.
```bash
pip install airflow-provider-smart-retry
```

## The Problem
Airflow's built-in retry mechanism is static: it waits the same amount of time between attempts and retries blindly, regardless of the error type. This leads to:
- Rate limit errors being retried too fast
- Auth errors being retried pointlessly
- Network errors not being retried fast enough
## The Solution
`LLMSmartRetryOperator` analyzes the error log using a local LLM (via Ollama) and decides:
- **Should we retry at all?** (auth errors → no)
- **How long should we wait?** (rate limits → 60s, network → 0s)
- **What type of error is this?** (rate_limit / network / auth / data_schema / unknown)
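A hypothetical usage sketch in a DAG file. The import path and parameter names (`model`, `ollama_host`) are illustrative assumptions, not the provider's documented API:

```python
# Illustrative sketch only -- the import path and operator parameters
# below are assumptions, not the provider's actual documented API.
from datetime import datetime

from airflow import DAG
from smart_retry.operators import LLMSmartRetryOperator  # assumed import path


def fetch_api_data():
    ...  # your flaky task logic


with DAG(
    dag_id="etl_with_smart_retry",
    start_date=datetime(2024, 1, 1),
    schedule=None,
) as dag:
    fetch = LLMSmartRetryOperator(
        task_id="fetch_api_data",
        python_callable=fetch_api_data,
        model="llama3.2",                       # any supported Ollama model
        ollama_host="http://localhost:11434",   # default local Ollama endpoint
    )
```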
## How It Works
```text
Task fails
    │
    ▼
Extract full traceback
    │
    ▼
Send to local Ollama LLM
    │
    ▼
┌───────────────────────────────────────┐
│ Error Classification                  │
│                                       │
│ rate_limit  → wait 60s, retry 5x      │
│ network     → wait 15s, retry 4x      │
│ auth        → fail immediately        │
│ data_schema → fail immediately        │
│ unknown     → wait 30s, retry 3x      │
└───────────────────────────────────────┘
    │
    ▼
The classification is pushed to XCom and can be
inspected in the Airflow UI
```
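The classification-to-policy mapping above can be sketched as a small lookup table. The class and field names here are assumptions for illustration, not the provider's internal API:

```python
from dataclasses import dataclass


# Sketch of the retry policy table described above. Names are
# illustrative assumptions, not the provider's internal API.
@dataclass(frozen=True)
class RetryPolicy:
    should_retry: bool
    wait_seconds: int = 0
    max_retries: int = 0


POLICIES = {
    "rate_limit":  RetryPolicy(True, wait_seconds=60, max_retries=5),
    "network":     RetryPolicy(True, wait_seconds=15, max_retries=4),
    "auth":        RetryPolicy(False),
    "data_schema": RetryPolicy(False),
    "unknown":     RetryPolicy(True, wait_seconds=30, max_retries=3),
}


def decide(error_type: str) -> RetryPolicy:
    # Any label the LLM emits outside the known set falls back to "unknown".
    return POLICIES.get(error_type, POLICIES["unknown"])
```

Falling back to the `unknown` policy keeps the operator safe even when the model returns an unexpected label.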
All LLM inference runs locally via Ollama, so your error logs never leave your infrastructure.
Supported models: `llama3.2`, `mistral`, `phi3`, `gemma2`
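A minimal sketch of how a traceback could be sent to a local Ollama server for classification. The HTTP endpoint and payload follow Ollama's public `/api/generate` API; the prompt wording, helper names, and label-parsing logic are assumptions, not the provider's actual implementation:

```python
import json
import urllib.request

# The label set matches the error types described above.
LABELS = ("rate_limit", "network", "auth", "data_schema", "unknown")


def build_prompt(traceback_text: str) -> str:
    # Hypothetical prompt: ask the model to reply with exactly one label.
    return (
        "Classify this Airflow task failure as one of: "
        + ", ".join(LABELS)
        + ". Reply with the label only.\n\n"
        + traceback_text
    )


def parse_label(model_reply: str) -> str:
    # Anything outside the known label set falls back to "unknown".
    label = model_reply.strip().lower()
    return label if label in LABELS else "unknown"


def classify(traceback_text: str, model: str = "llama3.2") -> str:
    # POST to Ollama's /api/generate endpoint with streaming disabled,
    # so the full reply arrives as a single JSON object.
    payload = json.dumps({
        "model": model,
        "prompt": build_prompt(traceback_text),
        "stream": False,
    }).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return parse_label(json.load(resp)["response"])
```

`build_prompt` and `parse_label` can be unit-tested without a running Ollama server; only `classify` performs a network call.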