
# airflow-provider-smart-retry

An Apache Airflow provider for Airflow 3.x that uses a local LLM (via Ollama) to make intelligent retry decisions when tasks fail.

## Installation

```sh
pip install airflow-provider-smart-retry
```

## The Problem

Airflow's built-in retry mechanism is static: it waits the same amount of time and retries blindly, regardless of the error type. This leads to:

- Rate limit errors being retried too fast

- Auth errors being retried pointlessly

- Network errors not being retried fast enough

## The Solution

`LLMSmartRetryOperator` analyzes the error log using a local LLM (via Ollama) and decides:

- **Should we retry at all?** (auth errors β†’ no)

- **How long should we wait?** (rate limits β†’ 60s, network β†’ 0s)

- **What type of error is this?** (rate_limit / network / auth / data_schema / unknown)
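As a rough illustration of these three decisions, here is a minimal sketch of what the policy might look like. The `RetryDecision` dataclass, the `POLICY` table, and the `decide` function are hypothetical names invented for this example (not the provider's actual API); the wait times follow the values quoted above.

```python
from dataclasses import dataclass


@dataclass
class RetryDecision:
    error_type: str      # rate_limit / network / auth / data_schema / unknown
    should_retry: bool
    wait_seconds: int


# Hypothetical policy table matching the examples above:
# rate limits back off 60s, network retries immediately, auth never retries.
POLICY = {
    "rate_limit": RetryDecision("rate_limit", True, 60),
    "network": RetryDecision("network", True, 0),
    "auth": RetryDecision("auth", False, 0),
    "data_schema": RetryDecision("data_schema", False, 0),
}


def decide(error_type: str) -> RetryDecision:
    """Fall back to a cautious default for errors the LLM cannot classify."""
    return POLICY.get(error_type, RetryDecision("unknown", True, 30))
```

The key design point is the fallback: an unclassifiable error still gets retried, just with a moderate delay, so a misbehaving classifier degrades to ordinary retry behavior rather than dropping tasks.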

## How It Works

```
Task fails
    ↓
Extract full traceback
    ↓
Send to local Ollama LLM
    ↓
┌──────────────────────────────────────┐
│ Error Classification                 │
│                                      │
│ rate_limit  → wait 60s, retry 5x     │
│ network     → wait 15s, retry 4x     │
│ auth        → fail immediately ✗     │
│ data_schema → fail immediately ✗     │
│ unknown     → wait 30s, retry 3x     │
└──────────────────────────────────────┘
    ↓
Classification is pushed to XCom
    ↓
Visible in the Airflow UI
```
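The classification step above can be sketched against Ollama's local HTTP API (`POST /api/generate` on port 11434, which is Ollama's default). The prompt wording, function names, and response parsing below are illustrative assumptions, not the provider's actual implementation:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local Ollama endpoint
VALID_TYPES = {"rate_limit", "network", "auth", "data_schema", "unknown"}


def build_prompt(traceback_text: str) -> str:
    """Hypothetical prompt asking the model for a single error category."""
    return (
        "Classify this Airflow task error as one of: "
        "rate_limit, network, auth, data_schema, unknown.\n"
        "Reply with only the category name.\n\n" + traceback_text
    )


def parse_classification(raw: str) -> str:
    """Normalize the model's reply; anything unexpected becomes 'unknown'."""
    label = raw.strip().lower()
    return label if label in VALID_TYPES else "unknown"


def classify(traceback_text: str, model: str = "llama3.2") -> str:
    """Send the traceback to the local Ollama server and return a category."""
    payload = json.dumps({
        "model": model,
        "prompt": build_prompt(traceback_text),
        "stream": False,
    }).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return parse_classification(body["response"])


if __name__ == "__main__":
    # Requires a running local Ollama; the traceback never leaves your machine.
    print(classify("requests.exceptions.HTTPError: 429 Too Many Requests"))
```

Constraining the model to a fixed label set and coercing anything else to `unknown` keeps the downstream retry policy deterministic even when the LLM replies with extra text.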

## Privacy & Security

All LLM inference runs locally via Ollama.
Your error logs never leave your infrastructure. πŸ”’

Supported models: `llama3.2`, `mistral`, `phi3`, `gemma2`
